O'Reilly Data Show Podcast

Technology Podcasts

The O'Reilly Data Show Podcast explores the opportunities and techniques driving big data, data science, and AI.

Location:

Sebastopol, CA

Twitter:

@strataconf

Language:

English


Episodes

Machine learning for operational analytics and business intelligence

10/10/2019
In this episode of the Data Show, I speak with Peter Bailis, founder and CEO of Sisu, a startup that is using machine learning to improve operational analytics. Bailis is also an assistant professor of computer science at Stanford University, where he conducts research into data-intensive systems and where he is co-founder of the DAWN […]

Duration:00:51:38

Machine learning and analytics for time series data

9/26/2019
In this episode of the Data Show, I speak with Arun Kejariwal of Facebook and Ira Cohen of Anodot (full disclosure: I’m an advisor to Anodot). This conversation stemmed from a recent online panel discussion we did, where we discussed time series data, and, specifically, anomaly detection and forecasting. Both Kejariwal (at Machine Zone, Twitter, […]

Duration:00:40:30

Understanding deep neural networks

9/12/2019
In this episode of the Data Show, I speak with Michael Mahoney, a member of RISELab, the International Computer Science Institute, and the Department of Statistics at UC Berkeley. A physicist by training, Mahoney has been at the forefront of many important problems in large-scale data analysis. On the theoretical side, his work spans algorithmic […]

Duration:00:39:31

Becoming a machine learning practitioner

8/29/2019
In this episode of the Data Show, I speak with Kesha Williams, technical instructor at A Cloud Guru, a training company focused on cloud computing. As a full stack web developer, Williams became intrigued by machine learning and started teaching herself the ML tools on Amazon Web Services. Fast forward to today, Williams has built […]

Duration:00:33:21

Labeling, transforming, and structuring training data sets for machine learning

8/15/2019
In this episode of the Data Show, I speak with Alex Ratner, project lead for Stanford’s Snorkel open source project; Ratner also recently garnered a faculty position at the University of Washington and is currently working on a company supporting and extending the Snorkel project. Snorkel is a framework for building and managing training data. […]

Duration:00:40:51

Make data science more useful

8/1/2019
In this episode of the Data Show, I speak with Cassie Kozyrkov, technical director and chief decision scientist at Google Cloud. She describes “decision intelligence” as an interdisciplinary field concerned with all aspects of decision-making, and which combines data science with the behavioral sciences. Most recently she has been focused on developing best practices that […]

Duration:00:35:03

Acquiring and sharing high-quality data

7/18/2019
In this episode of the Data Show, I spoke with Roger Chen, co-founder and CEO of Computable Labs, a startup focused on building tools for the creation of data networks and data exchanges. Chen has also served as co-chair of O’Reilly’s Artificial Intelligence Conference since its inception in 2016. This conversation took place the day […]

Duration:00:39:20

Tools for machine learning development

7/3/2019
In this week’s episode of the Data Show, we’re featuring an interview Data Show host Ben Lorica participated in for the Software Engineering Daily Podcast, where he was interviewed by Jeff Meyerson. Their conversation mainly centered around data engineering, data architecture and infrastructure, and machine learning (ML). Here are a few highlights: Tools for productive […]

Duration:00:39:24

Enabling end-to-end machine learning pipelines in real-world applications

6/20/2019
In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group […]

Duration:00:42:53

Bringing scalable real-time analytics to the enterprise

6/9/2019
In this episode of the Data Show, I spoke with Dhruba Borthakur (co-founder and CTO) and Shruti Bhat (SVP of Product) of Rockset, a startup focused on building solutions for interactive data science and live applications. Borthakur was the founding engineer of HDFS and creator of RocksDB, while Bhat is an experienced product and marketing […]

Duration:00:37:11

Applications of data science and machine learning in financial services

5/23/2019
In this episode of the Data Show, I spoke with Jike Chong, chief data scientist at Acorns, a startup focused on building tools for micro-investing. Chong has extensive experience using analytics and machine learning in financial services, and he has experience building data science teams in the U.S. and in China. We had a great […]

Duration:00:42:32

Real-time entity resolution made accessible

5/9/2019
In this episode of the Data Show, I spoke with Jeff Jonas, CEO, founder and chief scientist of Senzing, a startup focused on making real-time entity resolution technologies broadly accessible. He was previously a fellow and chief scientist of context computing at IBM. Entity resolution (ER) refers to techniques and tools for identifying and linking […]

Duration:00:27:09

Why companies are in need of data lineage solutions

4/25/2019
In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a shock, as […]

Duration:00:34:28

What data scientists and data engineers can do with current generation serverless technologies

4/11/2019
In this episode of the Data Show, I spoke with Avner Braverman, co-founder and CEO of Binaris, a startup that aims to bring serverless to web-scale and enterprise applications. This conversation took place shortly after the release of a seminal paper from UC Berkeley (“Cloud Programming Simplified: A Berkeley View on Serverless Computing”), and this […]

Duration:00:36:32

It’s time for data scientists to collaborate with researchers in other disciplines

3/28/2019
In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas: fairness and bias, safety and reliability, security […]

Duration:00:36:08

Algorithms are shaping our lives—here’s how we wrest back control

3/14/2019
In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, A Human’s Guide to Machine Intelligence, an interesting tour through the recent evolution of […]

Duration:00:44:15

Why your attention is like a piece of contested territory

2/28/2019
In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book […]

Duration:00:43:05

The technical, societal, and cultural challenges that come with the rise of fake media

2/14/2019
In this episode of the Data Show, I spoke with Siwei Lyu, associate professor of computer science at the University at Albany, State University of New York. Lyu is a leading expert in digital media forensics, a field of research into tools and techniques for analyzing the authenticity of media files. Over the past year, […]

Duration:00:30:53

Using machine learning and analytics to attract and retain employees

1/31/2019
In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of […]

Duration:00:46:54

How machine learning impacts information security

1/17/2019
In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer and legal engineer at Immuta, a company building data management tools tuned for data science. Burt and cybersecurity pioneer Daniel Geer recently released a must-read white paper (“Flat Light”) that provides a great framework for how to think about information […]

Duration:00:39:48