O'Reilly Data Show

The O'Reilly Data Show explores the opportunities and techniques driving big data and data science. Through interviews and analysis, we highlight the people putting data to work.

More Information

Location:

Sebastopol, CA

Twitter:

@strataconf

Language:

English


Episodes

Machine learning for operational analytics and business intelligence

10/10/2019
The O’Reilly Data Show Podcast: Peter Bailis on data management, ML benchmarks, and building next-gen tools for analysts. In this episode of the Data Show, I speak with Peter Bailis, founder and CEO of Sisu, a startup that is using machine learning to improve operational analytics. Bailis is also an assistant professor of computer science at Stanford University, where he conducts research into data-intensive systems and where he is co-founder of the DAWN Lab. We had a great conversation...

Duration:00:51:38

Machine learning and analytics for time series data

9/26/2019
The O’Reilly Data Show Podcast: Arun Kejariwal and Ira Cohen on building large-scale, real-time solutions for anomaly detection and forecasting. In this episode of the Data Show, I speak with Arun Kejariwal of Facebook and Ira Cohen of Anodot (full disclosure: I’m an advisor to Anodot). This conversation stemmed from a recent online panel discussion we did, where we discussed time series data, and, specifically, anomaly detection and forecasting. Both Kejariwal (at Machine Zone, Twitter,...

Duration:00:40:30

Understanding deep neural networks

9/12/2019
The O’Reilly Data Show Podcast: Michael Mahoney on developing a practical theory for deep learning. In this episode of the Data Show, I speak with Michael Mahoney, a member of RISELab, the International Computer Science Institute, and the Department of Statistics at UC Berkeley. A physicist by training, Mahoney has been at the forefront of many important problems in large-scale data analysis. On the theoretical side, his work spans algorithmic and statistical methods for matrices, graphs,...

Duration:00:39:31

Becoming a machine learning practitioner

8/29/2019
The O’Reilly Data Show Podcast: Kesha Williams on how she added machine learning to her software developer toolkit. In this episode of the Data Show, I speak with Kesha Williams, technical instructor at A Cloud Guru, a training company focused on cloud computing. As a full stack web developer, Williams became intrigued by machine learning and started teaching herself the ML tools on Amazon Web Services. Fast forward to today, Williams has built some well-regarded Alexa skills, mastered ML...

Duration:00:33:21

Labeling, transforming, and structuring training data sets for machine learning

8/15/2019
The O’Reilly Data Show Podcast: Alex Ratner on how to build and manage training data with Snorkel. In this episode of the Data Show, I speak with Alex Ratner, project lead for Stanford’s Snorkel open source project; Ratner also recently garnered a faculty position at the University of Washington and is currently working on a company supporting and extending the Snorkel project. Snorkel is a framework for building and managing training data. Based on our survey from earlier this year,...

Duration:00:40:51

Make data science more useful

8/1/2019
The O’Reilly Data Show Podcast: Cassie Kozyrkov on connecting data and AI to business. In this episode of the Data Show, I speak with Cassie Kozyrkov, technical director and chief decision scientist at Google Cloud. She describes "decision intelligence" as an interdisciplinary field concerned with all aspects of decision-making, and which combines data science with the behavioral sciences. Most recently she has been focused on developing best practices that can help practitioners make...

Duration:00:35:03

Acquiring and sharing high-quality data

7/18/2019
The O’Reilly Data Show Podcast: Roger Chen on the fair value and decentralized governance of data. In this episode of the Data Show, I spoke with Roger Chen, co-founder and CEO of Computable Labs, a startup focused on building tools for the creation of data networks and data exchanges. Chen has also served as co-chair of O'Reilly's Artificial Intelligence Conference since its inception in 2016. This conversation took place the day after Chen and his collaborators released an interesting...

Duration:00:39:20

Tools for machine learning development

7/3/2019
The O'Reilly Data Show: Ben Lorica chats with Jeff Meyerson of Software Engineering Daily about data engineering, data architecture and infrastructure, and machine learning. In this week's episode of the Data Show, we're featuring an interview Data Show host Ben Lorica participated in for the Software Engineering Daily Podcast, where he was interviewed by Jeff Meyerson. Their conversation mainly centered around data engineering, data architecture and infrastructure, and machine learning...

Duration:00:39:24

Enabling end-to-end machine learning pipelines in real-world applications

6/20/2019
The O’Reilly Data Show Podcast: Nick Pentreath on overcoming challenges in productionizing machine learning models. In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group within IBM focused on building open source tools that enable...

Duration:00:42:53

Bringing scalable real-time analytics to the enterprise

6/6/2019
The O’Reilly Data Show Podcast: Dhruba Borthakur and Shruti Bhat on enabling interactive analytics and data applications against live data. In this episode of the Data Show, I spoke with Dhruba Borthakur (co-founder and CTO) and Shruti Bhat (SVP of Marketing) of Rockset, a startup focused on building solutions for interactive data science and live applications. Borthakur was the founding engineer of HDFS and creator of RocksDB, while Bhat is an experienced product and marketing executive...

Duration:00:37:11

Applications of data science and machine learning in financial services

5/23/2019
The O’Reilly Data Show Podcast: Jike Chong on the many exciting opportunities for data professionals in the U.S. and China. In this episode of the Data Show, I spoke with Jike Chong, chief data scientist at Acorns, a startup focused on building tools for micro-investing. Chong has extensive experience using analytics and machine learning in financial services, and he has experience building data science teams in the U.S. and in China. We had a great conversation spanning many topics,...

Duration:00:42:32

Real-time entity resolution made accessible

5/9/2019
The O’Reilly Data Show Podcast: Jeff Jonas on the evolution of entity resolution technologies. In this episode of the Data Show, I spoke with Jeff Jonas, CEO, founder and chief scientist of Senzing, a startup focused on making real-time entity resolution technologies broadly accessible. He was previously a fellow and chief scientist of context computing at IBM. Entity resolution (ER) refers to techniques and tools for identifying and linking manifestations of the same...

Duration:00:27:09

Why companies are in need of data lineage solutions

4/25/2019
The O’Reilly Data Show Podcast: Neelesh Salian on data lineage, data governance, and evolving data platforms. In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a shock, as current machine learning...

Duration:00:34:28

What data scientists and data engineers can do with current generation serverless technologies

4/11/2019
The O’Reilly Data Show Podcast: Avner Braverman on what’s missing from serverless today and what users should expect in the near future. In this episode of the Data Show, I spoke with Avner Braverman, co-founder and CEO of Binaris, a startup that aims to bring serverless to web-scale and enterprise applications. This conversation took place shortly after the release of a seminal paper from UC Berkeley (“Cloud Programming Simplified: A Berkeley View on Serverless Computing”), and this...

Duration:00:36:32

It’s time for data scientists to collaborate with researchers in other disciplines

3/28/2019
The O’Reilly Data Show Podcast: Forough Poursabzi-Sangdeh on the interdisciplinary nature of interpretable and interactive machine learning. In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi-Sangdeh works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas:...

Duration:00:36:08

Algorithms are shaping our lives—here’s how we wrest back control

3/14/2019
The O’Reilly Data Show Podcast: Kartik Hosanagar on the growing power and sophistication of algorithms. In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, A Human’s Guide to Machine Intelligence, an interesting tour through the recent evolution of AI applications that draws from his extensive...

Duration:00:44:15

Why your attention is like a piece of contested territory

2/28/2019
The O’Reilly Data Show Podcast: P.W. Singer on how social media has changed war, politics, and business. In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book is essential reading for anyone interested in how...

Duration:00:43:05

The technical, societal, and cultural challenges that come with the rise of fake media

2/14/2019
The O’Reilly Data Show Podcast: Siwei Lyu on machine learning for digital media forensics and image synthesis. In this episode of the Data Show, I spoke with Siwei Lyu, associate professor of computer science at the University at Albany, State University of New York. Lyu is a leading expert in digital media forensics, a field of research into tools and techniques for analyzing the authenticity of media files. Over the past year, there have been many stories written about the rise of tools...

Duration:00:30:53

Using machine learning and analytics to attract and retain employees

1/31/2019
The O’Reilly Data Show Podcast: Maryam Jahanshahi on building tools to help improve efficiency and fairness in how companies recruit. In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of AI technologies. Many...

Duration:00:46:54

How machine learning impacts information security

1/17/2019
The O’Reilly Data Show Podcast: Andrew Burt on the need to modernize data protection tools and strategies. In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer and legal engineer at Immuta, a company building data management tools tuned for data science. Burt and cybersecurity pioneer Daniel Geer recently released a must-read white paper (“Flat Light”) that provides a great framework for how to think about information security in the age of big data and AI....

Duration:00:39:48