O'Reilly Data Show

The O'Reilly Data Show explores the opportunities and techniques driving big data and data science. Through interviews and analysis, we highlight the people putting data to work.

More Information

Location:

Sebastopol, CA

Twitter:

@strataconf

Language:

English


Episodes

Enabling end-to-end machine learning pipelines in real-world applications

6/20/2019
The O’Reilly Data Show Podcast: Nick Pentreath on overcoming challenges in productionizing machine learning models. In this episode of the Data Show, I spoke with Nick Pentreath, principal engineer at IBM. Pentreath was an early and avid user of Apache Spark, and he subsequently became a Spark committer and PMC member. Most recently his focus has been on machine learning, particularly deep learning, and he is part of a group within IBM focused on building open source tools that enable...

Duration:00:42:53

Bringing scalable real-time analytics to the enterprise

6/6/2019
The O’Reilly Data Show Podcast: Dhruba Borthakur and Shruti Bhat on enabling interactive analytics and data applications against live data. In this episode of the Data Show, I spoke with Dhruba Borthakur (co-founder and CTO) and Shruti Bhat (SVP of Marketing) of Rockset, a startup focused on building solutions for interactive data science and live applications. Borthakur was the founding engineer of HDFS and creator of RocksDB, while Bhat is an experienced product and marketing executive...

Duration:00:37:11

Applications of data science and machine learning in financial services

5/23/2019
The O’Reilly Data Show Podcast: Jike Chong on the many exciting opportunities for data professionals in the U.S. and China. In this episode of the Data Show, I spoke with Jike Chong, chief data scientist at Acorns, a startup focused on building tools for micro-investing. Chong has extensive experience using analytics and machine learning in financial services, and he has experience building data science teams in the U.S. and in China. We had a great conversation spanning many topics,...

Duration:00:42:32

Real-time entity resolution made accessible

5/9/2019
The O’Reilly Data Show Podcast: Jeff Jonas on the evolution of entity resolution technologies. In this episode of the Data Show, I spoke with Jeff Jonas, CEO, founder and chief scientist of Senzing, a startup focused on making real-time entity resolution technologies broadly accessible. He was previously a fellow and chief scientist of context computing at IBM. Entity resolution (ER) refers to techniques and tools for identifying and linking manifestations of the same...

Duration:00:27:09

Why companies are in need of data lineage solutions

4/25/2019
The O’Reilly Data Show Podcast: Neelesh Salian on data lineage, data governance, and evolving data platforms. In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a shock, as current machine learning...

Duration:00:34:28

What data scientists and data engineers can do with current generation serverless technologies

4/11/2019
The O’Reilly Data Show Podcast: Avner Braverman on what’s missing from serverless today and what users should expect in the near future. In this episode of the Data Show, I spoke with Avner Braverman, co-founder and CEO of Binaris, a startup that aims to bring serverless to web-scale and enterprise applications. This conversation took place shortly after the release of a seminal paper from UC Berkeley (“Cloud Programming Simplified: A Berkeley View on Serverless Computing”), and this...

Duration:00:36:32

It’s time for data scientists to collaborate with researchers in other disciplines

3/28/2019
The O’Reilly Data Show Podcast: Forough Poursabzi-Sangdeh on the interdisciplinary nature of interpretable and interactive machine learning. In this episode of the Data Show, I spoke with Forough Poursabzi-Sangdeh, a postdoctoral researcher at Microsoft Research New York City. Poursabzi-Sangdeh works in the interdisciplinary area of interpretable and interactive machine learning. As models and algorithms become more widespread, many important considerations are becoming active research areas:...

Duration:00:36:08

Algorithms are shaping our lives—here’s how we wrest back control

3/14/2019
The O’Reilly Data Show Podcast: Kartik Hosanagar on the growing power and sophistication of algorithms. In this episode of the Data Show, I spoke with Kartik Hosanagar, professor of technology and digital business, and professor of marketing at The Wharton School of the University of Pennsylvania. Hosanagar is also the author of a newly released book, A Human’s Guide to Machine Intelligence, an interesting tour through the recent evolution of AI applications that draws from his extensive...

Duration:00:44:15

Why your attention is like a piece of contested territory

2/28/2019
The O’Reilly Data Show Podcast: P.W. Singer on how social media has changed war, politics, and business. In this episode of the Data Show, I spoke with P.W. Singer, strategist and senior fellow at the New America Foundation, and a contributing editor at Popular Science. He is co-author of an excellent new book, LikeWar: The Weaponization of Social Media, which explores how social media has changed war, politics, and business. The book is essential reading for anyone interested in how...

Duration:00:43:05

The technical, societal, and cultural challenges that come with the rise of fake media

2/14/2019
The O’Reilly Data Show Podcast: Siwei Lyu on machine learning for digital media forensics and image synthesis. In this episode of the Data Show, I spoke with Siwei Lyu, associate professor of computer science at the University at Albany, State University of New York. Lyu is a leading expert in digital media forensics, a field of research into tools and techniques for analyzing the authenticity of media files. Over the past year, there have been many stories written about the rise of tools...

Duration:00:30:53

Using machine learning and analytics to attract and retain employees

1/31/2019
The O’Reilly Data Show Podcast: Maryam Jahanshahi on building tools to help improve efficiency and fairness in how companies recruit. In this episode of the Data Show, I spoke with Maryam Jahanshahi, research scientist at TapRecruit, a startup that uses machine learning and analytics to help companies recruit more effectively. In an upcoming survey, we found that a “skills gap” or “lack of skilled people” was one of the main bottlenecks holding back adoption of AI technologies. Many...

Duration:00:46:54

How machine learning impacts information security

1/17/2019
The O’Reilly Data Show Podcast: Andrew Burt on the need to modernize data protection tools and strategies. In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer and legal engineer at Immuta, a company building data management tools tuned for data science. Burt and cybersecurity pioneer Daniel Geer recently released a must-read white paper (“Flat Light”) that provides a great framework for how to think about information security in the age of big data and AI....

Duration:00:39:48

In the age of AI, fundamental value resides in data

1/3/2019
The O’Reilly Data Show Podcast: Haoyuan Li on accelerating analytic workloads, and innovation in data and AI in China. In this episode of the Data Show, I spoke with Haoyuan Li, CEO and founder of Alluxio, a startup commercializing the open source project with the same name (full disclosure: I’m an advisor to Alluxio). Our discussion focuses on the state of Alluxio (the open source project that has roots in UC Berkeley’s AMPLab), specifically emerging use cases here and in China. Given...

Duration:00:29:41

Trends in data, machine learning, and AI

12/20/2018
The O’Reilly Data Show Podcast: Ben Lorica looks ahead at what we can expect in 2019 in the big data landscape. For the end-of-year holiday episode of the Data Show, I turned the tables on Data Show host Ben Lorica to talk about trends in big data, machine learning, and AI, and what to look for in 2019. Lorica also showcased some highlights from our upcoming Strata Data and Artificial Intelligence conferences. Here are some highlights from our conversation: Real-world use cases for new...

Duration:00:28:36

Tools for generating deep neural networks with efficient network architectures

12/6/2018
The O’Reilly Data Show Podcast: Alex Wong on building human-in-the-loop automation solutions for enterprise machine learning. In this episode of the Data Show, I spoke with Alex Wong, associate professor at the University of Waterloo, and co-founder of DarwinAI, a startup that uses AI to address foundational challenges with deep learning in the enterprise. As the use of machine learning and analytics becomes more widespread, we’re beginning to see tools that enable data scientists and data...

Duration:00:32:20

Building tools for enterprise data science

11/21/2018
The O’Reilly Data Show Podcast: Vitaly Gordon on the rise of automation tools in data science. In this episode of the Data Show, I spoke with Vitaly Gordon, VP of data science and engineering at Salesforce. As the use of machine learning becomes more widespread, we need tools that will allow data scientists to scale so they can tackle many more problems and help many more people. We need automation tools for the many stages involved in data science, including data preparation, feature...

Duration:00:31:28

Lessons learned while helping enterprises adopt machine learning

11/8/2018
The O’Reilly Data Show Podcast: Francesca Lazzeri and Jaya Mathew on digital transformation, culture and organization, and the team data science process. In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption in...

Duration:00:31:30

Machine learning on encrypted data

10/25/2018
The O’Reilly Data Show Podcast: Alon Kaufman on the interplay between machine learning, encryption, and security. In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics and machine learning to encrypted data. In a recent talk, I described the importance of data, various methods for estimating the value of data, and emerging tools for incentivizing data sharing across...

Duration:00:41:22

How social science research can inform the design of AI systems

10/11/2018
The O’Reilly Data Show Podcast: Jacob Ward on the interplay between psychology, decision-making, and AI systems. In this episode of the Data Show, I spoke with Jacob Ward, a Berggruen Fellow at Stanford University. Ward has an extensive background in journalism, mainly covering topics in science and technology, at National Geographic, Al Jazeera, Discovery Channel, BBC, Popular Science, and many other outlets. Most recently, he’s become interested in the interplay between research in...

Duration:00:45:30

Why it’s hard to design fair machine learning models

9/27/2018
The O’Reilly Data Show Podcast: Sharad Goel and Sam Corbett-Davies on the limitations of popular mathematical formalizations of fairness. In this episode of the Data Show, I spoke with Sharad Goel, assistant professor at Stanford, and his student Sam Corbett-Davies. They recently wrote a survey paper, “A Critical Review of Fair Machine Learning,” where they carefully examined the standard statistical tools used to check for fairness in machine learning models. It turns out that each of...

Duration:00:34:23