O'Reilly Data Show

The O'Reilly Data Show explores the opportunities and techniques driving big data and data science. Through interviews and analysis, we highlight the people putting data to work.

More Information

Location:

Sebastopol, CA

Twitter:

@strataconf

Language:

English


Episodes

Lessons learned while helping enterprises adopt machine learning

11/8/2018
The O’Reilly Data Show Podcast: Francesca Lazzeri and Jaya Mathew on digital transformation, culture and organization, and the team data science process. In this episode of the Data Show, I spoke with Francesca Lazzeri, an AI and machine learning scientist at Microsoft, and her colleague Jaya Mathew, a senior data scientist at Microsoft. We conducted a couple of surveys this year—“How Companies Are Putting AI to Work Through Deep Learning” and “The State of Machine Learning Adoption in the...

Duration:00:31:30

Machine learning on encrypted data

10/25/2018
The O’Reilly Data Show Podcast: Alon Kaufman on the interplay between machine learning, encryption, and security. In this episode of the Data Show, I spoke with Alon Kaufman, CEO and co-founder of Duality Technologies, a startup building tools that will allow companies to apply analytics and machine learning to encrypted data. In a recent talk, I described the importance of data, various methods for estimating the value of data, and emerging tools for incentivizing data sharing across...

Duration:00:41:22

How social science research can inform the design of AI systems

10/11/2018
The O’Reilly Data Show Podcast: Jacob Ward on the interplay between psychology, decision-making, and AI systems. In this episode of the Data Show, I spoke with Jacob Ward, a Berggruen Fellow at Stanford University. Ward has an extensive background in journalism, mainly covering topics in science and technology, at National Geographic, Al Jazeera, Discovery Channel, BBC, Popular Science, and many other outlets. Most recently, he’s become interested in the interplay between research in...

Duration:00:45:30

Why it’s hard to design fair machine learning models

9/27/2018
The O’Reilly Data Show Podcast: Sharad Goel and Sam Corbett-Davies on the limitations of popular mathematical formalizations of fairness. In this episode of the Data Show, I spoke with Sharad Goel, assistant professor at Stanford, and his student Sam Corbett-Davies. They recently wrote a survey paper, “A Critical Review of Fair Machine Learning,” where they carefully examined the standard statistical tools used to check for fairness in machine learning models. It turns out that each of...

Duration:00:34:23

Using machine learning to improve dialog flow in conversational applications

9/13/2018
The O’Reilly Data Show Podcast: Alan Nichol on building a suite of open source tools for chatbot developers. In this episode of the Data Show, I spoke with Alan Nichol, co-founder and CTO of Rasa, a startup that builds open source tools to help developers and product teams build conversational applications. About 18 months ago, there was tremendous excitement and hype surrounding chatbots, and while things have quieted lately, companies and developers continue to refine and define tools...

Duration:00:45:22

Building accessible tools for large-scale computation and machine learning

8/30/2018
The O’Reilly Data Show Podcast: Eric Jonas on Pywren, scientific computation, and machine learning. In this episode of the Data Show, I spoke with Eric Jonas, a postdoc in the new Berkeley Center for Computational Imaging. Jonas is also affiliated with UC Berkeley’s RISE Lab. It was at a RISE Lab event that he first announced Pywren, a framework that lets data enthusiasts proficient with Python run existing code at massive scale on Amazon Web Services. Jonas and his collaborators are...

Duration:00:53:31

Simplifying machine learning lifecycle management

8/16/2018
The O’Reilly Data Show Podcast: Harish Doddi on accelerating the path from prototype to production. In this episode of the Data Show, I spoke with Harish Doddi, co-founder and CEO of Datatron, a startup focused on helping companies deploy and manage machine learning models. As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s data science and data engineering teams...

Duration:00:37:24

How privacy-preserving techniques can lead to more robust machine learning models

8/2/2018
The O’Reilly Data Show Podcast: Chang Liu on operations research, and the interplay between differential privacy and machine learning. In this episode of the Data Show, I spoke with Chang Liu, applied research scientist at Georgian Partners. In a previous post, I highlighted early tools for privacy-preserving analytics, both for improving decision-making (business intelligence and analytics) and for enabling automation (machine learning). One of the tools I mentioned is an open source...

Duration:00:36:43

Specialized hardware for deep learning will unleash innovation

7/19/2018
The O’Reilly Data Show Podcast: Andrew Feldman on why deep learning is ushering in a golden age for compute architecture. In this episode of the Data Show, I spoke with Andrew Feldman, founder and CEO of Cerebras Systems, a startup in the blossoming area of specialized hardware for machine learning. Since the release of AlexNet in 2012, we have seen an explosion in activity in machine learning, particularly in deep learning. A lot of the work to date happened primarily on general purpose...

Duration:00:41:17

Data regulations and privacy discussions are still in the early stages

7/5/2018
The O’Reilly Data Show Podcast: Aurélie Pols on GDPR, ethics, and ePrivacy. In this episode of the Data Show, I spoke with Aurélie Pols of Mind Your Privacy, one of my go-to resources when it comes to data privacy and data ethics. This interview took place at Strata Data London, a couple of days before the EU General Data Protection Regulation (GDPR) took effect. I wanted her perspective on this landmark regulation, as well as her take on trends in data privacy and growing interest in...

Duration:00:33:18

Managing risk in machine learning models

6/21/2018
The O’Reilly Data Show Podcast: Andrew Burt and Steven Touw on how companies can manage models they cannot fully explain. In this episode of the Data Show, I spoke with Andrew Burt, chief privacy officer at Immuta, and Steven Touw, co-founder and CTO of Immuta. Burt recently co-authored an upcoming white paper on managing risk in machine learning models, and I wanted to sit down with them to discuss some of the proposals they put forward to organizations that are deploying machine...

Duration:00:32:33

The real value of data requires a holistic view of the end-to-end data pipeline

6/7/2018
The O’Reilly Data Show Podcast: Ashok Srivastava on the emergence of machine learning and AI for enterprise applications. In this episode of the Data Show, I spoke with Ashok Srivastava, senior vice president and chief data officer at Intuit. He has a strong science and engineering background, combined with years of applying machine learning and data science in industry. Prior to joining Intuit, he led the teams responsible for data and artificial intelligence products at Verizon. I...

Duration:00:31:04

The evolution of data science, data engineering, and AI

5/24/2018
The O’Reilly Data Show Podcast: A special episode to mark the 100th episode. This episode of the Data Show marks our 100th episode. This podcast stemmed out of video interviews conducted at O’Reilly’s 2014 Foo Camp. We had a collection of friends who were key members of the data science and big data communities on hand and we decided to record short conversations with them. We originally conceived of using those initial conversations to be the basis of a regular series of video...

Duration:00:30:14

Companies in China are moving quickly to embrace AI technologies

5/10/2018
The O’Reilly Data Show Podcast: Jason Dai on the first year of BigDL and AI in China. In this episode of the Data Show, I spoke with Jason Dai, CTO of Big Data Technologies at Intel, and one of my co-chairs for the AI Conference in Beijing. I wanted to check in on the status of BigDL, specifically how companies have been using this deep learning library on top of Apache Spark, and discuss some newly added features. It turns out there are quite a number of companies already using BigDL in...

Duration:00:28:52

Teaching and implementing data science and AI in the enterprise

4/26/2018
The O’Reilly Data Show Podcast: Jerry Overton on organizing data teams, agile experimentation, and the importance of ethics in data science. In this episode of the Data Show, I spoke with Jerry Overton, senior principal and distinguished technologist at DXC Technology. I wanted the perspective of someone who works across industries and with a variety of companies. I specifically wanted to explore the current state of data science and AI within companies and public sector agencies. As much...

Duration:00:38:45

The importance of transparency and user control in machine learning

4/12/2018
The O’Reilly Data Show Podcast: Guillaume Chaslot on bias and extremism in content recommendations. In this episode of the Data Show, I spoke with Guillaume Chaslot, an ex-YouTube engineer and founder of AlgoTransparency, an organization dedicated to helping the public understand the profound impact algorithms have on our lives. We live in an age when many of our interactions with companies and services are governed by algorithms. At a time when their impact continues to grow, there are...

Duration:00:23:19

What machine learning engineers need to know

3/29/2018
The O’Reilly Data Show Podcast: Jesse Anderson and Paco Nathan on organizing data teams and next-generation messaging with Apache Pulsar. In this episode of the Data Show, I spoke with Jesse Anderson, managing director of the Big Data Institute, and my colleague Paco Nathan, who recently became co-chair of JupyterCon. This conversation grew out of a recent email thread the three of us had on machine learning engineers, a new job role that LinkedIn recently pegged as the fastest growing job in...

Duration:00:20:53

How to train and deploy deep learning at scale

3/15/2018
The O’Reilly Data Show Podcast: Ameet Talwalkar on large-scale machine learning. In this episode of the Data Show, I spoke with Ameet Talwalkar, assistant professor of machine learning at CMU and co-founder of Determined AI. He was an early and key contributor to Spark MLlib and a member of AMPLab. Most recently, he helped conceive and organize the first edition of SysML, a new academic conference at the intersection of systems and machine learning (ML). We discussed using and deploying...

Duration:00:39:09

Using machine learning to monitor and optimize chatbots

3/6/2018
The O’Reilly Data Show Podcast: Ofer Ronen on the current state of chatbots. In this episode of the Data Show, I spoke with Ofer Ronen, GM of Chatbase, a startup housed within Google’s Area 120. With tools for building chatbots becoming accessible, conversational interfaces are becoming more prevalent. As Ronen highlights in our conversation, chatbots are already enabling companies to automate many routine tasks (mainly in customer interaction). We are still in the early days of chatbots,...

Duration:00:27:46

Unleashing the potential of reinforcement learning

3/1/2018
The O’Reilly Data Show Podcast: Danny Lange on how reinforcement learning can accelerate software development and how it can be democratized. In this episode of the Data Show, I spoke with Danny Lange, VP of AI and machine learning at Unity Technologies. Lange previously led data and machine learning teams at Microsoft, Amazon, and Uber, where his teams were responsible for building data science tools used by other developers and analysts within those companies. When I first heard that he...

Duration:00:33:24