O'Reilly Data Show

The O'Reilly Data Show explores the opportunities and techniques driving big data and data science. Through interviews and analysis, we highlight the people putting data to work.

More Information

Location:

Sebastopol, CA

Twitter:

@strataconf

Language:

English


Episodes

Vehicle-to-vehicle communication networks can help fuel smart cities

10/12/2017
The O’Reilly Data Show Podcast: Bruno Fernandez-Ruiz on the importance of building the ground control center of the future. In this episode of the Data Show, I spoke with Bruno Fernandez-Ruiz, co-founder and CTO of Nexar. We first met when he was leading Yahoo! technical teams charged with delivering a variety of large-scale, real-time data products. His new company is helping build out critical infrastructure for the emerging transportation sector. While some question whether V2X...

Duration: 00:45:01


Transforming organizations through analytics centers of excellence

9/28/2017
The O’Reilly Data Show Podcast: Carme Artigas on helping enterprises transform themselves with big data tools and technologies. In this episode of the Data Show, I spoke with Carme Artigas, co-founder and CEO of Synergic Partners (a Telefonica company). As more companies adopt big data technologies and techniques, it’s useful to remember that the end goal is to extract information and insight. In fact, as with any collection of tools and technologies, the main challenge is identifying and...

Duration: 00:38:41


The state of machine learning in Apache Spark

9/14/2017
The O’Reilly Data Show Podcast: Ion Stoica and Matei Zaharia explore the rich ecosystem of analytic tools around Apache Spark. In this episode of the Data Show, we look back to a recent conversation I had at the Spark Summit in San Francisco with Ion Stoica (UC Berkeley professor and executive chairman of Databricks) and Matei Zaharia (assistant professor at Stanford and chief technologist of Databricks). Stoica and Zaharia were core members of UC Berkeley’s AMPLab, which originated...

Duration: 00:21:40


Effective mechanisms for searching the space of machine learning algorithms

8/31/2017
The O’Reilly Data Show Podcast: Kenneth Stanley on neuroevolution and other principled ways of exploring the world without an objective. In this episode of the Data Show, I spoke with Ken Stanley, founding member of Uber AI Labs and associate professor at the University of Central Florida. Stanley is an AI researcher and a leading pioneer in the field of neuroevolution—a method for evolving and learning neural networks through evolutionary algorithms. In a recent survey article, Stanley...

Duration: 00:45:40


How Ray makes continuous learning accessible and easy to scale

8/17/2017
The O’Reilly Data Show Podcast: Robert Nishihara and Philipp Moritz on a new framework for reinforcement learning and AI applications. In this episode of the Data Show, I spoke with Robert Nishihara and Philipp Moritz, graduate students at UC Berkeley and members of RISE Lab. I wanted to get an update on Ray, an open source distributed execution framework that makes it easy for machine learning engineers and data scientists to scale reinforcement learning and other related continuous...

Duration: 00:18:28


Why AI and machine learning researchers are beginning to embrace PyTorch

8/3/2017
The O’Reilly Data Show Podcast: Soumith Chintala on building a worthy successor to Torch and on deep learning within Facebook. In this episode of the Data Show, I spoke with Soumith Chintala, AI research engineer at Facebook. Among his many research projects, Chintala was part of the team behind DCGAN (Deep Convolutional Generative Adversarial Networks), a widely cited paper that introduced a set of neural network architectures for unsupervised learning. Our conversation centered around...

Duration: 00:36:56


How big data and AI will reshape the automotive industry

7/20/2017
The O’Reilly Data Show Podcast: Evangelos Simoudis on next-generation mobility services. In this episode of the Data Show, I spoke with Evangelos Simoudis, co-founder of Synapse Partners and a frequent contributor to O’Reilly. He recently published a book entitled The Big Data Opportunity in Our Driverless Future, and I wanted to get his thoughts on the transportation industry and the role of big data and analytics in its future. Simoudis is an entrepreneur, and he also advises and invests...

Duration: 00:51:05


A framework for building and evaluating data products

7/6/2017
The O’Reilly Data Show Podcast: Pinterest data scientist Grace Huang on lessons learned in the course of machine learning product launches. In this episode of the Data Show, I spoke with Grace Huang, data science lead at Pinterest. With its combination of a large social graph, enthusiastic users, and multimedia data, I’ve long regarded Pinterest as a fascinating lab for data science. Huang described the challenge of building a sustainable content ecosystem and shared lessons from the...

Duration: 00:22:17


Building a next-generation platform for deep learning

6/29/2017
The O’Reilly Data Show Podcast: Naveen Rao on emerging hardware and software infrastructure for AI. In this episode of the Data Show, I speak with Naveen Rao, VP and GM of the Artificial Intelligence Products Group at Intel. In an earlier episode, we learned that scaling current deep learning models requires innovations in both software and hardware. Through his startup Nervana (since acquired by Intel), Rao has been at the forefront of building a next generation platform for deep...

Duration: 00:27:49


A scalable time-series database that supports SQL

6/22/2017
The O’Reilly Data Show Podcast: Michael Freedman on TimescaleDB and scaling SQL for time-series. In this episode of the Data Show, I spoke with Michael Freedman, CTO of Timescale and professor of computer science at Princeton University. When I first heard that Freedman and his collaborators were building a time-series database, my immediate reaction was: “Don’t we have enough options already?” The early incarnation of Timescale was a startup focused on IoT, and it was while building...

Duration: 00:49:12


Programming collective intelligence for financial trading

6/15/2017
The O’Reilly Data Show Podcast: Geoffrey Bradway on building a trading system that synthesizes many different models. In this episode of the Data Show, I spoke with Geoffrey Bradway, VP of engineering at Numerai, a new hedge fund that relies on contributions of external data scientists. The company hosts regular competitions where data scientists submit machine learning models for classification tasks. The most promising submissions are then added to an ensemble of models that the company...

Duration: 00:26:52


Creating large training data sets quickly

6/8/2017
The O’Reilly Data Show Podcast: Alex Ratner on why weak supervision is the key to unlocking dark data. In this episode of the Data Show, I spoke with Alex Ratner, a graduate student at Stanford and a member of Christopher Ré’s Hazy research group. Training data has always been important in building machine learning algorithms, and the rise of data-hungry deep learning models has heightened the need for labeled data sets. In fact, the challenge of creating training data is ongoing for many...

Duration: 00:47:09


Data science and deep learning in retail

5/25/2017
The O’Reilly Data Show Podcast: Jeremy Stanley on hiring and leading machine learning engineers to build world-class data products. In this episode of the Data Show, I spoke with Jeremy Stanley, VP of data science at Instacart, a popular grocery delivery service that is expanding rapidly. As Stanley describes it, Instacart operates a four-sided marketplace comprised of retail stores, products within the stores, shoppers assigned to the stores, and customers who order from Instacart. The...

Duration: 00:49:27


Language understanding remains one of AI’s grand challenges

5/11/2017
The O’Reilly Data Show Podcast: David Ferrucci on the evolution of AI systems for language understanding. In this episode of the Data Show, I spoke with David Ferrucci, founder of Elemental Cognition and senior technologist at Bridgewater Associates. Ferrucci served as principal investigator of IBM’s DeepQA project and led the Watson team that became champion of the Jeopardy! quiz show. Elemental Cognition (EC) is a research group focused on building an AI system that will be equipped...

Duration: 00:38:04


Data preparation in the age of deep learning

5/4/2017
The O’Reilly Data Show Podcast: Lukas Biewald on why companies are spending millions of dollars on labeled data sets. In this episode of the Data Show, I spoke with Lukas Biewald, co-founder and chief data scientist at CrowdFlower. In a previous episode we covered how the rise of deep learning is fueling the need for large labeled data sets and high-performance computing systems. CrowdFlower has a service that many leading companies have come to rely on to provide them with labeled data...

Duration: 00:36:16


Scaling machine learning

4/20/2017
The O’Reilly Data Show Podcast: Reza Zadeh on deep learning, hardware/software interfaces, and why computer vision is so exciting. In this episode of the Data Show, I spoke with Reza Zadeh, adjunct professor at Stanford University, co-organizer of ScaledML, and co-founder of Matroid, a startup focused on commercial applications of deep learning and computer vision. Zadeh also is the co-author of the forthcoming book TensorFlow for Deep Learning (now in early release). Our conversation...

Duration: 00:56:46


Architecting and building end-to-end streaming applications

4/6/2017
The O’Reilly Data Show Podcast: Karthik Ramasamy on Heron, DistributedLog, and designing real-time applications. In this episode of the Data Show, I spoke with Karthik Ramasamy, adjunct faculty member at UC Berkeley, former engineering manager at Twitter, and co-founder of Streamlio. Ramasamy managed the team that built Heron, an open source, distributed stream processing engine, compatible with Apache Storm. While Ramasamy has seen firsthand what it takes to build and deploy large-scale...

Duration: 00:45:10


Becoming a machine learning engineer

3/30/2017
The O’Reilly Data Show Podcast: Aurélien Géron on enabling companies to use machine learning in real-world products. In this episode of the Data Show, I spoke with Aurélien Géron, a serial entrepreneur, data scientist, and author of a popular, new book entitled Hands-on Machine Learning with Scikit-Learn and TensorFlow. Géron’s book is aimed at software engineers who want to learn machine learning and start deploying machine learning models in real-world products. As more companies adopt...

Duration: 00:41:03


Natural language analysis using Hierarchical Temporal Memory

3/23/2017
The O’Reilly Data Show Podcast: Francisco Webber on building HTM-based enterprise applications. In this episode of the Data Show, I spoke with Francisco Webber, founder of Cortical.io, a startup that is applying tools based on Hierarchical Temporal Memory (HTM) to natural language understanding. While HTM has been around for more than a decade, there aren’t many companies that have released products based on it (at least compared to other machine learning methods). Numenta, an...

Duration: 00:51:03


Saving the world—or at least the world’s scientific and government data

3/14/2017
The O’Reilly Data Show Podcast: Max Ogden on data preservation, distributed trust, and bringing cutting-edge technology to journalism. In this special episode of the Data Show, O'Reilly's Jenn Webb speaks with Maxwell Ogden, director of Code for Science and Society. Recently, Ogden and Code for Science have been working on the ongoing rescue of data.gov and assisting with other data rescue projects, such as Data Refuge; they’re also the nonprofit developers supporting Dat, a data...

Duration: 00:40:33
