Data Skeptic Podcast

More Information

Location:

United States

Description:

The Data Skeptic Podcast features conversations with researchers and other professionals active in applying data science to real-world problems. The topics relate to data science, statistics, machine learning, artificial intelligence, and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and the efficacy of approaches. The podcast has an alternating format: even episodes feature long-form conversations, and odd episodes feature short discussions of data science topics aimed at listeners who might not be familiar with some of the topics discussed on the show.

Language:

English


Episodes

[MINI] Recurrent Neural Networks

8/18/2017
RNNs are a class of deep learning models designed to capture sequential behavior. An RNN trains a set of weights which depend not just on new input but also on the previous state of the neural network. This directed cycle allows the training phase to find solutions which rely on the state at a previous time, thus giving the network a form of memory. RNNs have been used effectively in language analysis, translation, speech recognition, and many other tasks.

Duration: 00:17:05
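
The dependence on the previous state described in this episode summary can be illustrated with a minimal sketch. It assumes a scalar Elman-style recurrence with a tanh nonlinearity and hand-picked weights; the names `rnn_step` and `run_rnn` and the weight values are hypothetical, and no training takes place here.

```python
import math

def rnn_step(x, h_prev, w_xh, w_hh, b):
    """One recurrent step: the new state depends on the input AND the previous state."""
    return math.tanh(w_xh * x + w_hh * h_prev + b)

def run_rnn(inputs, w_xh=0.5, w_hh=0.8, b=0.0):
    """Feed a sequence through the cell, starting from a zero state."""
    h = 0.0
    states = []
    for x in inputs:
        h = rnn_step(x, h, w_xh, w_hh, b)
        states.append(h)
    return states

# Illustrative weights only (assumed, not learned).
states_a = run_rnn([1.0, 0.0, 0.0])  # a single pulse at the first time step
states_b = run_rnn([0.0, 0.0, 0.0])  # no input at all
print(states_a)
print(states_b)
```

Both sequences receive the same zero input at the last two steps, yet the first run still carries a nonzero state there. That carried-over state is the "memory" the summary refers to.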


Project Common Voice

8/11/2017
Thanks to our sponsor Springboard. In this week's episode, guest Andre Natal from Mozilla joins our host, Kyle Polich, to discuss a couple of exciting new developments in open source speech recognition systems, including Project Common Voice. In June 2017, Mozilla launched a new open source project, Common Voice, a novel complementary project to the TensorFlow-based DeepSpeech implementation. DeepSpeech is a deep learning-based voice recognition system that was designed by Baidu, which...

Duration: 00:31:13


[MINI] Bayesian Belief Networks

8/4/2017
A Bayesian Belief Network is a directed acyclic graph composed of nodes that represent random variables and edges that imply a conditional dependence between them. It's an intuitive way of encoding your statistical knowledge about a system, and it makes propagating belief updates throughout the network efficient when new information is added.

Duration: 00:17:02
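
As a rough illustration of a belief update, the sketch below assumes the smallest possible network, a single edge Rain -> WetGrass. The variables, the probabilities, and the function name `posterior_rain` are all made up for this example and do not come from the episode.

```python
# Hypothetical two-node network: Rain -> WetGrass. All numbers are invented.
p_rain = 0.2                             # prior P(Rain)
p_wet_given = {True: 0.9, False: 0.1}    # P(WetGrass | Rain)

def posterior_rain(wet_observed):
    """Return P(Rain | WetGrass = wet_observed) by enumerating the parent node."""
    def likelihood(rain):
        p_wet = p_wet_given[rain]
        return p_wet if wet_observed else 1.0 - p_wet

    joint_rain = p_rain * likelihood(True)
    joint_dry = (1.0 - p_rain) * likelihood(False)
    return joint_rain / (joint_rain + joint_dry)

print(posterior_rain(True))   # belief in rain after seeing wet grass
print(posterior_rain(False))  # belief in rain after seeing dry grass
```

Observing wet grass raises the belief in rain above the prior, and observing dry grass lowers it. Larger networks apply the same idea along every edge.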


pix2code

7/28/2017
In this episode, Tony Beltramelli of UIzard Technologies joins our host, Kyle Polich, to talk about the ideas behind his latest app that can transform graphic design into functioning code, as well as his previous work on spying with wearables.

Duration: 00:26:58


[MINI] Conditional Independence

7/21/2017
In statistics, two random variables might depend on one another (for example, interest rates and new home purchases). We call this conditional dependence. An important related concept is conditional independence. This phrase describes situations in which two variables are independent of one another given some other variable. For example, the probability that a vendor will pay their bill on time could depend on many factors such as the company's market cap. Thus, a statistical...

Duration: 00:15:24
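
A small numeric check can make the definition concrete. The sketch below assumes three hypothetical binary variables, where `z` influences both `x` and `y`. The probabilities are invented, and the helper names `bern` and `prob` are not from the episode.

```python
from itertools import product

# Hypothetical model: z is a common cause of x and y. All numbers are invented.
p_z = {0: 0.5, 1: 0.5}
p_x1_given_z = {0: 0.1, 1: 0.9}   # P(x = 1 | z)
p_y1_given_z = {0: 0.1, 1: 0.9}   # P(y = 1 | z)

def bern(p_one, value):
    """Probability of `value` for a binary variable with P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one

# Full joint table P(x, y, z) built from the factorization P(z) P(x|z) P(y|z).
joint = {
    (x, y, z): p_z[z] * bern(p_x1_given_z[z], x) * bern(p_y1_given_z[z], y)
    for x, y, z in product((0, 1), repeat=3)
}

def prob(**fixed):
    """Sum the joint over every outcome that matches the fixed variables."""
    total = 0.0
    for (x, y, z), p in joint.items():
        values = {"x": x, "y": y, "z": z}
        if all(values[name] == v for name, v in fixed.items()):
            total += p
    return total

# Given z, the joint of x and y factorizes ...
p_xy_given_z = prob(x=1, y=1, z=1) / prob(z=1)
p_x_given_z = prob(x=1, z=1) / prob(z=1)
p_y_given_z = prob(y=1, z=1) / prob(z=1)
print(p_xy_given_z, p_x_given_z * p_y_given_z)

# ... but without conditioning on z, x and y are dependent.
print(prob(x=1, y=1), prob(x=1) * prob(y=1))
```

The first pair of printed values agrees, which is conditional independence given `z`. The second pair differs, showing that `x` and `y` are still dependent when `z` is unknown.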


Estimating Sheep Pain with Facial Recognition

7/14/2017
Animals can't tell us when they're experiencing pain, so we have to rely on other cues to help treat their discomfort. But it is often difficult to tell how much an animal is suffering. The sheep, for instance, is the most inscrutable of animals. However, scientists have figured out a way to understand sheep facial expressions using artificial intelligence. On this week's episode, Dr. Marwa Mahmoud from the University of Cambridge joins us to discuss her recent study, "Estimating Sheep...

Duration: 00:27:04


CosmosDB

7/7/2017
This episode collects interviews from my recent trip to Microsoft Build where I had the opportunity to speak with Dharma Shukla and Syam Nair about the recently announced CosmosDB. CosmosDB is a globally consistent, distributed datastore that supports all the popular persistent storage formats (relational, key/value pair, document database, and graph) under a single streamlined API. The system provides tunable consistency, allowing the user to make choices about how consistency trade-offs...

Duration: 00:33:32


[MINI] The Vanishing Gradient

6/30/2017
This episode discusses the vanishing gradient - a problem that arises when training deep neural networks in which nearly all the gradients are very close to zero by the time back-propagation has reached the first hidden layer. This makes learning virtually impossible without some clever trick or improved methodology to help earlier layers begin to learn.

Duration: 00:15:15
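
To see why the gradient shrinks on its way back to the first hidden layer, consider a deliberately simplified sketch. It assumes a chain of single sigmoid units with one weight each, all held at the same pre-activation, so the backpropagated gradient is a product of identical per-layer factors. The function name `gradient_at_first_layer` and its defaults are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; it never exceeds 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)

def gradient_at_first_layer(depth, weight=1.0, pre_activation=0.0):
    """Multiply the per-layer chain-rule factors for a chain of `depth` units."""
    grad = 1.0
    for _ in range(depth):
        grad *= weight * sigmoid_grad(pre_activation)
    return grad

for depth in (1, 5, 10, 20):
    print(depth, gradient_at_first_layer(depth))
```

Even in the best case for the sigmoid (a pre-activation of zero), each layer scales the gradient by at most a quarter under these assumptions, so the product decays geometrically with depth.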


Doctor AI

6/23/2017
When faced with medical issues, would you want to be seen by a human or a machine? In this episode, guest Edward Choi, co-author of the study titled "Doctor AI: Predicting Clinical Events via Recurrent Neural Network", shares his thoughts. Edward presents his team’s efforts in developing a temporal model that can learn from human doctors based on their collective knowledge, i.e. the large amount of Electronic Health Record (EHR) data.

Duration: 00:41:49


Activation Functions

6/16/2017
In a neural network, the output value of a neuron is almost always transformed in some way using a function. A trivial choice would be a linear transformation, which can only scale the data. However, other transformations, such as a step function, allow non-linear properties to be introduced. Activation functions can also help to standardize your data between layers. Some functions, such as the sigmoid, have the effect of "focusing" the area of interest on data. Extreme values are placed...

Duration: 00:14:09
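
The three kinds of transformation mentioned in this summary can be compared side by side. This is only a sketch: the `scale` parameter of `linear` and the sample inputs are arbitrary choices made for illustration.

```python
import math

def linear(z, scale=2.0):
    """A linear activation can only rescale its input."""
    return scale * z

def step(z):
    """A step function introduces a hard non-linearity at zero."""
    return 1.0 if z >= 0 else 0.0

def sigmoid(z):
    """The sigmoid squashes any input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-10.0, -1.0, 0.0, 1.0, 10.0):
    print(z, linear(z), step(z), sigmoid(z))
```

In the printed table, the sigmoid pushes the extreme inputs toward 0 and 1 while changing most rapidly near zero.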


MS Build 2017

6/9/2017
This episode recaps the Microsoft Build Conference. Kyle recently attended and shares some thoughts on cloud, databases, cognitive services, and artificial intelligence. The episode includes interviews with Rohan Kumar and David Carmona.

Duration: 00:27:36


[MINI] Max-pooling

6/2/2017
Max-pooling is a procedure in a neural network which has several benefits. It performs dimensionality reduction by taking a collection of neurons and reducing them to a single value for future layers to receive as input. It can also prevent overfitting, since it takes a large set of inputs and admits only one value, making it harder to memorize the input. In this episode, we discuss the intuitive interpretation of max-pooling and why it's more common than mean-pooling or (theoretically)...

Duration: 00:12:31
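
The dimensionality reduction described here can be sketched in a few lines. The example assumes a 2x2 window with stride 2 over a small grid of activations with even dimensions. The function name `max_pool_2x2` and the sample grid are made up for illustration.

```python
def max_pool_2x2(grid):
    """Reduce each non-overlapping 2x2 block of the grid to its largest value."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

# Hypothetical 4x4 block of neuron outputs.
activations = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool_2x2(activations)
print(pooled)  # [[4, 2], [2, 8]]
```

Sixteen inputs become four outputs, and only the strongest value in each block is passed on to the next layer.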


Unsupervised Depth Perception

5/26/2017
This episode is an interview with Tinghui Zhou. In the recent paper "Unsupervised Learning of Depth and Ego-motion from Video", Tinghui and collaborators propose a deep learning architecture which is able to learn depth and pose information from unlabeled videos. We discuss details of this project and its applications.

Duration: 00:23:42


[MINI] Convolutional Neural Networks

5/19/2017
CNNs are characterized by their use of a group of neurons typically referred to as a filter or kernel. In image recognition, this kernel is repeated over the entire image. In this way, CNNs may achieve the property of translational invariance - once trained to recognize certain things, changing the position of that thing in an image should not disrupt the CNN's ability to recognize it. In this episode, we discuss a few high-level details of this important architecture.

Duration: 00:14:53
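
A one-dimensional sketch can show what repeating a kernel over the input buys. It assumes a fixed, hand-written kernel rather than a learned one, and it takes the maximum response over all positions as one simple way to get a position-independent score. The names `slide_kernel`, `signal_a`, and `signal_b` are hypothetical.

```python
def slide_kernel(signal, kernel):
    """Apply the same kernel at every position of the signal (no padding)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

kernel = [1, 2, 1]                  # assumed detector for a small "bump"
signal_a = [0, 1, 2, 1, 0, 0, 0]    # bump near the start
signal_b = [0, 0, 0, 1, 2, 1, 0]    # the same bump shifted two positions

responses_a = slide_kernel(signal_a, kernel)
responses_b = slide_kernel(signal_b, kernel)
print(responses_a)                  # [4, 6, 4, 1, 0]
print(responses_b)                  # [0, 1, 4, 6, 4]
print(max(responses_a), max(responses_b))
```

The response pattern moves along with the bump, and its peak value is the same in both cases, so a detector built on the peak does not care where the bump sits.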


Multi-Agent Diverse Generative Adversarial Networks

5/12/2017
Despite the success of GANs in imaging, one of their major drawbacks is the problem of 'mode collapse,' where the generator learns to produce samples with extremely low variety. To address this issue, today's guests Arnab Ghosh and Viveka Kulharia proposed two different extensions. The first involves tweaking the generator's objective function with a diversity-enforcing term that would assess similarities between the different samples generated by different generators. The second comprises...

Duration: 00:29:17


[MINI] Generative Adversarial Networks

5/5/2017
GANs are an unsupervised learning method involving two neural networks iteratively competing. The discriminator is a typical learning system. It attempts to develop the ability to recognize members of a certain class, such as all photos which have birds in them. The generator attempts to create false examples which the discriminator incorrectly classifies. In successive training rounds, the networks examine each other and play a mini-max game of trying to harm the performance of the other. In...

Duration: 00:09:50


Opinion Polls for Presidential Elections

4/28/2017
Recently, we've seen opinion polls come under some skepticism. But is that skepticism truly justified? The recent Brexit referendum and the 2016 US Presidential Election are examples where some claim the polls "got it wrong". This episode explores this idea.

Duration: 00:52:58


OpenHouse

4/21/2017
No reliable, complete database cataloging home sales data at a transaction level is available for the average person to access. For data scientists interested in studying this data, our hands are completely tied. Opportunities like testing sociological theories, exploring economic impacts, studying market forces, or simply researching the value of an investment when buying a home are all blocked by the lack of easy access to this dataset. OpenHouse seeks to correct that by centralizing and...

Duration: 00:26:16


[MINI] GPU CPU

4/14/2017
There's more than one type of computer processor. The central processing unit (CPU) is typically what one means when they say "processor". GPUs were introduced to be highly optimized for doing floating-point computations in parallel. These types of operations were very useful for high-end video games, but as it turns out, those same processors are extremely useful for machine learning. In this mini-episode we discuss why.

Duration: 00:11:02


[MINI] Backpropagation

4/7/2017
Backpropagation is a common algorithm for training a neural network. It works by computing the gradient of the overall error with respect to each weight, and using stochastic gradient descent to iteratively fine-tune the weights of the network. In this episode, we compare this concept to finding a location on a map, marble maze games, and golf.

Duration: 00:15:12
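
The procedure in this summary can be sketched on the smallest network that still needs the chain rule. The sketch assumes one sigmoid hidden unit feeding one linear output, a squared-error loss, and plain gradient descent on a single training example rather than stochastic mini-batches. All names, initial weights, and the learning rate are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2):
    """Hidden activation h = sigmoid(w1 * x), output y = w2 * h."""
    h = sigmoid(w1 * x)
    return h, w2 * h

def loss(x, target, w1, w2):
    _, y = forward(x, w1, w2)
    return 0.5 * (y - target) ** 2

def gradients(x, target, w1, w2):
    """Propagate the error backwards through the network with the chain rule."""
    h, y = forward(x, w1, w2)
    d_y = y - target                   # dL/dy
    d_w2 = d_y * h                     # dL/dw2
    d_h = d_y * w2                     # dL/dh
    d_w1 = d_h * h * (1.0 - h) * x     # dL/dw1, via the sigmoid derivative
    return d_w1, d_w2

def train(x, target, w1=0.5, w2=0.5, lr=0.5, steps=200):
    """Repeatedly step each weight against its gradient."""
    for _ in range(steps):
        g1, g2 = gradients(x, target, w1, w2)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2

x, target = 1.0, 0.8                   # a single made-up training example
before = loss(x, target, 0.5, 0.5)
w1, w2 = train(x, target)
after = loss(x, target, w1, w2)
print(before, after)
```

Each update moves both weights a small step downhill on the error surface, which is the sense in which the episode's map, marble maze, and golf comparisons apply.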
