Linear Digressions-logo

Linear Digressions

Technology Podcasts >

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.
More Information

Location:

United States

Description:

In each episode, your hosts explore machine learning and data science through interesting (and often very unusual) applications.

Language:

English


Episodes

Re - Release: How To Lose At Kaggle

8/12/2018
More
We've got a classic for you this week as we take a week off for the dog days of summer. See you again next week! Competing in a machine learning competition on Kaggle is a kind of rite of passage for data scientists. Losing unexpectedly at the very end of the contest is also something that a lot of us have experienced. It's not just bad luck: a very specific combination of overfitting on popular competitions can take someone who is in the top few spots in the final days of a contest and bump...

Duration:00:17:54

Troubling Trends In Machine Learning Scholarship

8/5/2018
More
There's a lot of great machine learning papers coming out every day--and, if we're being honest, some papers that are not as great as we'd wish. In some ways this is symptomatic of a field that's growing really quickly, but it's also an artifact of strange incentive structures in academic machine learning, and the fact that sometimes machine learning is just really hard. At the same time, a high quality of academic work is critical for maintaining the reputation of the field, so in this...

Duration:00:29:34

Can Fancy Running Shoes Cause You To Run Faster?

7/29/2018
More
The stars aligned for me (Katie) this past weekend: I raced my first half-marathon in a long time and got to read a great article from the NY Times about a new running shoe that Nike claims can make its wearers run faster. Causal claims like this one are really tough to verify, because even if the data suggests that people wearing the shoe are faster that might be because of correlation, not causation, so I loved reading this article that went through an analysis of thousands of runners'...

Duration:00:28:37

Compliance Bias

7/22/2018
More
When you're using an AB test to understand the effect of a treatment, there are a lot of assumptions about how the treatment (and control, for that matter) get applied. For example, it's easy to think that everyone who was assigned to the treatment arm actually gets the treatment, everyone in the control arm doesn't, and that the two groups get their treatment instantaneously. None of these things happen in real life, and if you really care about measuring your treatment effect then that's...

Duration:00:23:27

AI Winter

7/15/2018
More
Artificial Intelligence has been widely lauded as a solution to almost any problem. But as we justapose the hype in the field against the real-world benefits we see, it raises the question: Are we coming up on an AI winter

Duration:00:19:02

Rerelease: How to Find New Things to Learn

7/8/2018
More
We like learning on vacation. And we're on vacation, so we thought we'd re-air this episode about how to learn. Original Episode: https://lineardigressions.com/episodes/2017/5/14/how-to-find-new-things-to-learn Original Summary: If you're anything like us, you a) always are curious to learn more about data science and machine learning and stuff, and b) are usually overwhelmed by how much content is out there (not all of it very digestible). We hope this podcast is a part of the solution for...

Duration:00:18:32

Rerelease: Space Codes

7/2/2018
More
We're on vacation on Mars, so we won't be communicating with you all directly this week. Though, if we wanted to, we could probably use this episode to help get started. Original Episode: http://lineardigressions.com/episodes/2017/3/19/space-codes Original Summary: It's hard to get information to and from Mars. Mars is very far away, and expensive to get to, and the bandwidth for passing messages with Earth is not huge. The messages you do pass have to traverse millions of miles, which...

Duration:00:24:30

Rerelease: Anscombe's Quartet

6/24/2018
More
We're on vacation, so we hope you enjoy this episode while we each sip cocktails on the beach. Original Episode: http://lineardigressions.com/episodes/2017/6/18/anscombes-quartet Original Summary: Anscombe's Quartet is a set of four datasets that have the same mean, variance and correlation but look very different. It's easy to think that having a good set of summary statistics (like mean, variance and correlation) can tell you everything important about a dataset, or at least enough to know...

Duration:00:16:14

Rerelease: Hurricanes Produced

6/18/2018
More
Now that hurricane season is upon us again (and we are on vacation), we thought a look back on our hurricane forecasting episode was prudent. Stay safe out there.

Duration:00:28:12

GDPR

6/10/2018
More
By now, you have probably heard of GDPR, the EU's new data privacy law. It's the reason you've been getting so many emails about everyone's updated privacy policy. In this episode, we talk about some of the potential ramifications of GRPD in the world of data science.

Duration:00:18:23

Analytics Maturity

5/20/2018
More
Data science and analytics are hot topics in business these days, but for a lot of folks looking to bring data into their organization, it can be hard to know where to start and what it looks like when they're succeeding. That was the motivation for writing a whitepaper on the analytics maturity of an organization, and that's what we're talking about today. In particular, we break it down into five attributes of an organization that contribute (or not) to their success in analytics, and...

Duration:00:19:32

SHAP: Shapley Values in Machine Learning

5/13/2018
More
Shapley values in machine learning are an interesting and useful enough innovation that we figured hey, why not do a two-parter? Our last episode focused on explaining what Shapley values are: they define a way of assigning credit for outcomes across several contributors, originally to understand how impactful different actors are in building coalitions (hence the game theory background) but now they're being cross-purposed for quantifying feature importance in machine learning models. This...

Duration:00:19:12

Game Theory for Model Interpretability: Shapley Values

5/6/2018
More
As machine learning models get into the hands of more and more users, there's an increasing expectation that black box isn't good enough: users want to understand why the model made a given prediction, not just what the prediction itself is. This is motivating a lot of work into feature important and model interpretability tools, and one of the most exciting new ones is based on Shapley Values from game theory. In this episode, we'll explain what Shapley Values are and how they make a cool...

Duration:00:27:06

AutoML

4/29/2018
More
If you were a machine learning researcher or data scientist ten years ago, you might have spent a lot of time implementing individual algorithms like decision trees and neural networks by hand. If you were doing that work five years ago, the algorithms were probably already implemented in popular open-source libraries like scikit-learn, but you still might have spent a lot of time trying different algorithms and tuning hyperparameters to improve performance. If you're doing that work today,...

Duration:00:15:23

CPUs, GPUs, TPUs: Hardware for Deep Learning

4/22/2018
More
A huge part of the ascent of deep learning in the last few years is related to advances in computer hardware that makes it possible to do the computational heavy lifting required to build models with thousands or even millions of tunable parameters. This week we'll pretend to be electrical engineers and talk about how modern machine learning is enabled by hardware.

Duration:00:12:40

A Technical Introduction to Capsule Networks

4/15/2018
More
Last episode we talked conceptually about capsule networks, the latest and greatest computer vision innovation to come out of Geoff Hinton's lab. This week we're getting a little more into the technical details, for those of you ready to have your mind stretched.

Duration:00:31:27

A Conceptual Introduction to Capsule Networks

4/8/2018
More
Convolutional nets are great for image classification... if this were 2016. But it's 2018 and Canada's greatest neural networker Geoff Hinton has some new ideas, namely capsule networks. Capsule nets are a completely new type of neural net architecture designed to do image classification on far fewer training cases than convolutional nets, and they're posting results that are competitive with much more mature technologies. In this episode, we'll give a light conceptual introduction to...

Duration:00:14:05

Convolutional Neural Nets

4/1/2018
More
If you've done image recognition or computer vision tasks with a neural network, you've probably used a convolutional neural net. This episode is all about the architecture and implementation details of convolutional networks, and the tricks that make them so good at image tasks.

Duration:00:21:55

Google Flu Trends

3/25/2018
More
It's been a nasty flu season this year. So we were remembering a story from a few years back (but not covered yet on this podcast) about when Google tried to predict flu outbreaks faster than the Centers for Disease Control by monitoring searches and looking for spikes in searches for flu symptoms, doctors appointments, and other related terms. It's a cool idea, but after a few years turned into a cautionary tale of what can go wrong after Google's algorithm systematically overestimated flu...

Duration:00:12:46

How to pick projects for a professional data science team

3/18/2018
More
This week's episodes is for data scientists, sure, but also for data science managers and executives at companies with data science teams. These folks all think very differently about the same question: what should a data science team be working on? And how should that decision be made? That's the subject of a talk that I (Katie) gave at Strata Data in early March, about how my co-department head and I select projects for our team to work on. We have several goals in data science project...

Duration:00:31:17