The Analytics Engineering Podcast

Technology Podcasts

Tristan Handy has been curating the Analytics Engineering Roundup newsletter since 2015, pulling together the internet’s best data science & analytics articles. Tristan and co-host Julia Schottenstein now bring the Roundup to real life, hosting biweekly conversations with data practitioners inventing the future of analytics engineering. You can view full episode summaries and read back issues of the Roundup newsletter at https://roundup.getdbt.com. The podcast is sponsored by dbt Labs, makers of the data transformation framework dbt. To reach our team, drop a note to podcast@dbtlabs.com.

Location:

United States

Twitter:

@getdbt

Language:

English


Episodes

Katie Bauer: Data Scientists Are Not Pizza

7/29/2022
Katie was a founding member of Reddit's data science team and, currently, as Twitter’s Data Science Manager, she leads the company’s infrastructure data science and analytics organization. In this conversation with Tristan and Julia, Katie explores how, as a manager, to help data people (especially those new to the field!) do their best work. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics...

Duration:00:43:21

Data Activation Everywhere (w/ Julie Beynon of Clearbit)

7/15/2022
As Head of Analytics at Clearbit, Julie serves as a data team of one in a 200+ person company (wow!). In this conversation with Tristan and Julia, Julie dives into how she's helped Clearbit implement data activation throughout the business, and realize the glorious dream of self-serve analytics. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

Duration:00:43:19

The Personal Data Warehouse (w/ Jordan Tigani of MotherDuck)

7/1/2022
Jordan Tigani is an expert in large-scale data processing, having spent a decade+ in the development and growth of BigQuery, and later SingleStore. Today, Jordan and his team at MotherDuck are in the early days of working on commercial applications for the open source DuckDB OLAP database. In this conversation with Tristan and Julia, Jordan dives into the origin story of BigQuery, why he thinks we should do away with the concept of working in files, and how truly performant “data apps”...

Duration:00:51:36

Making Sense of the Last 2 Years in Data

6/17/2022
Matt Bornstein and Jennifer Li (and their co-author Martin Casado) of a16z have compiled arguably the most nuanced diagram of the data ecosystem ever made. They recently refreshed their classic 2020 post, "Emerging Architectures for Modern Data Infrastructure" and in this conversation, Tristan attempts to pin down: what does all of this innovation in tooling mean for data people + the work we're capable of doing? When will the glorious future come to our laptops? For full show notes and to...

Duration:00:47:02

Building an Open Source Company (w/ Aaron Katz of ClickHouse)

6/3/2022
ClickHouse, the lightning-fast open source OLAP database, was initially released in 2016 as an open source project out of Yandex, the Russian search giant. In 2021, Aaron Katz helped form a group to spin it out of Yandex as an independent company, dedicated to the development + commercialization of the open source project. In this conversation with Tristan and Julia, Aaron gets into why he believes open source, independent software companies are the future. And of course, this conversation...

Duration:00:38:47

"To Move, or Not to Move" (Data). That is the Question.

5/20/2022
Justin Borgman is the co-founder, Chairman and CEO of Starburst, and has spent almost a decade in senior executive roles building new businesses in the data warehousing and analytics space. In this conversation with Tristan and Julia, Justin dives into the nuts and bolts of Trino, the open source distributed query engine, and explores how teams are adopting a data mesh architecture without making a mess. For full show notes and to read 6+ years of back issues of the podcast's companion...

Duration:00:40:24

What’s The Role Of AI in BI?

5/6/2022
Amit Prakash is Co-founder and CTO at ThoughtSpot. He has a deep background in search, having previously led the AdSense engineering team at Google and served on the early Bing team at Microsoft. In this conversation with Tristan and Julia, Amit gets real about the promise of AI in data: which applications are being widely used today, and which are still a few years out? For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to...

Duration:00:44:47

Automating Away Your Work w/ Configuration-as-Code (w/ Sarah Krasnik)

4/22/2022
Most recently leading a data engineering team at Perpay, Sarah has built and managed data platforms end to end by working closely with internal engineering, product, and operational teams. She recently left her role to pursue a wide variety of endeavors, including writing on her Substack (https://sarahsnewsletter.substack.com/). In this conversation with Tristan and Julia, Sarah dives into how configuration-as-code can automate away data work, why you might want to consider adding a data...

Duration:00:43:56

The Hard Problems™️ of Data Observability w/ Kevin Hu of Metaplane

4/8/2022
As a PhD candidate at MIT, Kevin (and friends) published Sherlock, a data type detection engine (a surprisingly bedeviling problem) for data cleaning + data discovery. Now as co-founder and CEO of Metaplane, a data observability startup, Kevin applies these same automated data discovery methods to help data teams keep their data healthy. In this conversation with Tristan & Julia, Kevin wins the coveted award for “most crystal-clear explanations of complex technical concepts through physics...

Duration:00:43:14

The Bundling vs Unbundling Debate w/ Tristan, Benn Stancil and David Jayatillake

3/25/2022
A debate has erupted on data Twitter and data Substack - should the modern data stack remain unbundled, or should it consolidate? In this conversation, Benn Stancil (Mode), David Jayatillake (Avora) and our host Tristan Handy try to make some sense of this debate, and play with various future scenarios for the modern data stack. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering...

Duration:00:43:32

One Database to Rule All Workloads? With Jon "Natty" Natkins of dbt Labs

3/11/2022
Will the dream of a mythical database to handle all workloads (transactional + analytical) ever become a reality, or does it violate the laws of physics? This question sparked a hearty debate internally at dbt Labs, and Jon "Natty" Natkins joins Julia here to continue the conversation. Natty knows databases, and this episode will take you on a historical romp through the rise and fall of Hadoop, the transition to cloud data warehouses, and what's waiting for us next in database-land. For...

Duration:00:36:27

Ashley Sherwood (AE @ Hubspot): Permissionless Innovation for Data Teams

2/25/2022
Ashley is a Principal Analytics Engineer at Hubspot, and has helped lead their implementation of dbt. Ashley makes unique connections in her writing and work. On her Substack, "syntax error at or near ❤️," Ashley might be found comparing growing companies to butterflies, or going deep on how to accommodate sensitive people in the workplace. In this conversation with Tristan & Julia, Ashley dives into the nuts and bolts of her trajectory pushing data innovation forward at Hubspot. For full...

Duration:00:45:36

Tristan in the Hot Seat

12/17/2021
In this very special episode, we’ll be turning the spotlight on co-host Tristan Handy, the CEO & Co-founder of dbt Labs. In this AMA with Julia, you’ll get to know more about Tristan as a human, as a writer, and as the CEO of dbt Labs helping to push the analytics engineering practice forward. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com.

Duration:00:38:51

[COALESCE] Down With "Data Science" w/ Emilie Schario of Amplify Partners

12/9/2021
Your company has one definition for revenue across the organization, one definition of the customer, and one definition of sign-up. For people whose jobs are so defined by ensuring we’re aligned, we can’t seem to standardize on one definition for the Data Scientist. In this talk, Emilie Schario (Data Strategist-in-Residence at Amplify Partners and longtime dbt community member) proposes we lobby against the title Data Scientist, instead choosing some variation of the Core Four Data Roles:...

Duration:00:45:59

[COALESCE] Peeking Into the Future of Data Analytics w/ Julia

12/9/2021
How is the data landscape evolving? Which trends should you pay attention to, and which should you ignore? In this panel, Julia Schottenstein (our fearless co-host and dbt Labs product manager) catches up with Sarah Catanzaro, Jennifer Li and Astasia Myers to dive into the trends playing out in our work. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs.

Duration:00:45:10

[COALESCE] The Modern Data Experience w/ Benn Stancil of Mode

12/9/2021
In this talk, former podcast guest Benn Stancil walks through what he believes the next evolution of the modern data stack should look like, and more importantly, how those who use it should experience it. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs.

Duration:00:30:31

[COALESCE] Data Analytics In A Snowflake World ft. Christian Kleinerman

12/9/2021
Where does Snowflake go from here? What meta trends and technologies play into that vision? How does that impact the world of data analytics? Christian and Tristan have no shortage of opinions or ideas. This is your chance to hear some of them, live and unfiltered. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you by dbt Labs.

Duration:00:24:06

[COALESCE] You Don’t Need Another Database W/ Reynold Xin of Databricks and Drew Banin of dbt Labs

12/7/2021
Reynold Xin is a technical co-founder and Chief Architect at Databricks. He’s also a co-creator and the top contributor to the Apache Spark project. In this casual conversation with Drew Banin, co-founder and Chief Product Officer at dbt Labs, the two will be discussing the data infrastructure trends they find most interesting. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is brought to you...

Duration:00:30:12

[COALESCE] How big is this wave? Ft. Martin Casado of a16z

12/7/2021
The modern data stack is the third generation of data analysis products to come to prominence since the 1990s. The prior waves (data warehouse appliances and then Hadoop) were both big steps forward but ultimately failed to live up to their initial promise. Is the modern data stack just another iteration in a long string of “trendy technologies” in data, waves that crash upon the shore but ultimately recede? Or is it somehow more permanent? Register to catch the rest of Coalesce, the...

Duration:00:44:44

[COALESCE] Scaling Knowledge > Scaling Bodies: Why dbt Labs is making the bet on a data literate organization (ft. Erica Louie of dbt Labs!)

12/7/2021
What is it like to build a data team for a company in the data space? This talk is centered around how dbt Labs is building their data team. We will cover how our team is structured, how we operate and interact with the greater organization, and how we set expectations and responsibilities that are helping us become a self-service organization. Register to catch the rest of Coalesce, the Analytics Engineering Conference, at https://coalesce.getdbt.com. The Analytics Engineering Podcast is...

Duration:00:26:25