Plumbers of Data Science

Technology Podcasts

Data Engineering is the plumbing of data science. Almost invisible, but super important and a big mess when done wrong. We talk about interesting Data Engineering trends and topics. I also teach Data Engineering in my Data Engineering Academy at LearnDataEngineering.com

Location:

United States

Language:

English


Episodes

#90 Taylor McGrath - The Future of the Modern Data Stack

1/25/2023
Super happy to have Taylor with me on this stream. She is the VP of Data Labs at Rivery and therefore has a lot of experience with data platforms. We'll talk about the modern data stack and where it's going. I'm excited to hear about her experience with the changes that are happening in the data space, and what they mean for data engineers & data teams.

Duration:00:47:00

#89 Piyush Sachdeva - Getting Into Google After Eight Rejections from Amazon!

1/16/2023
In this video I talk to Piyush, who's an engineer at Google and has his own YouTube channel: "Tech Tutorials with Piyush". He's a really good guy and I love how he's dedicated to teaching engineering. We talk about some awesome topics. Have fun! You can also check this one out on YouTube: https://youtu.be/FZemVaqQcnM If you want to get into Data Engineering, check out my Academy at https://learndataengineering.com

Duration:00:42:07

#88 - Wouter Trappers - How to Realize a Data Strategy Like a Pro!

4/12/2022
I have seen people get data strategy wrong a few times. Luckily, Wouter Trappers, who helps companies with this as a professional, can help. We talked about the steps you need to take from value proposition to dashboards. Wouter is really knowledgeable and it was super fun talking with him and hearing his approach.

Duration:00:39:48

#87 - Dhruba Borthakur - From Hadoop to real time analytics

4/12/2022
Dhruba Borthakur is CTO at Rockset and a passionate Data Engineer. Before co-founding Rockset he played a big role in the development of Hadoop HDFS at Yahoo as well as HBase and RocksDB at Facebook. His current project is the serverless Rockset platform, where you can gain real-time analytics insights into your data. I tried it out before our talk and really liked it.

Duration:01:05:37

#86 The Ultimate Data Engineering Introduction

1/14/2021
The Podcast is back!!!! I promise I am going to keep it up to date this time ;) In this episode I talk about my newest Data Engineering course. I think it's the ultimate 1 hour 15 minute introduction to Data Engineering. There were also a ton of questions from the chat that I answered. I think you will really enjoy this.

Duration:01:14:35

#085 Big Data and Data Science Landscape plus trying to read Tweets with Nifi

5/28/2019
We are looking into the network communication protocol map. I first saw this like 10 years ago and it's awesome. Then we check out the Big Data and Data Science Landscape image. It shows you all the tools available to do data science, machine learning and data engineering, which is very helpful if you are researching tools to use. Before using the Twitter API you have to create a developer account. So, I show you how I created one. After that I tried to get Nifi to download Tweets but it...

Duration:00:43:06

#084 Behind the scenes: Audio podcast, free transcriptions and GitHub

5/27/2019
Today's podcast is a bit of a behind the scenes. What it takes to do an audio podcast. How you can get audio-to-text transcriptions for free. Also GitHub questions on how to work with branches on the Cookbook.

Duration:00:51:21

#083 Data Engineering at OLX Case Study

5/27/2019
Today a case study about OLX with a guest, and it was super fun! Here are the slides Alexey and I talked about: https://www.slideshare.net/mobile/AlexeyGrigorev/image-models-infrastructure-at-olx

Duration:01:10:53

#082 Reading Tweets With Apache Nifi & IaaS vs PaaS vs SaaS

5/27/2019
In this episode we install the Nifi Docker container and look into how we can extract the Twitter data. We are also talking about the differences between infrastructure as a service, platform as a service and software as a service.

Duration:01:19:06

#081 How to get tweets from the Twitter API

5/27/2019
In this episode we look into the Twitter API documentation, which I love by the way. How can we get old tweets for certain hashtags, and how do we get current live tweets for these hashtags?

Duration:01:09:47

#080 How To Find A Job In Germany & Answering Mails

5/27/2019
Tips on how to find a job in Germany and two super interesting mails.

Duration:00:54:54

#079 Trying to stay true to myself and making the cookbook public on GitHub

5/27/2019
The cookbook and my YouTube will be free, forever! Check out the data engineering cookbook on GitHub: https://github.com/andkret/Cookbook

Duration:00:24:34

#078 Cookbook collaboration and updates

5/27/2019
Updates to the cookbook and how to collaborate on it.

Duration:00:31:08

#077 Lambda and Kappa Architecture

5/27/2019
In this episode we talk about the Lambda Architecture with stream and batch processing, as well as an alternative, the Kappa Architecture, which consists only of streaming. We also cover data engineer vs data scientist and discuss Andrew Ng's AI Transformation Playbook.

Duration:01:22:01

#076 Cloud vs On Premise How To Decide

5/27/2019
How do you choose between Cloud and On-Premise? We go over the pros and cons and what you have to think about, because there are good reasons to not go cloud. Also thoughts on how to choose between the cloud providers by just comparing instance prices. Otherwise the comparison will drive you insane.

Duration:01:15:56

#075 Creating the Course Structure For My Data Engineering Course

5/27/2019
In this episode we go over the ideas I have for the data engineering course structure. It was your chance to influence what we put in there.

Duration:00:53:18

#074 Starting My Data Engineering Online Course

5/27/2019
In this video we go over some of the 100+ comments I received on LinkedIn about data engineering training.

Duration:01:01:19

#073 Data Engineering At LinkedIn Case Study

5/27/2019
Let's check out how LinkedIn is processing data.

Duration:01:12:21

#072 Data Engineering At Twitter Case Study

5/27/2019
How is Twitter doing Data Engineering? Oh man, they have a lot of cool things to share.

Duration:00:56:27

#071 Data Engineering At Spotify Case Study

5/27/2019
In this episode we are looking at the data engineering at Spotify, my favorite music streaming service. How do they process all that data?

Duration:00:43:04