
Location:
United States
Description:
Datacast follows the narrative journey of founders, operators, and investors in the data and AI infrastructure space to unpack the careers that they have built. James Le hosts the show.
Twitter:
@james_aka_yale
Language:
English
Contact:
5852868783
Website:
https://datacast.simplecast.com
Email:
khanhle.1013@gmail.com
Episodes
Episode 131: Data Infrastructure for Consumer Platforms, Algorithmic Governance, and Responsible AI with Krishna Gade
1/11/2024
Show Notes
text document clustering; pattern discovery; hdevelopment; scaling; real-time analytics; ETL-as-a-Service; an A/B testing platform; the major ML model performance issues while running Facebook's feed ranking platform; algorithmic governance from Facebook; Fiddler AI; mission is to build trust into AI; AI needs a new developer stack; the Model Performance Management (MPM) framework; model monitoring; explainable AI; analytics; fairness; monitoring for NLP and Computer Vision models; on Fiddler's approach to model governance for the modern enterprise; v the right people; aligned with Fiddler's culture; the early design partners; defining a new category of Responsible AI
Krishna's Contact Info
LinkedIn; Twitter; Medium
Fiddler's Resources
Website; LinkedIn; Twitter; YouTube; About; Customers; Careers; AI Observability; Model Monitoring; Explainable AI; Fairness; Analytics; Blog; Docs; Resources
Mentioned Content
People
Goku Mohandas; Krishnaram Kenthapadi
Books
The Hard Thing About Hard Things; The Five Dysfunctions of A Team
Notes
My conversation with Krishna was recorded more than a year ago. Since then, I'd recommend checking out these Fiddler resources:
Alteryx Ventures; Mozilla Ventures; Dentsu Ventures; Scale Asia Ventures; an end-to-end workflow for robust Generative AI; LLMOps; the missing link in Generative AI
About the show
Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.
Datacast is produced and edited by James Le. For inquiries about sponsoring the podcast, email khanhle.1013@gmail.com.
Subscribe by searching for Datacast wherever you get podcasts, or click one of the links below:
Listen on Spotify; Listen on Apple Podcasts; Listen on Google Podcasts
If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
Duration:01:21:27
Episode 130: Towards Accessible Data Analysis with Emanuel Zgraggen
11/21/2023
Show Notes
Credit Suisse; HSR (University of Applied Sciences Rapperswil); Andy van Dam; Interactive Data Exploration; Microsoft Research; Professor Tim Kraska’s group; the CSAIL at MIT; Einblick; real-time remote collaboration through video-enabled data whiteboards; collaborative visual canvas
Emanuel's Contact Info
Website; Google Scholar; LinkedIn; Twitter
Einblick's Resources
Website; Twitter; LinkedIn; Docs; Blog; ChartGen AI; Notebook Feature Release; Video-Based Collaboration Release
Mentioned Content
Papers and Projects
Infovis 2014 Paper; Video; CHI 2015 Paper; Summary Video; VLDB Demo 2015 Paper; Health Video; Election Video; CHI 2016 LBW Paper; Video; Towards Accessible Data Analysis; SIGMOD DEEM Paper; Video
People
Wes McKinney; Fei-Fei Li
Books
The Book of Why; The Signal and The Noise
Notes
My conversation with Emanuel was recorded back in late 2022. Since then, I recommend checking out the launch of Einblick Prompt and ChartGenAI.
Duration:00:54:05
Episode 129: Early-Stage Product Management, Product-Led Revenue, and Startup Monologues with Diana Hsieh
11/15/2023
Timestamps
Morgan Stanley; Norwest Venture Partners; Cockroach Labs; the challenges and learning curves; a non-technical first product hire; TimescaleDB; for the most dissatisfied customers first; the majority next; the full need in the long run; the founding story of Correlated; Tim Geisenheimer; John Pena; the notion of Product-Led Revenue; Correlated works; product-led playbooks; leveraged customer feedback; PQL Scoring that leverages machine learning to identify the best leads; successful communication in Product Management; learnings on customer discovery at early-stage startups; the early signs of product-market fit
Diana's Contact Info
LinkedIn; Twitter; Medium; Substack
Correlated's Resources
Website; LinkedIn; Twitter; Product Overview; How Correlated Works; Blog; Podcast; Docs; PLG Playbook Library; Correlated Launches to Bring Product-Led Revenue to Market with $8.3M in Funding; What Is Product-Led Revenue?; Correlated launches PQL Scoring to accelerate your product-led strategy
Mentioned Content
Blog Posts
The Standard Due Diligence Process; Mistakes to Avoid when Pitching to a VC; My Startup Litmus Test; Why I left VC to join Cockroach Labs; My First 90 Days as the First Product Hire; Coding != Technical: What It Means to be Technical as a PM; How learning to sell makes for a better product manager; Roadmap Planning: Users First, Features Second; Build something people will use more than once; Focus on the unhappiest, most dissatisfied customers first; Build for the majority; Why user interviews can fail you when starting a startup; Tackling the challenges of communicating effectively in product management; Some Learnings on Customer Discovery at Early-Stage Startups; 4 early signs of product-market fit; Give customers what they want, but not what they ask for
People
Lenny Rachitsky; Julie Zhuo; Nate Stewart; Jeff Sposetti
Book
Crossing The Chasm
Duration:01:02:29
Episode 128: Building Trust with Founders, VC Funding For The Cloud, and The Next Platform For Data Apps with Jason Risch
10/27/2023
Timestamps
Mathematical and Computational Science; the Mayfield Fellowship; Opendoor; AI Fund startup studio; Greylock Partners; seed investment in Onehouse; Series A investment in Baseten; Greylock’s Castles in the Cloud project; VC Funding for the Cloud; "The Next Cloud Data Platform"
Jason's Contact Info
LinkedIn; Twitter; Greylock
Mentioned Content
Books
Moneyball; Why The West Rules For Now; Snow Crash; Cryptonomicon; Termination Shock; Principles for Dealing with the Changing World Order
People
David Luan; Alex Ratner; Frank Slootman; Clement Delangue
Notes
My conversation with Jason was recorded back in late 2022. Since then, I recommend checking out these resources:
the next platform opportunity in cybersecurity; LlamaIndex
Duration:00:56:12
Episode 127: Data Intelligence for Insurance Transformation with Heather Wentworth
10/8/2023
Show Notes
Liberty Mutual Insurance; Crum & Forster; Innovisk; Willis Towers Watson; Brown and Brown Insurance; Accelerant Holdings; Accelerant risk exchange; Data Intelligence
Heather's Contact Info
LinkedIn; Accelerant; About
Relevant Reading
Eldridge Bets on Accelerant at $2.2 Billion Valuation; Accelerant: An insurtech that defies categories; Accelerant establishes $175 million sidecar reinsurer; Data Intelligence is Key to Understanding our Customers – Chief Data Officer, Accelerant Holdings; Insurance Innovators Top 100; Accelerant Launches the Accelerant Risk Exchange to Reimagine Insurance
Mentioned Resources
People
Zhamak Dehghani; Cassie Kozyrkov; Allie Miller
Book
Data Mesh: Delivering Data-Driven Value at Scale
Duration:01:01:01
Episode 126: Vector Search Engine, Building An Open-Source Business, and Digital Technology Through The Lens of Language with Bob Van Luijt
9/12/2023
Show Notes
control(human, data, sound); TEDx talk; the founding story of Weaviate; a business model around the open-source project; pricing model; the AI-first database ecosystem; being a remote-first company; building an open-source brand
Bob's Contact Info
Wikipedia; LinkedIn; Twitter; GitHub; WTF Medium Blog; YouTube
Weaviate's Resources
Website; Twitter; Slack; Forum; GitHub; Blog; Podcast; Playbook
Mentioned Content
People
Sam Ramji; Paul Graham
Book
Hackers and Painters
Notes
My conversation with Bob was recorded back in late 2022. Since then, I recommend checking out these resources:
SeMI Tech becomes Weaviate; The $50M Series B funding led by Index with participation from Battery; The public beta of Weaviate Cloud Service; organic growth; 4th birthday
Duration:01:21:23
Episode 125: The Next Wave of Developer Platforms, Data Products, and Software Infrastructure with Sakib Dadi
9/3/2023
Show Notes
the University of Pennsylvania; Innova Dynamics; Morgan Stanley; Bessemer Venture Partners; Viagogo; LaunchDarkly; PagerDuty; Coiled; Prefect; Arcion Labs; Guild Education; Tribe; Bessemer's roadmap on data infrastructure; the products that help abstract away complexity from data engineering problems; the tools that power the next generation of data scientists; the emergence and evolution of metadata management; the evolution of ML infrastructure; the key trends and opportunities that will define the next wave of BI and data analytics software; climate change; student builders
Sakib's Contact Info
Profile Page; LinkedIn; Twitter
Mentioned Content
People
Sarah Catanzaro; Ed Sim; Mike Speiser
Book
The Idea Factory
Notes
My conversation with Sakib was recorded back in late 2022. Since then, I recommend checking out these resources:
the era of intelligent search; AI Roadmap; the ChatBVP bot; 2023 Cloud 100 Benchmarks Report
Duration:01:02:24
Episode 124: The Open-Source Cloud Playbook, The Modular Future of Data and AI Infrastructure, and Meta-Learning as a VC with Casber Wang
8/30/2023
Show Notes
in business at UC Berkeley; Etch.ai; Wish; Bank of America Merrill Lynch; Sapphire Ventures; Sapphire Ventures; the Series F round of JumpCloud; the Series B round of Uptycs; the Series B round of Tetrate; the Series A of Zesty; the Series D round of Dremio; the three strategies software companies can borrow from the open-source cloud playbook; the Open Data Ecosystem; the modular future of AI infrastructure; the dynamic evolution of the software development lifecycle
Casber's Contact Info
Sapphire Ventures Profile; Twitter; LinkedIn
Mentioned Content
Articles
3 Strategies Software Companies Can Borrow from the Open-Source Cloud Playbook; What is the Open Data Ecosystem and Why It's Here to Stay; The Future of AI Infrastructure is Becoming Modular: Why Best-of-Breed MLOps Solutions are Taking Off and Top Players to Watch; Evolution of the Software Development Lifecycle and the Future of DevOps
Books
The Power Law; Engines That Move Markets
Notes
My conversation with Casber was recorded back in late 2022. Since then, I recommend checking out these resources:
Casber's appearance on Bloomberg News; Casber's analysis of the next wave of cybersecurity; Huntress; Weights & Biases; new $1B fund to invest in AI-powered enterprise tech startups
Duration:01:12:40
Episode 123: AI Monitoring, Product-Oriented Data Science, and the Israeli ML community with Itai Bar Sinai
8/17/2023
Show Notes
The Hebrew University of Jerusalem; Lia's Kitchen; Mona Labs; the Mona monitoring platform; a comprehensive monitoring strategy; granular tracking and avoiding noise
Itai's Contact Info
LinkedIn
Mona Labs' Resources
Website; LinkedIn; Twitter; YouTube; About; Customers; Careers; Platform; Blog; Case Studies; Docs
Mentioned Content
Blog Posts and Talks
We are building Mona to bring ML observability to production AI; The definitive guide to AI/ML monitoring; The secret to successful AI monitoring: Get granular, but avoid noise; Taking AI from good to great by understanding it in the real world; Data drift, concept drift, and how to monitor for them; The issues ML model retraining won't solve; Common pitfalls to avoid when evaluating an ML monitoring solution; Introducing automated exploratory data analysis powered by Mona; Best practices for setting up monitoring operations for your AI team; The challenges of specificity in monitoring AI; Is your LLM application ready for the public?; Overcoming cultural shifts from data science to prompt engineering
People
Goku Mohandas; Ville Tuulos; Nimrod Tamir
Notes
My conversation with Itai was recorded back in October 2022. Since then, Mona Labs has introduced a new self-service monitoring solution for GPT! Read Itai's blog post for the technical details.
Duration:01:16:32
Episode 122: The Evolution of Data Visualization, Scaling Data Culture, and The Future of Data Transformation with Gabi Steele
7/31/2023
Show Notes
Parsons School of Design; The Washington Post; WeWork; the Data Cult initiative; Data Culture; well-defined blueprint for each client engagement; Data Culture's Studio; with Kode with Klossy; scale their respective data cultures; the story behind the founding of Preql
Gabi's Contact Info
LinkedIn; Twitter
Preql's Resources
Website; Twitter; LinkedIn; Introducing Preql: The Future of Data Transformation
Mentioned Resources
People
Umi Syam; Giorgia Lupi; Susie Lu
Book
Invisible Women: Exploring Data Bias in a World Designed for Men
Notes
My conversation with Gabi was recorded back in August 2022. Since then, Preql has officially launched and currently supports strategy and operations teams at B2B and vertical SaaS companies!
Duration:01:06:44
Episode 121: High-Performance Processing Engine, Modern Data Streaming, and Propelling Minority in Tech with Alex Gallego
7/13/2023
Show Notes
NYU’s Polytechnic School of Engineering; FactSet Research System; YieldMo; Concord; Shinji Kim; Robert Blafford; Akamai; SMF; the story; Redpanda Data; open-source Redpanda in November 2020; contribution; the open-source library; Redpanda's Intelligent Data API; the right people
Alex's Contact Info
LinkedIn; Twitter; Website; GitHub
Redpanda's Resources
Website; Twitter; LinkedIn; Slack; GitHub; Contributing Doc; About Redpanda; Platform Capabilities; Customers; Docs; Redpanda University; Reports and Guides; Benchmarks; Hack The Planet Scholarship
Mentioned Content
Blog Posts
Redpanda raison d'etre; Thread-per-core buffer management for a modern Kafka-API storage system; Redpanda is now free and Source Available; Redpanda creates Redpanda, the Intelligent Data API Platform, backed by $15.5M initial funding from Lightspeed Venture Partners and GV; The Intelligent Data API; Redpanda Wasm engine architecture; We raised an additional $50M to drive the future of streaming data. Join us!; Redpanda gives Kafka a Run for Its Money; Alex Gallego Builds Redpanda To Simplify And Unify Real-Time Streaming Data
Talks
Distributed Stream Processing over thousands of Datacenters; How to Build the Fastest RPC; Co-designing Raft + thread-per-core execution model for the Kafka-API
People
Andy Pavlo; Leslie Lamport; Kyle Kingsbury
Notes
My conversation with Alex was recorded back in August 2022. Since then, I recommend checking out these resources:
The $100M Series C funding announcement; This guide for developers on streaming data; Lacework; Exein; SmartLunch; cost of ownership comparison; data sovereignty; this holistic comparison
Duration:01:35:54
Episode 120: Next-Generation Experimentation, Statistics Engineering, and The Modern Growth Stack with Chetan Sharma
6/24/2023
Show Notes
Quantcast
Stanford Center for Minds, Brain, and Computation
Acumen
Airbnb's original ETL framework for online risk mitigation
Knowledge Repo
why an experimentation program is the most impactful thing a data team can do
its inception in 2014
taking a year off from work to travel
Saltbox
Webflow
the story behind the founding of Eppo
the problems caused by long experiment durations
to bend time in experiments
the core elements of the modern experimentation stack
the experiment overhead
the designer gap in experimentation tools
about metric strategy
Chetan's Contact Info
LinkedIn
Twitter
GitHub
AngelList
Eppo's Resources
Website
Twitter
LinkedIn
Blog
Updates
Doc
About
Careers
Experimentation Product
Feature Flagging Product
Mentioned Content
Articles
Travel Year Facts and Superlatives
Why I Started Eppo
Reducing Experiment Durations
The Designer Gap in Experimentation Tools
We're Hiring A Statistics Engineer!
Should You Always Run An Experiment
Stop Micromanaging Product Strategy
The most impactful thing a data team can do is establish an experimentation program
Bending Time in Experimentation
We Raised $19.5M!
Experimentation for the Modern Growth Stack: Our Investment in Eppo
People
Mike Kaminsky
Sean Taylor
Jeremy Howard
Book
The Mom Test
Notes
My conversation with Chetan was recorded back in August 2022. Since then, Eppo has launched feature flagging, and now offers the first "flags on top of your warehouse" experimentation platform. They also have Miro, Twitch, DraftKings, and Zapier as customers.
Duration:01:26:36
Episode 119: Experimentation Culture, Immutable Data Warehouse, The Data Collaboration Problem, and The Rise of Data Contracts with Chad Sanderson
6/16/2023
Show Notes
Subway
SEPHORA
Microsoft
Convoy
their decision to choose Amundsen
setting up a flexible experimentation platform at Convoy
the problems with Change Data Capture
an internal change management platform called Chassis
the existential threat of data quality
the modern data warehouse is broken
Immutable Data Warehouse
the death of data modeling
the knowledge layer
the data collaboration problem
customer-centric
components of a high-quality Data UX function
Chad's Contact Info
LinkedIn
Data Products Substack
Data Quality Camp
Mentioned Content
Talks
Aligning Experimentation Across Product Development and Marketing
Chassis: Entities, Events, and Change Management
1,000 Experiments Club with AB Tasty
Data Discovery at Lyft and Convoy
The growth of the data platform product manager role
Implementing Amundsen at Convoy
Getting ROI from Experimentation: How AB Experimentation plays out in Organizations
Why are we so bad at this modern data stack?
Articles
Experimentation not only protects your KPIs but your job as well
Is The Modern Data Warehouse Broken?
The Existential Threat of Data Quality
The Death of Data Modeling
Data Collaboration Problem
The Rise of Data Contracts
People
Barr Moses
Juan Sequeda
Adrian Kreuziger
Book
Agile Data Warehouse Design
Notes
My conversation with Chad was recorded back in July 2022. Since then, I'd recommend looking at:
Part 1
Part 2
The Data Quality Camp community
how scale kills data teams
Data Facade
Duration:01:22:44
Episode 118: Overcoming Hardships, Confident Learning, Dataset Improvement, and The Ph.D. Rapper with Curtis Northcutt
6/9/2023
Show Notes
Vanderbilt University
embark on a Ph.D. in Computer Science at MIT
Isaac Chuang
the inventor of the first working quantum computer
CAMEO Detection Algorithm
Ph.D. research
confident learning
cleanlab
ChipBrain
Cleanlab
labelerrors.com
cleanlab
Cleanlab Studio
Cleanlab Vizzy
Cleanlab’s culture
build affordable state-of-the-art deep learning machines
PomDP the Ph.D. rapper
his success had been due to a function of grit, resourcefulness, and friends made along the way
Curtis' Contact Info
Academic Website
LinkedIn
Twitter
Facebook
Instagram
Google Scholar
GitHub
PhD Rapper
YouTube
Spotify
SoundCloud
Facebook
Twitter
Instagram
L7 Machine Learning Blog
Cleanlab's Resources
Website
GitHub
Slack
Twitter
LinkedIn
Blog
Research
Doc
About
Careers
Cleanlab Studio
Cleanlab Vizzy
The Cleanlab Culture
Mentioned Content
Papers
Computers & Education: paper, code
arXiv: paper, code, free-access
33rd Conference on Uncertainty in Artificial Intelligence (UAI 2017): paper, code
Journal of Artificial Intelligence Research (JAIR), Vol. 70 (2021): paper, code, blog
35th Conference on Neural Information Processing Systems (NeurIPS 2021) Track on Datasets and Benchmarks: paper, demo, code, blog
Blog Posts
Founder’s Medal recipient chooses MIT over Microsoft
Build a Pro Deep Learning Workstation... for Half the Price
An Introduction to Confident Learning: Finding and Learning with Label Errors in Datasets
Announcing cleanlab: a Python Package for ML and Deep Learning on Datasets with Label Errors
Double Deep Learning Speed by Changing the Position of your GPUs
Benchmarking: Which GPU for Deep Learning?
The Best 4-GPU Deep Learning Rig only costs $7000 not $11,000
Pervasive Label Errors in ML Datasets Destabilize Benchmarks
Cleanlab: The History, Present, and Future
cleanlab 2.0: Automatically Find Errors in ML Datasets
How We Built Cleanlab Vizzy
Talks and Podcasts
Tedx Talk: The MIT Rap Challenge
Talk at NLP Summit
Talk at Data + AI Summit
MLOps Coffee Chat
Talk at Snorkel's Future of Data-Centric AI Conference
Open-Source Startup Podcast
People
Leslie Kaelbling
Geoff Hinton
Jeff Dean
Book
Play Bigger: How Pirates, Dreamers, and Innovators Create and Dominate Markets
Notes
My conversation with Curtis was recorded back in August 2022. The Cleanlab team has had some important announcements in 2023 that I recommend looking at:
CleanVision
Datalab
ActiveLab
This blog post on using Cleanlab to improve LLMs
His new single "Clarity In My Vision"
Cleanlab's partnership with Databricks
Video
Cleanlab is about to announce its Series A. Stay on the lookout for it!
Duration:01:48:02
Episode 117: Vector Databases, The Embeddings Revolution, and Working in China with Frank Liu
5/28/2023
Show Notes
indoor localization and navigation
normal life
the pandemic story
Zilliz
the notion of vector databases
Milvus
Towhee
open-source project
Zilliz Cloud
Frank's Contact Info
LinkedIn
Twitter
GitHub
Website
Zilliz's Resources
Website
Twitter
LinkedIn
GitHub
YouTube
Zilliz Cloud Database
Milvus: Docs, GitHub
Towhee: Docs, GitHub
Mentioned Content
Articles and Presentations
A Gentle Introduction to Vector Databases
My Experience Living and Working in China, Part I
My Experience Living and Working in China, Part II
Making ML More Accessible for Application Developers
Understanding Neural Network Embeddings
Building An Open-Source Platform for Generating Embedding Vectors
People
Yann LeCun
Yangqing Jia
Soumith Chintala
Book
A Short History of Nearly Everything
Notes
My conversation with Frank was recorded back in August 2022. The Zilliz team has had some important announcements in 2023 that I recommend looking at:
The landing page of Zilliz Cloud
The beta launch of Milvus 2.3
The development of GPTCache
The OSS Chat demo application
Duration:01:09:35
Episode 116: Distributed Databases, Open-Source Standards, and Streaming Data Lakehouse with Vinoth Chandar
5/11/2023
Show Notes
Madras Institute of Technology
Computer Science
high-bandwidth content distribution
large-scale parallel processing with shell pipes
Oracle
LinkedIn
Voldemort
Uber
Uber's case for incremental processing on Hadoop
the initial design and implementation of Hudi
the evolution of Hudi
Apache Software Foundation
establishing standards for open-source data projects
valuable leadership lessons
Confluent
ksqlDB
the vision for Apache Hudi as a Streaming Data Lake platform
the Hudi roadmap
engaging an open-source community
Onehouse
Onehouse's commitment towards openness
Vinoth's Contact Info
LinkedIn
Twitter
Onehouse's Resources
Website
Twitter
LinkedIn
About
Product
Blog
Careers
Apache Hudi's Resources
User Docs
Technical Wiki
Roadmap
GitHub
Twitter
Slack
Mentioned Content
Articles and Presentations
Voldemort: Prototype to Production
Uber's Case for Incremental Processing on Hadoop
Hoodie: An Open Source Incremental Processing Framework From Uber
The Past, Present, and Future of Efficient Data Lake Architectures
Highly Available, Fault-Tolerant Pull Queries in ksqlDB
Apache Hudi - The Data Lake Platform
Introducing Onehouse
Automagic Data Lake Infrastructure
Onehouse Commitment to Openness
People
Leslie Lamport
Jeff Dean
Michael Stonebraker
Book
Zero To One
Notes
My conversation with Vinoth was recorded back in August 2022. The Onehouse team has had some announcements in 2023 that I recommend looking at:
The Launch Announcement of Onetable
The $25M Series A Funding Announcement
Onehouse Availability in AWS Marketplace
Onehouse Product Demo on building a data lake for GitHub analytics at scale
Walmart's recent study on different open-source data lakehouse formats
This discussion around the Hudi 1.x vision
Duration:01:32:18
Episode 115: Product-Led Sales, Community-Led Category Creation, and Unlocking Revenue Data with Alexa Grabell
5/2/2023
Show Notes
Vanderbilt
Engineering Science
KPMG
Dataminr
Stanford Graduate School of Business
Monte Carlo
Pocus
Product-Led Sales
Product-Qualified Leads
Sales-Assist
the Product-Led Sales community
evolving a category, a community, and a product all at once
Alexa's Contact Info
LinkedIn
Twitter
Pocus' Resources
Website
Twitter
LinkedIn
YouTube
About
Product
Blog
Careers
Community
Newsletter
Mentioned Content
Blog Posts
What is Product-Led Sales?
The Myth of "No Sales" at PLG Companies
When To Add A Sales Team to Your PLG Company
Part 1
Part 2
Part 3
What Is The Sales-Assist Role?
Introducing Pocus' PLS Platform
Product-Led Sales Community Wisdom Highlights 2021
Notes on Community-Led Category Creation with Pocus' Co-Founder, Alexa Grabell
Sneak Peek at Pocus' PLS Platform
Announcing $23M to Transform How GTM Teams Use Data to Drive Revenue
Year One: The Product-Led Sales Platform is Here to Stay
People
Kyle Poyar
Melissa Ross
Aaron Geller
Notes
My conversation with Alexa was recorded back in July 2022. The Pocus team has had some announcements in 2023 that I recommend looking at:
The launch announcement of Pocus' Revenue Data Platform
The Product-Led Sales Playbook Volume 2
The Unlocking Revenue podcast
The Playbook Library for product-led go-to-market
Duration:01:08:16
Episode 114: Building Data Products and Unlocking Data Insights with Carlos Aguilar
4/19/2023
Show Notes
Cornell University
Kiva Systems
his first data product at Kiva
after the Kiva acquisition
Flatiron Health
building Flatiron's Data Insights team from scratch
Glean
the pain points in data visualization/exploration
the product features of Glean
DataOps
broken dashboards
Carlos' Contact Info
Twitter
LinkedIn
GitHub
Website
Medium
Glean's Resources
Website
Twitter
LinkedIn
About
Docs
Blog
Interactive Public Demo
DataOps
Mentioned Content
Blog Posts
How the Data Insights team helps Flatiron build useful data products
The biggest mistake making your first data hire: not interviewing for product
How to interview your first data hire
My hack for getting started with data as a product
Introducing Glean
Your dashboard is probably broken
People
Vicki Boykis
Anthony Goldbloom
Wes McKinney
Book
The Toyota Way: 14 Management Principles from the World's Greatest Manufacturer
Notes
My conversation with Carlos was recorded back in June 2022. The Glean team has had some announcements in 2023 that I recommend looking at:
The recently launched, interactive public demo site
This recent integration with DuckDB
This post about Version Control for BI
Their Public Roadmap
Duration:01:18:07
Episode 113: Data Applications, Real-Time Analytics, and Cloud Product Management with Shruti Bhat
4/12/2023
Show Notes
Hewlett-Packard
IBM
UCLA Anderson School of Management
VMware
Ravello Systems
Oracle's
Rockset
real-time analytics
data applications
Rockset architecture
the modern real-time data stack
use cases
Shruti's Contact Info
LinkedIn
Twitter
Forbes
Rockset's Resources
Website
Twitter
LinkedIn
Facebook
Docs
Blog
Community
Product
Architecture
Customers
Real-Time Analytics Explained
What Is A Data Application?
Mentioned Content
Articles
Building Data Applications Powered by Real-Time Analytics
How startups can create a culture where women can win
Streaming Data and the Modern Real-Time Data Stack
People
Barr Moses
Jay Kreps
Alex DeBrie (DynamoDB Expert)
Book
Competing Against Luck
Notes
My conversation with Shruti was recorded back in June 2022. Since then, a lot has happened. I recommend looking at the resources below:
The launch of compute-compute separation for real-time analytics
This benchmark on top real-time analytics databases in 2023
This talk on emerging architectures for real-time CDC
Duration:01:17:13
Episode 112: Distributed Systems Research, The Philosophy of Computational Complexity, and Modern Streaming Database with Arjun Narayan
4/7/2023
Show Notes
UWC Mahindra College
Williams College
the University of Cambridge
University of Pennsylvania
Andreas Haeberlen
Ph.D. dissertation
Cockroach Labs
CockroachDB Performance Guide
RocksDB deep-dive
database transaction isolation semantics
log-structured merge trees
The Philosophy of Computational Complexity
Materialize
Timely Dataflow
Differential Dataflow
Frank McSherry
the architecture design
Streaming SQL
open-source project
enterprise-grade features
Materialize Cloud
Materialize’s unbundled cloud architecture
Arjun's Contact Info
LinkedIn
Twitter
GitHub
Google Scholar
Materialize's Resources
Website
Twitter
LinkedIn
Slack
Docs
GitHub
Blog
Events
Guides
Careers
Mentioned Content
Research + Articles
Distributed Differential Privacy and Applications
Performance Report: Benchmarking CockroachDB's TPC-C Performance
Why We Built CockroachDB on top of RocksDB
A History of Transaction Histories
A Brief History of Log Structured Merge Trees
People
Kyle Kingsbury
Bob Muglia
Frank McSherry
Book
Zero To One
Notes
My conversation with Arjun was recorded back in May 2022. Since then, a lot has happened. I recommend looking at the resources below:
About Materialize webpage
Guide: What is a Streaming Database
why
Case Study: Real-time Delivery Tracking UI in a Single Sprint at Onward
Tech Demo: CI/CD Workflows for dbt+Materialize
Announcing The Next Generation of Materialize
Duration:01:49:17