Machine Learning Guide

Technology Podcasts

Machine learning audio course, teaching the fundamentals of machine learning and artificial intelligence. It covers intuition, models (shallow and deep), math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode's details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.

Location:

United States

Language:

English


Episodes

MLA 027 AI Video End-to-End Workflow

7/14/2025
How to maintain character consistency, style consistency, etc. in an AI video. Prosumers can use Google Veo 3's "High-Quality Chaining" for fast social media content. Indie filmmakers can achieve narrative consistency by combining Midjourney V7 for style, Kling for lip-synced dialogue, and Runway Gen-4 for camera control, while professional studios gain full control with a layered ComfyUI pipeline that outputs multi-layer EXR files for standard VFX compositing.
Links: ocdevel.com/mlg/mla-27, Try a walking desk, Descript
AI Audio Tool Selection
Music: Suno, Udio
Sound Effects: ElevenLabs' SFX, SFX Engine
Voice: ElevenLabs, Murf.ai, Play.ht
Open-Source TTS: StyleTTS 2, Coqui's XTTS-v2, Piper TTS
I. Prosumer Workflow: Viral Video
Goal: Rapidly produce branded, short-form video for social media. This method bypasses Veo 3's weaker native "Extend" feature.
Toolchain: Image Concept, Video Generation, Soundtrack, Assembly
Workflow: Create Character Sheet (GPT-4o), Generate Video (Veo 3), Create Music (Udio), Final Edit (CapCut)
II. Indie Filmmaker Workflow: Narrative Shorts
Goal: Create cinematic short films with consistent characters and storytelling focus, using a hybrid of specialized tools.
Toolchain: Visual Foundation, Dialogue Scenes, B-Roll/Action, Voice Generation, Edit & Color
Workflow: Create Visual Foundation (Midjourney V7), Create Dialogue Scenes (ElevenLabs -> Kling), Create B-Roll (Runway Gen-4), Assemble & Grade (DaVinci Resolve)
III. Professional Studio Workflow: Full Control
Goal: Achieve absolute pixel-level control, actor likeness, and integration into standard VFX pipelines using an open-source, modular approach.
Toolchain: Core Engine, VFX Compositing
Control Stack & Workflow: Train Character LoRA, Build ComfyUI Node Graph, Export Multi-Layer EXR, Composite in Fusion

Duration:01:11:37

MLA 026 AI Video Generation: Veo 3 vs Sora, Kling, Runway, Stable Video Diffusion

7/11/2025
Google Veo leads the generative video market with superior 4K photorealism and integrated audio, an advantage derived from its YouTube training data. OpenAI Sora is the top tool for narrative storytelling, while Kuaishou Kling excels at animating static images with realistic, high-speed motion.
Links: ocdevel.com/mlg/mla-26, Try a walking desk, AGNTCY
S-Tier: Google Veo
The market leader due to superior visual quality, physics simulation, 4K resolution, and integrated audio generation, which removes post-production steps. It accurately interprets cinematic prompts ("timelapse," "aerial shots"). Its primary advantage is its integration with Google products, using YouTube's vast video library for rapid model improvement. The professional focus is clear with its filmmaking tool, "Flow."
A-Tier: Sora & Kling
OpenAI Sora ("Storyboard" function), Kuaishou Kling, Summary
Control and Customization: Runway & Stable Diffusion
Runway, Stable Diffusion
Niche Tools: Midjourney & More
Midjourney Video, Avatar Platforms (HeyGen, Synthesia)
Head-to-Head Comparison
Photorealism: Winner. Best 4K detail and physics | Excellent, but can have a stylistic "AI" look | Very strong, especially with human subjects | Good, but a step below the top tier
Consistency: Strong, especially with Flow's scene-building | Co-Winner. Storyboard feature is built for this | Co-Winner. Excels in image-to-video consistency | Good, with character reference tools
Prompt Adherence: Winner (Language). Best understanding of cinematic terms | Best for imaginative/narrative prompts | Strong on motion, less on camera specifics | Good, but relies more on UI tools
Directorial Control: Strong via prompt | Moderate, via prompt and storyboard | Moderate, focused on motion | Winner (Interface). Motion Brush & Director Mode offer direct control
Integrated Audio: Winner. Native dialogue, SFX, and music. Major workflow advantage | Requires post-production | Requires post-production | Requires post-production
Advanced Multi-Tool Workflows
High-Quality Animation, VFX Compositing, High-Volume Marketing
Decision Matrix: Who Should Use What?
The Indie Filmmaker: OpenAI Sora, Google Veo
The VFX Artist: Stable Diffusion (AnimateDiff/ComfyUI)
The Creative Agency: Runway, Google Veo
The AI Artist / Animator: Midjourney + Kling
The Corporate Trainer: HeyGen / Synthesia
Future Trajectory
Pipeline Collapse, The Control Arms Race, Rise of Aggregators

Duration:00:40:39

MLA 025 AI Image Generation: Midjourney vs Stable Diffusion, GPT-4o, Imagen & Firefly

7/9/2025
The 2025 generative AI image market is a trade-off between aesthetic quality, instruction-following, and user control. This episode analyzes the key platforms, comparing Midjourney's artistic output against the superior text generation and prompt adherence of GPT-4o and Imagen 4, the commercial safety of Adobe Firefly, and the total customization of Stable Diffusion.
Links: ocdevel.com/mlg/mla-25, Try a walking desk, AGNTCY
The State of the Market
The market is split by three core philosophies:
The "Artist" (Midjourney): aesthetic excellence
The "Collaborator" (GPT-4o, Imagen 4): conversational co-creation
The "Sovereign Toolkit" (Stable Diffusion): unparalleled control, customization, and privacy
Table 1: 2025 Generative AI Image Tool At-a-Glance Comparison
Artistic Aesthetics & Photorealism; Conversational Control & Instruction Following; Ecosystem Integration & Speed; Ultimate Customization & Control; Commercial Safety & Workflow Integration
Core Platforms
Midjourney v7: artistic quality. Features: Web UI with Draft Mode. Weaknesses: Poor text generation, no API/bans automation.
OpenAI GPT-4o: intelligent co-creator. Features: Conversational refinement. Weaknesses: Slower than competitors, strict content filters.
Google Imagen 4: ecosystem integration.
Stable Diffusion 3: maximum user control. Features: MMDiT architecture. Weaknesses: Steep learning curve.
Adobe Firefly: commercial safety. Features: Trained on Adobe Stock for legal indemnity.
Tools and Concepts
In-painting: Modifying a masked area inside an image
Out-painting: Extending an image beyond its original borders
LoRA (Low-Rank Adaptation): fine-tuned style, character, or concept
ControlNet: enforce the composition, structure, or pose
A1111 vs. ComfyUI: beginner-friendly tabbed interface vs. node-based interface for complex, efficient, and automated workflows
Workflows
"Best of Both Worlds": Generate aesthetic base images in Midjourney, then composite, edit, and add text with precision in Photoshop/Firefly
Single-Ecosystem: seamless integration, commercial safety (Adobe), and convenience (Google)
"Build Your Own Factory": build automated, multi-step pipelines for consistent character generation, advanced upscaling, and video
Decision Framework
Choose by Goal: Fine Art/Concept Art; Logos/Ads with Text: Ideogram; Consistent Character in Specific Pose; Editing/Expanding an Existing Photo
Exclusion Rules: exclude Midjourney; Stable Diffusion is the only option; use Adobe Firefly; use OpenAI or Google; automating Midjourney is a bannable offense

Duration:01:12:33

MLG 036 Autoencoders

5/30/2025
Autoencoders are neural networks that compress data into a smaller "code," enabling dimensionality reduction, data cleaning, and lossy compression by reconstructing original inputs from this code. Advanced autoencoder types, such as denoising, sparse, and variational autoencoders, extend these concepts for applications in generative modeling, interpretability, and synthetic data generation.
Links: ocdevel.com/mlg/36, Try a walking desk, AGNTCY, T.J. Wilder, intrep.io
Fundamentals of Autoencoders
Comparison with Supervised Learning
Use Cases: Dimensionality Reduction and Representation
Feature Learning and Embeddings
Data Search, Clustering, and Compression
Reconstruction Fidelity and Loss Types
Outlier Detection and Noise Reduction
Denoising Autoencoders
Data Imputation
Cryptographic Analogy
Advanced Architectures: Sparse and Overcomplete Autoencoders
Interpretability and Research Example
Variational Autoencoders (VAEs)
VAEs for Synthetic Data and Rare Event Amplification
Conditional Generative Techniques
Practical Considerations and Limitations
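To make the "compress into a code, then reconstruct" idea concrete, here is a minimal sketch of a linear autoencoder in plain Python. It is an illustration only and is not taken from the episode: the toy dataset, the function name `train_linear_autoencoder`, the initial weights, and the learning rate are all assumptions chosen so the example stays small. The data lies on a line in 2-D, so a 1-D code is enough to reconstruct it.

```python
def train_linear_autoencoder(data, steps=2000, lr=0.02):
    """Train a 2-D -> 1-D -> 2-D linear autoencoder with plain gradient descent."""
    w = [0.5, 0.5]  # encoder weights: 2-D input -> 1-D code (hypothetical init)
    v = [0.5, 0.5]  # decoder weights: 1-D code -> 2-D reconstruction
    n = len(data)

    def encode(x):
        return w[0] * x[0] + w[1] * x[1]

    def decode(code):
        return [v[0] * code, v[1] * code]

    def loss():
        # Mean squared reconstruction error over the dataset.
        total = 0.0
        for x in data:
            recon = decode(encode(x))
            total += sum((recon[i] - x[i]) ** 2 for i in range(2))
        return total / n

    initial_loss = loss()
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_v = [0.0, 0.0]
        for x in data:
            code = encode(x)
            err = [v[i] * code - x[i] for i in range(2)]
            # Error signal propagated back through the decoder to the code.
            back = sum(err[i] * v[i] for i in range(2))
            for i in range(2):
                grad_v[i] += 2 * err[i] * code / n
                grad_w[i] += 2 * back * x[i] / n
        for i in range(2):
            w[i] -= lr * grad_w[i]
            v[i] -= lr * grad_v[i]
    return encode, decode, initial_loss, loss()


# Toy data: points on the line y = 2x, so one number per point is enough.
data = [(x, 2 * x) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]
encode, decode, initial_loss, final_loss = train_linear_autoencoder(data)
reconstruction = decode(encode((1.0, 2.0)))
print(initial_loss, final_loss, reconstruction)
```

The reconstruction error is the only training signal, which is what separates this setup from supervised learning: the input doubles as the target.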

Duration:01:05:55

MLG 035 Large Language Models 2

5/8/2025
At inference, large language models use in-context learning with zero-, one-, or few-shot examples to perform new tasks without weight updates, and can be grounded with Retrieval Augmented Generation (RAG) by embedding documents into vector databases for real-time factual lookup using cosine similarity. LLM agents autonomously plan, act, and use external tools via orchestrated loops with persistent memory, while recent benchmarks like GPQA (STEM reasoning), SWE Bench (agentic coding), and MMMU (multimodal college-level tasks) test performance alongside prompt engineering techniques such as chain-of-thought reasoning, structured few-shot prompts, positive instruction framing, and iterative self-correction.
Links: ocdevel.com/mlg/mlg35, AGNTCY, Try a walking desk
In-Context Learning (ICL)
Definition; Types: Zero-shot, One-shot, Few-shot; Mechanism; Emergent Properties
Retrieval Augmented Generation (RAG) and Grounding
Grounding: Motivation, Benefit; RAG Workflow: Embedding, Storage, Retrieval, Augmentation, Generation; Advanced RAG
LLM Agents
Overview; Key Components: Reasoning Engine (LLM Core), Planning Module, Memory, Tools and APIs; Capabilities; Current Trends
Multimodal Large Language Models (MLLMs)
Definition; Architecture: Modality-Specific Encoders, Fusion/Alignment Layer, Unified Transformer Backbone; Recent Advances; Functionality
Advanced LLM Architectures and Training Directions
Predictive Abstract Representation, Patch-Level Training, Concept-Centric Modeling, Multi-Token Prediction
Evaluation Benchmarks (as of 2025)
Key Benchmarks Used for LLM Evaluation: GPQA (Diamond), SWE Bench Verified, MMMU, HumanEval, HLE (Humanity's Last Exam), LiveCodeBench, MLPerf Inference v5.0 Long Context, MultiChallenge Conversational AI, TAUBench/PFCL, TruthfulQA
Prompt Engineering: High-Impact Techniques
Foundational Approaches, Few-Shot Prompting, Chain of Thought, Clarity and Structure, Affirmative Directives, Iterative Self-Refinement, System Prompt/Role Assignment, Guideline
Trends and Research Outlook
Inference-time compute, Agentic LLMs, multimodal reasoning, Prompt engineering, benchmarking
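The RAG workflow listed above (embedding, storage, retrieval, augmentation) can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the episode's implementation: `embed` is a hypothetical bag-of-words stand-in for a learned embedding model, `store` is an in-memory list standing in for a vector database, and the documents and prompt template are made up. Only the cosine-similarity ranking step is the technique named in the description.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: word-count vector (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, store, k=1):
    """Return the k stored documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, store, k=1):
    """Augmentation step: prepend retrieved context to the user's question."""
    context = "\n".join(retrieve(query, store, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Storage step: keep each document next to its embedding.
documents = [
    "autoencoders compress data into a smaller code",
    "transformers use self-attention over tokens",
    "docker containers package an environment for deployment",
]
store = [(doc, embed(doc)) for doc in documents]

print(build_prompt("how do transformers use attention", store))
```

The generation step is omitted: in a real pipeline the string returned by `build_prompt` would be sent to an LLM, which answers using the retrieved context rather than relying only on its weights.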

Duration:00:45:25

MLG 034 Large Language Models 1

5/7/2025
Explains advancements in large language models (LLMs): scaling laws (the relationships among model size, data size, and compute) and how emergent abilities such as in-context learning, multi-step reasoning, and instruction following arise once certain scaling thresholds are crossed. Covers the evolution of the transformer architecture with Mixture of Experts (MoE), describes the three-phase training process culminating in Reinforcement Learning from Human Feedback (RLHF) for model alignment, and explores advanced reasoning techniques such as chain-of-thought prompting, which significantly improve complex task performance.
Links: ocdevel.com/mlg/mlg34, AGNTCY, Try a walking desk
Transformer Foundations and Scaling Laws
Transformers, Scaling Laws
Emergent Abilities in LLMs
Emergence, In-Context Learning (ICL), Instruction Following, Multi-Step Reasoning & Chain of Thought (CoT), Discontinuity & Debate
Architectural Evolutions: Mixture of Experts (MoE)
MoE Layers, Specialization & Efficiency
The Three-Phase Training Process
1. Unsupervised Pre-Training
2. Supervised Fine Tuning (SFT)
3. Reinforcement Learning from Human Feedback (RLHF)
Advanced Reasoning Techniques
Prompt Engineering, Chain of Thought (CoT) Prompting, Automated Reasoning Optimization
Optimization for Training and Inference
Tradeoffs, Current Trends

Duration:00:50:48

MLA 024 Code AI MCP Servers, ML Engineering

4/13/2025
Tool Use and Model Context Protocol (MCP)
Notes and resources at ocdevel.com/mlg/mla-24
Try a walking desk to stay healthy while you study or work!
Tool Use in Vibe Coding Agents
File Operations, Executable Commands, Browser Integration
Model Context Protocol (MCP)
Standardization; Implementation: MCP Client, MCP Server; Local and Cloud Frameworks: Local (stdio MCP), Cloud (SSE MCP)
Expanding AI Capabilities with MCP Servers
Directories: modelcontextprotocol/servers; Use Cases: Automation Beyond Coding, Creative Solutions
AI Tools in Machine Learning
Automating ML Process, Auto ML and Feature Engineering, Pipeline Construction and Deployment, Active Experimentation, Jupyter Integration Challenges, Practical Strategies
Conclusion
Action Plan for ML Engineers

Duration:00:43:38

MLA 023 Code AI Models & Modes

4/13/2025
Notes and resources at ocdevel.com/mlg/mla-23
Try a walking desk to stay healthy while you study or work!
Model Current Leaders
According to the Aider Leaderboard (as of April 12, 2025), leading models for vibe-coding include: Gemini 2.5 Pro Preview 03-25, Claude 3.7 Sonnet, DeepSeek R1 with Claude 3.5 Sonnet
Local Models
Tools for Local Models: Ollama; Privacy and Security; Performance Trade-offs
Fine-Tuning Models
Customization, Advanced Usage
Tips and Best Practices
Judicious Use of the Key, Concurrent Feature Implementation, Boomerang mode, Continued Learning, Roo Code

Duration:00:37:35

MLA 022 Code AI Tools

2/9/2025
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk
Show notes: https://ocdevel.com/mlg/mla-22
Tools discussed:
https://codeium.com/windsurf
https://github.com/features/copilot
https://www.cursor.com/
https://github.com/cline/cline
https://github.com/RooVetGit/Roo-Code
https://aider.chat/
Other:
https://aider.chat/docs/leaderboards/
https://www.youtube.com/watch?v=QlUt06XLbJE&feature=youtu.be
https://www.reddit.com/r/chatgptcoding/
Examines the rapidly evolving world of AI coding tools designed to boost programming productivity by acting as a pair programming partner. The discussion groups these tools into three categories:
• Hands-Off Tools: These include solutions that work on fixed monthly fees and require minimal user intervention. GitHub Copilot started with simple tab completions and now offers an agent mode similar to Cursor, which stands out for its advanced codebase indexing and intelligent file searching. Windsurf is noted for its simplicity (accepting prompts and performing automated edits), but some users report performance throttling after prolonged use.
• Hands-On Tools: Aider is presented as a command-line utility that demands configuration and user involvement. It allows developers to specify files and settings, and it efficiently manages token usage by sending prompts in diff format. Aider also implements an “architect versus edit” approach: a reasoning model (such as DeepSeek R1) first outlines a sequence of changes, then an editor model (like Claude 3.5 Sonnet) produces precise code edits. This dual-model strategy enhances accuracy and reduces token costs, especially for complex tasks.
• Intermediate Power Tools: Open-source tools such as Cline and its more advanced fork, RooCode, require users to supply their own API keys and pay per token. These tools offer robust, agentic features, including codebase indexing, file editing, and even browser automation. RooCode stands out with its ability to autonomously expand functionality through integrations (for example, managing cloud resources or querying issue trackers), making it particularly attractive for tinkerers and power users.
A decision framework is suggested: for those new to AI coding assistants or with limited budgets, starting with Cursor (or cautiously exploring Copilot’s new features) is recommended. For developers who want to customize their workflow and dive deep into the tooling, RooCode or Cline offer greater control, always paired with Aider for precise and token-efficient code edits.
Also reviews model performance using a coding benchmark leaderboard that updates frequently. The current top-performing combination uses DeepSeek R1 as the architect and Claude 3.5 Sonnet as the editor, with alternatives such as OpenAI’s o1 and o3-mini available. Tools like OpenRouter are mentioned as a way to consolidate API key management and reduce token costs.

Duration:00:46:35

MLG 033 Transformers

2/8/2025
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk
Show notes: https://ocdevel.com/mlg/33
3Blue1Brown videos: https://3blue1brown.com/
Background & Motivation: RNN Limitations; Breakthrough
Core Architecture: Layer Stack; Positional Encodings
Self-Attention Mechanism: Q, K, V Explained (Query (Q), Key (K), Value (V)); Multi-Head Attention; Dot-Product & Scaling
Masking: Causal Masking; Padding Masks
Feed-Forward Networks (MLPs): Transformation & Storage; Depth & Expressivity
Residual Connections & Normalization: Residual Links; Layer Normalization
Scalability & Efficiency Considerations: Parallelization Advantage; Complexity Trade-offs
Training Paradigms & Emergent Properties: Pretraining & Fine-Tuning; Emergent Behavior
Interpretability & Knowledge Distribution: Distributed Representation; Debate on Attention
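As a companion to the Q, K, V, dot-product scaling, and causal masking topics above, here is a minimal single-head sketch in plain Python. It assumes the queries, keys, and values have already been produced by learned projections (omitted here), and the function names and toy vectors are illustrative, not from the episode. Causal masking is implemented by only scoring positions 0..i for query i, which is equivalent to giving future positions a score of negative infinity before the softmax.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(Q, K, V):
    """Scaled dot-product attention where position i attends only to positions <= i."""
    d_k = len(K[0])
    outputs, all_weights = [], []
    for i, q in enumerate(Q):
        # Dot-product scores against keys 0..i, scaled by sqrt(d_k).
        scores = [
            sum(qc * kc for qc, kc in zip(q, K[j])) / math.sqrt(d_k)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the visible value vectors.
        out = [
            sum(w * V[j][c] for j, w in enumerate(weights))
            for c in range(len(V[0]))
        ]
        outputs.append(out)
        # Record zeros for the masked (future) positions.
        all_weights.append(weights + [0.0] * (len(K) - i - 1))
    return outputs, all_weights


# Toy sequence of three tokens with 2-D vectors (hypothetical values).
Q = K = V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(Q, K, V)
for row in weights:
    print([round(w, 3) for w in row])
```

The first token can only attend to itself, so its output equals its own value vector; later tokens mix information from everything before them.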

Duration:00:42:14

MLA 021 Databricks

6/21/2022
Try a walking desk while studying ML or working on your projects! Discussing Databricks with Ming Chang from Raybeam (part of DEPT®)

Duration:00:26:00

MLA 020 Kubeflow

1/28/2022
Support my new podcast: Lefnire's Life Hacks
Conversation with Dirk-Jan
Kubeflow (vs cloud native solutions like SageMaker)
Dirk-Jan Verdoorn - Data Scientist at Dept Agency
Kubeflow. (From the website:) The Machine Learning Toolkit for Kubernetes. The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.
TensorFlow Extended (TFX). If using TensorFlow with Kubeflow, combine with TFX for maximum power. (From the website:) TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines. When you're ready to move your models from research to production, use TFX to create and manage a production pipeline.
Alternatives: Airflow, MLflow

Duration:01:07:56

MLA 019 DevOps

1/13/2022
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk
Chatting with co-workers about the role of DevOps in a machine learning engineer's life
Expert coworkers at Dept: Matt Merrill, Jirawat Uttayaya, The Ship It Podcast
DevOps tools: Terraform, Ansible
Pictures (funny and serious): Which AWS container service should I use?; A visual guide on troubleshooting Kubernetes deployments; Public Cloud Services Comparison; Killed by Google
aCloudGuru AWS curriculum

Duration:01:14:53

MLA 018 Descript

11/6/2021
Support my new podcast: Lefnire's Life Hacks
(Optional episode) just showcasing a cool application using machine learning
Dept uses Descript for some of their podcasting. I'm using it like a maniac, I think they're surprised at how into it I am. Check out the transcript & see how it performed.
Descript, The Ship It Podcast, Brandbeats Podcast by BASIC

Duration:00:06:21

MLA 017 AWS Local Development

11/6/2021
Try a walking desk while studying ML or working on your projects!
Show notes: ocdevel.com/mlg/mla-17
Developing on AWS first (SageMaker or other)
Consider developing against AWS as your local development environment, rather than only your cloud deployment environment.
Solutions: Lambda; SageMaker Studio; Cloud9; Connect to deployed infrastructure via Client VPN (Terraform example, YouTube tutorial, Creating the keys); LocalStack
Infrastructure as Code: Terraform, CDK, Serverless

Duration:01:04:19

MLA 016 SageMaker 2

11/5/2021
Support my new podcast: Lefnire's Life Hacks
Part 2 of deploying your ML models to the cloud with SageMaker (MLOps)
MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down.)
SageMaker: JumpStart, Deploy, Pipelines, Monitor, Kubernetes, Neo

Duration:00:59:42

MLA 015 SageMaker 1

11/4/2021
Support my new podcast: Lefnire's Life Hacks
Show notes
Part 1 of deploying your ML models to the cloud with SageMaker (MLOps)
MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down.)
SageMaker: Data Wrangler, Feature Store, Ground Truth, Clarify, Studio, Autopilot, Debugger, Distributed Training
And I forgot to mention JumpStart, I'll mention next time.

Duration:00:46:45

MLA 014 Machine Learning Server

1/17/2021
Try a walking desk while studying ML or working on your projects! Server-side ML. Training & hosting for inference, with a goal towards serverless. AWS SageMaker, Batch, Lambda, EFS, Cortex.dev

Duration:00:52:05

MLA 013 Customer Facing Tech Stack

1/2/2021
Support my new podcast: Lefnire's Life Hacks Client, server, database, etc.

Duration:00:46:53

MLA 012 Docker

11/8/2020
Support my new podcast: Lefnire's Life Hacks Use Docker for env setup on localhost & cloud deployment, instead of pyenv / Anaconda. I recommend Windows for your desktop.

Duration:00:30:57