
Machine Learning Guide

Technology Podcasts

Machine learning audio course, teaching the fundamentals of machine learning and artificial intelligence. It covers intuition, models (shallow and deep), math, languages, frameworks, etc. Where your other ML resources provide the trees, I provide the forest. Consider MLG your syllabus, with highly-curated resources for each episode's details at ocdevel.com. Audio is a great supplement during exercise, commute, chores, etc.

Location:

United States


Language:

English


Episodes

MLG 036 Autoencoders

5/30/2025
Autoencoders are neural networks that compress data into a smaller "code," enabling dimensionality reduction, data cleaning, and lossy compression by reconstructing original inputs from this code. Advanced autoencoder types, such as denoising, sparse, and variational autoencoders, extend these concepts for applications in generative modeling, interpretability, and synthetic data generation.

Links: ocdevel.com/mlg/36. Try a walking desk. AGNTCY. T.J. Wilder, intrep.io.

Topics:
- Fundamentals of Autoencoders
- Comparison with Supervised Learning
- Use Cases: Dimensionality Reduction and Representation
- Feature Learning and Embeddings
- Data Search, Clustering, and Compression
- Reconstruction Fidelity and Loss Types
- Outlier Detection and Noise Reduction
- Denoising Autoencoders
- Data Imputation
- Cryptographic Analogy
- Advanced Architectures: Sparse and Overcomplete Autoencoders
- Interpretability and Research Example
- Variational Autoencoders (VAEs)
- VAEs for Synthetic Data and Rare Event Amplification
- Conditional Generative Techniques
- Practical Considerations and Limitations
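The compress-then-reconstruct idea can be sketched with a tiny linear autoencoder, a minimal sketch assuming NumPy; the data, dimensions, and learning rate here are made up for illustration, not from the episode:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 points that really live on a 1-D line inside 3-D space,
# plus a little noise -- a good candidate for compression to one number.
t = rng.uniform(-1, 1, size=(200, 1))
X = t @ np.array([[2.0, -1.0, 0.5]]) + 0.01 * rng.normal(size=(200, 3))

# Linear autoencoder: encoder W_e (3 -> 1) and decoder W_d (1 -> 3),
# trained to minimize squared reconstruction error.
W_e = rng.normal(scale=0.1, size=(3, 1))
W_d = rng.normal(scale=0.1, size=(1, 3))
lr = 0.1

for _ in range(1000):
    code = X @ W_e          # compress each point to a 1-D "code"
    X_hat = code @ W_d      # reconstruct the input from the code
    err = X_hat - X
    # Gradients of the squared error (up to a constant factor)
    grad_W_d = code.T @ err / len(X)
    grad_W_e = X.T @ (err @ W_d.T) / len(X)
    W_d -= lr * grad_W_d
    W_e -= lr * grad_W_e

mse = float(np.mean((X @ W_e @ W_d - X) ** 2))
print(f"reconstruction MSE: {mse:.5f}")
```

The reconstruction error ends up near the noise floor: three numbers per point are recovered from a single stored number, which is the dimensionality-reduction use case in miniature.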

Duration:01:05:55


MLG 035 Large Language Models 2

5/8/2025
At inference, large language models use in-context learning with zero-, one-, or few-shot examples to perform new tasks without weight updates, and can be grounded with Retrieval Augmented Generation (RAG) by embedding documents into vector databases for real-time factual lookup using cosine similarity. LLM agents autonomously plan, act, and use external tools via orchestrated loops with persistent memory, while recent benchmarks like GPQA (STEM reasoning), SWE Bench (agentic coding), and MMMU (multimodal college-level tasks) test performance alongside prompt engineering techniques such as chain-of-thought reasoning, structured few-shot prompts, positive instruction framing, and iterative self-correction.

Links: ocdevel.com/mlg/mlg35. AGNTCY. Try a walking desk.

Topics:
- In-Context Learning (ICL): definition; types (zero-shot, one-shot, few-shot); mechanism; emergent properties
- Retrieval Augmented Generation (RAG) and Grounding: grounding, motivation, benefit; RAG workflow (embedding, storage, retrieval, augmentation, generation); advanced RAG
- LLM Agents: overview; key components (reasoning engine / LLM core, planning module, memory, tools and APIs); capabilities; current trends
- Multimodal Large Language Models (MLLMs): definition; architecture (modality-specific encoders, fusion/alignment layer, unified transformer backbone); recent advances; functionality
- Advanced LLM Architectures and Training Directions: predictive abstract representation; patch-level training; concept-centric modeling; multi-token prediction
- Evaluation Benchmarks (as of 2025): GPQA (Diamond), SWE Bench Verified, MMMU, HumanEval, HLE (Humanity's Last Exam), LiveCodeBench, MLPerf Inference v5.0 Long Context, MultiChallenge Conversational AI, TAUBench/PFCL, TruthfulQA
- Prompt Engineering: High-Impact Techniques: foundational approaches (few-shot prompting, chain of thought); clarity and structure; affirmative directives; iterative self-refinement; system prompt / role assignment; guideline
- Trends and Research Outlook

Tags: inference-time compute, agentic LLMs, multimodal reasoning, prompt engineering, benchmarking
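The RAG retrieval step described above can be sketched in miniature. This is a toy sketch: the bag-of-words "embedding" and the three-document corpus are stand-ins for a real embedding model and vector database, but the cosine-similarity ranking is the same idea:

```python
import math
from collections import Counter

# Hypothetical mini corpus standing in for an embedded document store.
docs = [
    "the transformer architecture uses self attention",
    "retrieval augmented generation grounds llm answers in documents",
    "walking desks are great for studying",
]

def embed(text: str) -> Counter:
    # Stand-in embedding: bag-of-words counts. A real RAG system would
    # use a learned embedding model producing dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

context = retrieve("how does retrieval augmented generation work")
# The retrieved passage would be prepended to the LLM prompt as grounding.
print(context[0])
```

The query lands on the RAG document because it shares the most terms with it; in a real system the dense embeddings capture semantic rather than lexical overlap.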

Duration:00:45:25


MLG 034 Large Language Models 1

5/7/2025
Explains advancements in large language models (LLMs): scaling laws - the relationships among model size, data size, and compute - and how emergent abilities such as in-context learning, multi-step reasoning, and instruction following arise once certain scaling thresholds are crossed. Covers the evolution of the transformer architecture with Mixture of Experts (MoE), describes the three-phase training process culminating in Reinforcement Learning from Human Feedback (RLHF) for model alignment, and explores advanced reasoning techniques such as chain-of-thought prompting, which significantly improve complex-task performance.

Links: ocdevel.com/mlg/mlg34. AGNTCY. Try a walking desk.

Topics:
- Transformer Foundations and Scaling Laws: transformers; scaling laws
- Emergent Abilities in LLMs: emergence; in-context learning (ICL); instruction following; multi-step reasoning and chain of thought (CoT); discontinuity and debate
- Architectural Evolutions: Mixture of Experts (MoE): MoE layers; specialization and efficiency
- The Three-Phase Training Process: 1. unsupervised pre-training; 2. supervised fine-tuning (SFT); 3. reinforcement learning from human feedback (RLHF)
- Advanced Reasoning Techniques: prompt engineering; chain-of-thought (CoT) prompting; automated reasoning optimization
- Optimization for Training and Inference: tradeoffs; current trends
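Chain-of-thought prompting can be illustrated with a prompt sketch; the example questions and wording below are invented for illustration, not from the episode:

```python
# Chain-of-thought (CoT) prompting: the few-shot example demonstrates the
# intermediate reasoning steps, nudging the model to "think out loud"
# before answering rather than jumping straight to a final answer.
cot_prompt = """Q: A store had 23 apples and sold 9. How many remain?
A: The store started with 23 apples. It sold 9, so 23 - 9 = 14. The answer is 14.

Q: A train travels 60 miles per hour for 3 hours. How far does it go?
A:"""

# Without CoT, the same question would be asked directly:
direct_prompt = (
    "Q: A train travels 60 miles per hour for 3 hours. How far does it go?\nA:"
)

print(cot_prompt)
```

The only difference is the worked example with explicit arithmetic; on multi-step problems that small change is what drives the large accuracy gains discussed in the episode.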

Duration:00:50:48


MLA 024 Code AI MCP Servers, ML Engineering

4/13/2025
Tool Use and Model Context Protocol (MCP). Notes and resources at ocdevel.com/mlg/mla-24. Try a walking desk to stay healthy while you study or work!

Topics:
- Tool Use in Vibe Coding Agents: file operations; executable commands; browser integration
- Model Context Protocol (MCP): standardization; implementation (MCP client, MCP server); local and cloud frameworks (local stdio MCP, cloud SSE MCP)
- Expanding AI Capabilities with MCP Servers: directories (modelcontextprotocol/servers); use cases; automation beyond coding; creative solutions
- AI Tools in Machine Learning: automating the ML process; AutoML and feature engineering; pipeline construction and deployment; active experimentation; Jupyter integration challenges; practical strategies
- Conclusion: action plan for ML engineers

Duration:00:43:38


MLA 023 Code AI Models & Modes

4/13/2025
Notes and resources at ocdevel.com/mlg/mla-23. Try a walking desk to stay healthy while you study or work!

Model Current Leaders: according to the Aider Leaderboard (as of April 12, 2025), leading models for vibe coding include:
- Gemini 2.5 Pro Preview 03-25
- Claude 3.7 Sonnet
- DeepSeek R1 with Claude 3.5 Sonnet

Topics:
- Local Models: tools for local models (Ollama); privacy and security; performance trade-offs
- Fine-Tuning Models: customization; advanced usage
- Tips and Best Practices: judicious use of the key; concurrent feature implementation; Boomerang mode; continued learning; Roo Code

Duration:00:37:35


MLA 022 Code AI Tools

2/9/2025
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk

Show notes: https://ocdevel.com/mlg/mla-22

Tools discussed:
- https://codeium.com/windsurf
- https://github.com/features/copilot
- https://www.cursor.com/
- https://github.com/cline/cline
- https://github.com/RooVetGit/Roo-Code
- https://aider.chat/

Other:
- https://aider.chat/docs/leaderboards/
- https://www.youtube.com/watch?v=QlUt06XLbJE&feature=youtu.be
- https://www.reddit.com/r/chatgptcoding/

Examines the rapidly evolving world of AI coding tools designed to boost programming productivity by acting as a pair programming partner. The discussion groups these tools into three categories:

• Hands-Off Tools: These include solutions that work on fixed monthly fees and require minimal user intervention. GitHub Copilot started with simple tab completions and now offers an agent mode similar to Cursor, which stands out for its advanced codebase indexing and intelligent file searching. Windsurf is noted for its simplicity, accepting prompts and performing automated edits, but some users report performance throttling after prolonged use.

• Hands-On Tools: Aider is presented as a command-line utility that demands configuration and user involvement. It allows developers to specify files and settings, and it efficiently manages token usage by sending prompts in diff format. Aider also implements an "architect versus edit" approach: a reasoning model (such as DeepSeek R1) first outlines a sequence of changes, then an editor model (like Claude 3.5 Sonnet) produces precise code edits. This dual-model strategy enhances accuracy and reduces token costs, especially for complex tasks.

• Intermediate Power Tools: Open-source tools such as Cline and its more advanced fork, RooCode, require users to supply their own API keys and pay per token. These tools offer robust, agentic features, including codebase indexing, file editing, and even browser automation.
RooCode stands out with its ability to autonomously expand functionality through integrations (for example, managing cloud resources or querying issue trackers), making it particularly attractive for tinkerers and power users.

A decision framework is suggested: for those new to AI coding assistants or with limited budgets, starting with Cursor (or cautiously exploring Copilot's new features) is recommended. For developers who want to customize their workflow and dive deep into the tooling, RooCode or Cline offer greater control, always paired with Aider for precise and token-efficient code edits.

Also reviews model performance using a coding benchmark leaderboard that updates frequently. The current top-performing combination uses DeepSeek R1 as the architect and Claude 3.5 Sonnet as the editor, with alternatives such as OpenAI's o1 and o3-mini available. Tools like OpenRouter are mentioned as a way to consolidate API key management and reduce token costs.

Duration:00:46:35


MLG 033 Transformers

2/8/2025
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk

Show notes: https://ocdevel.com/mlg/33
3Blue1Brown videos: https://3blue1brown.com/

Topics:
- Background and Motivation: RNN limitations; the transformer breakthrough
- Core Architecture: layer stack; positional encodings
- Self-Attention Mechanism: Q, K, V explained (Query, Key, Value); multi-head attention; dot product and scaling
- Masking: causal masking; padding masks
- Feed-Forward Networks (MLPs): transformation and storage; depth and expressivity
- Residual Connections and Normalization: residual links; layer normalization
- Scalability and Efficiency Considerations: parallelization advantage; complexity trade-offs
- Training Paradigms and Emergent Properties: pretraining and fine-tuning; emergent behavior
- Interpretability and Knowledge Distribution: distributed representation; debate on attention
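The self-attention mechanism with Q, K, V, scaling, and causal masking can be sketched in NumPy. This is a single head with identity projections and random data for brevity; a real transformer learns separate W_Q, W_K, W_V matrices per head:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, causal=False):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    if causal:
        # Causal mask: position i may not attend to positions j > i.
        keep = np.tril(np.ones_like(scores, dtype=bool))
        scores = np.where(keep, scores, -np.inf)
    # Softmax over the key axis (numerically stabilized)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))  # 4 token embeddings of dimension 8
out, w = scaled_dot_product_attention(x, x, x, causal=True)
print(np.round(w, 2))  # lower-triangular: each token attends only to its past
```

Each row of the weight matrix sums to 1, and the causal mask zeroes the upper triangle, which is what lets the same stack train on all positions in parallel while still predicting left to right.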

Duration:00:42:14


MLA 021 Databricks

6/21/2022
Try a walking desk while studying ML or working on your projects! Discussing Databricks with Ming Chang from Raybeam (part of DEPT®)

Duration:00:26:00


MLA 020 Kubeflow

1/28/2022
Support my new podcast: Lefnire's Life Hacks

Conversation with Dirk-Jan Verdoorn, Data Scientist at Dept Agency, about Kubeflow (vs. cloud-native solutions like SageMaker).

Kubeflow. (From the website:) The Machine Learning Toolkit for Kubernetes. The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable. Our goal is not to recreate other services, but to provide a straightforward way to deploy best-of-breed open-source systems for ML to diverse infrastructures. Anywhere you are running Kubernetes, you should be able to run Kubeflow.

TensorFlow Extended (TFX). If using TensorFlow with Kubeflow, combine with TFX for maximum power. (From the website:) TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines. When you're ready to move your models from research to production, use TFX to create and manage a production pipeline.

Alternatives: Airflow; MLflow

Duration:01:07:56


MLA 019 DevOps

1/13/2022
Try a walking desk while studying ML or working on your projects! https://ocdevel.com/walk

Chatting with co-workers about the role of DevOps in a machine learning engineer's life.

Expert coworkers at Dept: Matt Merrill; Jirawat Uttayaya; The Ship It Podcast

DevOps tools: Terraform; Ansible

Pictures (funny and serious): Which AWS container service should I use?; A visual guide on troubleshooting Kubernetes deployments; Public Cloud Services Comparison; Killed by Google; aCloudGuru AWS curriculum

Duration:01:14:53


MLA 018 Descript

11/6/2021
Support my new podcast: Lefnire's Life Hacks

(Optional episode) Just showcasing a cool application using machine learning. Dept uses Descript for some of their podcasting. I'm using it like a maniac; I think they're surprised at how into it I am. Check out the transcript and see how it performed.

Links: Descript; The Ship It Podcast; Brandbeats Podcast by BASIC

Duration:00:06:21


MLA 017 AWS Local Development

11/6/2021
Try a walking desk while studying ML or working on your projects! Show notes: ocdevel.com/mlg/mla-17

Developing on AWS first (SageMaker or other): consider developing against AWS as your local development environment, rather than only your cloud deployment environment.

Solutions: Lambda; SageMaker Studio; Cloud9; connect to deployed infrastructure via Client VPN (Terraform example, YouTube tutorial, creating the keys); LocalStack

Infrastructure as Code: Terraform; CDK; Serverless

Duration:01:04:19


MLA 016 SageMaker 2

11/5/2021
Support my new podcast: Lefnire's Life Hacks

Part 2 of deploying your ML models to the cloud with SageMaker (MLOps). MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down).

SageMaker: JumpStart; Deploy; Pipelines; Monitor; Kubernetes; Neo

Duration:00:59:42


MLA 015 SageMaker 1

11/4/2021
Support my new podcast: Lefnire's Life Hacks

Show notes. Part 1 of deploying your ML models to the cloud with SageMaker (MLOps). MLOps is deploying your ML models to the cloud. See MadeWithML for an overview of tooling (also generally a great ML educational run-down).

SageMaker: Data Wrangler; Feature Store; Ground Truth; Clarify; Studio; Autopilot; Debugger; Distributed Training. And I forgot to mention JumpStart; I'll mention it next time.

Duration:00:46:45


MLA 014 Machine Learning Server

1/17/2021
Try a walking desk while studying ML or working on your projects! Server-side ML. Training & hosting for inference, with a goal towards serverless. AWS SageMaker, Batch, Lambda, EFS, Cortex.dev

Duration:00:52:05


MLA 013 Customer Facing Tech Stack

1/2/2021
Support my new podcast: Lefnire's Life Hacks Client, server, database, etc.

Duration:00:46:53


MLA 012 Docker

11/8/2020
Support my new podcast: Lefnire's Life Hacks Use Docker for env setup on localhost & cloud deployment, instead of pyenv / Anaconda. I recommend Windows for your desktop.

Duration:00:30:57


MLG 032 Cartesian Similarity Metrics

11/8/2020
Try a walking desk while studying ML or working on your projects! Show notes at ocdevel.com/mlg/32.

L1/L2 norms, Manhattan and Euclidean distances, cosine distance, dot product. Topics: normed distances; dot product; cosine (normalized dot product).
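The metrics covered in this episode can be written out directly in plain Python (the vectors are illustrative values chosen so the results are easy to check by hand):

```python
import math

def manhattan(a, b):
    # L1 norm of the difference: sum of absolute coordinate gaps
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    # L2 norm of the difference: straight-line distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product of the normalized vectors: direction only, magnitude ignored
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(manhattan(u, v))          # 1 + 2 + 3 = 6
print(euclidean(u, v))          # sqrt(1 + 4 + 9)
print(dot(u, v))                # 2 + 8 + 18 = 28
print(cosine_similarity(u, v))  # 1.0: same direction, different magnitude
```

Note how v = 2u gives a nonzero Manhattan and Euclidean distance but a cosine similarity of exactly 1, which is why cosine is the usual choice when only direction (e.g. of embeddings) matters.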

Duration:00:41:52


MLA 011 Practical Clustering

11/7/2020
Try a walking desk while studying ML or working on your projects! K-means (sklearn vs. FAISS), finding n_clusters via inertia/silhouette, Agglomerative, DBSCAN/HDBSCAN
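K-means and the inertia heuristic for choosing n_clusters can be sketched in plain Python. This is Lloyd's algorithm on a made-up two-blob dataset; in practice you would reach for sklearn or FAISS as discussed in the episode:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: assign points to their nearest centroid,
    recompute centroids as cluster means, and repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        centroids = [
            tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
    # Inertia: total squared distance from each point to its centroid.
    # Plotting inertia against k and looking for the "elbow" is one way
    # to pick n_clusters.
    inertia = sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
        for p in points
    )
    return centroids, inertia

# Two obvious blobs; k=2 should cut inertia sharply compared to k=1.
pts = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
_, inertia_1 = kmeans(pts, 1)
_, inertia_2 = kmeans(pts, 2)
print(inertia_1, inertia_2)
```

The sharp drop from k=1 to k=2, followed by diminishing returns at higher k, is the elbow the episode describes; silhouette scores are the complementary check.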

Duration:00:34:23


MLA 010 NLP packages: transformers, spaCy, Gensim, NLTK

10/27/2020
Support my new podcast: Lefnire's Life Hacks

NLTK: Swiss Army knife. Gensim: LDA topic modeling, n-grams. spaCy: linguistics. transformers: high-level business NLP tasks.

Duration:00:25:31