ML Cult

Location:

United States

Genres:

Science & Technology News

Description:

A curated podcast covering the latest machine learning developments. The text and audio are generated using AI.

Language:

English

Website:

https://podcast.mlcult.org

Episodes

October 27th, 2023 - AI Unleashed: Decoding Sycophancy, Mastering Control, and Crafting 3D Realities

10/27/2023

Towards Understanding Sycophancy in Language Models
Controlled Decoding from Language Models
HyperFields: Towards Zero-Shot Generation of NeRFs from Text

Duration:00:08:32

October 26th, 2023 - Frontiers of AI: From Quantum Compression to Visionary Transformers

10/26/2023

LLM-FP4: 4-Bit Floating-Point Quantized Transformers
Detecting Pretraining Data from Large Language Models
ConvNets Match Vision Transformers at Scale
A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models

Duration:00:14:15

October 25th, 2023 - Pixel to Perception: Matryoshka Synthesis, GPT-3's Linguistic Mysteries, Woodpecker's Visual Refinement, and SAM-CLIP's Vision Evolution

10/25/2023

Matryoshka Diffusion Models
Dissecting In-Context Learning of Translations in GPTs
Woodpecker: Hallucination Correction for Multimodal Large Language Models
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding

Duration:00:11:12

October 24th, 2023 - Neural Visions Unveiled: From FreeNoise's Video Clarity, HallusionBench's Reality Check, to FlashEdit's Instant Image Refinements

10/24/2023

FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models
Localizing and Editing Knowledge in Text-to-Image Generative Models

Duration:00:06:35

October 23rd, 2023 - Unlocking AI's Potential: From Open Waters to Self-Enhancing Miniature Models

10/23/2023

H2O Open Ecosystem for State-of-the-art Large Language Models
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Teaching Language Models to Self-Improve through Interactive Demonstrations

Duration:00:06:35

October 4th, 2023 - NeuroFrontiers: Pensive Processors, Natural Evolution, and the New Age of Linguistic Titans

10/4/2023

Think before you speak: Training Language Models With Pause Tokens
Towards Self-Assembling Artificial Neural Networks through Neural Developmental Programs
Efficient Streaming Language Models with Attention Sinks
Large Language Models Cannot Self-Correct Reasoning Yet
SmartPlay: A Benchmark for LLMs as Intelligent Agents

Duration:00:13:09

October 3rd, 2023 - Evolution in Text: Self-Improvement, Synthesis, and Scrutiny

10/3/2023

Enable Language Models to Implicitly Learn Self-Improvement From Data
PixArt-alpha: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
FELM: Benchmarking Factuality Evaluation of Large Language Models

Duration:00:07:51

October 2nd, 2023 - Math to Motion: ToRA, Decaf, and DRaFT Transformations

10/2/2023

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving
Decaf: Monocular Deformation Capture for Face and Hand Interactions
Directly Fine-Tuning Diffusion Models on Differentiable Rewards

Duration:00:06:52

September 29th, 2023 - Masters of AI Metamorphosis: From Long-Context Linguistics to 3D Dreamscapes

9/29/2023

Effective Long-Context Scaling of Foundation Models
Demystifying CLIP Data
Vision Transformers Need Registers
Qwen Technical Report
DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

Duration:00:16:14

September 28th, 2023 - Neural Vistas & Visual Alchemy: From NeuRBF Reconstructions to ScalarSimplicity in AI Imagery

9/28/2023

NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions
Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation
Finite Scalar Quantization: VQ-VAE Made Simple

Duration:00:08:49

September 27th, 2023 - Beyond Boundaries: Pioneering Sequences, Alignments, and Realism in AI Evolution

9/27/2023

DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
Aligning Large Multimodal Models with Factually Augmented RLHF
LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

Duration:00:06:30

September 25th, 2023 - From Pixels to Precedents: Pioneering Visions in Color, Law, Code, and Sight

9/25/2023

CoRF: Colorizing Radiance Fields using Knowledge Distillation
The Cambridge Law Corpus: A Corpus for Legal AI Research
CodePlan: Repository-level Coding using LLMs and Planning
DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion

Duration:00:10:47

September 22nd, 2023 - Revolutionary Speeds & Precision: The Future of Neural Networks and Language Models

9/22/2023

Parallelizing non-linear sequential models over the sequence length
Fast Feedforward Networks
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Boolformer: Symbolic Regression of Logic Functions with Transformers

Duration:00:13:54

September 21st, 2023 - Neural Frontiers: From FreeU's Image Mastery to Languini Kitchen's Equalized Research

9/21/2023

FreeU: Free Lunch in Diffusion U-Net
Neurons in Large Language Models: Dead, N-gram, Positional
DreamLLM: Synergistic Multimodal Comprehension and Creation
Kosmos-2.5: A Multimodal Literate Model
End-to-End Speech Recognition Contextualization with Large Language Models
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute

Duration:00:14:32

September 20th, 2023 - From Overthinking Graphs to Code Whispering and Polyglot AI: The New Frontiers of Neural Networks, Language Models, and Data Compression

9/20/2023

Graph Neural Networks Use Graphs When They Shouldn't
Large Language Models for Compiler Optimization
OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
Baichuan 2: Open Large-scale Language Models
Language Modeling Is Compression
FoleyGen: Visually-Guided Audio Generation

Duration:00:13:20

September 12th, 2023 - Frontiers in AI: From Pint-Sized Powerhouses and Pruned Datasets to Multilingual Mastery and Image Restoration

9/12/2023

Textbooks Are All You Need II: phi-1.5 technical report
DiffBIR: Towards Blind Image Restoration with Generative Diffusion Prior
When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs

Duration:00:11:11

September 11th, 2023 - Neural Frontiers: Audiobooks, Virtual Cities, Summarization, and Vision Transformers Reimagined

9/11/2023

Large-Scale Automatic Audiobook Creation
CityDreamer: Compositional Generative Model of Unbounded 3D Cities
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting
Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
High-Quality Entity Segmentation

Duration:00:09:26

September 8th, 2023 - Unlocking the Future of AI: From Master Optimizers and Budget-Friendly Giants to Truthful Decoding and Video Segmentation Breakthroughs

9/8/2023

Large Language Models as Optimizers
FLM-101B: An Open LLM and How to Train It with $100K Budget
XGen-7B Technical Report
Tracking Anything with Decoupled Video Segmentation
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Duration:00:11:28

September 7th, 2023 - SLiMe, Matcha-TTS, RoboSense, and CM3Leon: Revolutionizing Vision, Speech, and Multi-Modal Intelligence for a Smarter, Faster Future

9/7/2023

SLiMe: Segment Like Me
Matcha-TTS: A fast TTS architecture with conditional flow matching
Physically Grounded Vision-Language Models for Robotic Manipulation
Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Duration:00:08:11

September 6th, 2023 - Unlocking the Future of AI: Lean Transformers, Memory-Efficient RLHF, Voice-Altering Text Prompts, and 3D Virtual Humans

9/6/2023

One Wide Feedforward is All You Need
Efficient RLHF: Reducing the Memory Usage of PPO
PromptTTS 2: Describing and Generating Voices with Text Prompt
AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections

Duration:00:08:02