TalkRL: The Reinforcement Learning Podcast

Technology Podcasts

TalkRL podcast is All Reinforcement Learning, All the Time. In-depth interviews with brilliant people at the forefront of RL research and practice. Guests from places like MILA, OpenAI, MIT, DeepMind, Berkeley, Amii, Oxford, Google Research, Brown, Waymo, Caltech, and Vector Institute. Hosted by Robin Ranjit Singh Chauhan.

Location:

Canada


Language:

English

Contact:

6048856418


Episodes

NeurIPS 2024 - Posters and Hallways 3

3/9/2025
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control; Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL; Foundations of Multivariate Distributional Reinforcement Learning; Contextual Bilevel Reinforcement Learning for Incentive Alignment; QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers

Duration:00:10:01

NeurIPS 2024 - Posters and Hallways 2

3/4/2025
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning; DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning; Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach; Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn; JaxMARL: Multi-Agent RL Environments and Algorithms in JAX

Duration:00:08:48

NeurIPS 2024 - Posters and Hallways 1

3/2/2025
Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning; No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO; Time-Constrained Robust MDPs; SustainDC: Benchmarking for Sustainable Data Center Control; BenchMARL: Benchmarking Multi-Agent Reinforcement Learning; Beyond Optimism: Exploration With Partially Observable Rewards

Duration:00:09:32

Abhishek Naik

2/9/2025
Abhishek Naik was a student at the University of Alberta and the Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoctoral fellow at the National Research Council of Canada, where he does AI research on space applications. Featured References: Reinforcement Learning for Continuing Problems Using Average Reward (Abhishek Naik, dissertation, 2024); Reward Centering (Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton, 2024); Learning and Planning in Average-Reward Markov Decision Processes (Yi Wan, Abhishek Naik, Richard S. Sutton, 2020); Discounted Reinforcement Learning Is Not an Optimization Problem (Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton, 2019). Additional References: Explaining dopamine through prediction errors and beyond

Duration:01:21:40

NeurIPS 2024 RL meetup Hot takes: What sucks about RL?

12/23/2024
What do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out! Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of NeurIPS 2024. Special thanks to "David Beckham" for the inspiration :)

Duration:00:17:45

RLC 2024 - Posters and Hallways 5

9/20/2024
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: David Radke, Abhishek Naik, Daphne Cornelisse, Shray Bansal, Claas Voelcker, Brent Venable

Duration:00:13:17

RLC 2024 - Posters and Hallways 4

9/18/2024
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: David Abel, Kevin Wang, Ashwin Kumar, Prabhat Nagarajan

Duration:00:04:52

RLC 2024 - Posters and Hallways 3

9/18/2024
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: Kris De Asis, Anna Hakhverdyan, Dilip Arumugam, Micah Carroll

Duration:00:06:43

RLC 2024 - Posters and Hallways 2

9/15/2024
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning; Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents; Understanding biological active sensing behaviors by interpreting learned artificial agent policies; OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments; Resolving Partial Observability in Decision Processes via the Lambda Discrepancy; Agent-Centric Human Demonstrations Train World Models

Duration:00:15:52

RLC 2024 - Posters and Hallways 1

9/10/2024
Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: Ann Huang, Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers; Jannis Blüml, HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning; Benjamin Fuhrer, Gradient Boosting Reinforcement Learning; Paul Festor, Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics

Duration:00:05:46

Finale Doshi-Velez on RL for Healthcare @ RLC 2024

9/2/2024
Finale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024. Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh. Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in-depth interview. References: Finale Doshi-Velez

Duration:00:07:35

David Silver 2 - Discussion after Keynote @ RLC 2024

8/28/2024
Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RLC 2024. Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion. References: AlphaProof; Discovering Reinforcement Learning Algorithms; Reinforcement Learning Conference; David Silver

Duration:00:16:17

David Silver @ RLC 2024

8/26/2024
David Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024. References: Discovering Reinforcement Learning Algorithms; Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm; AlphaProof; AlphaFold; Reinforcement Learning Conference; David Silver

Duration:00:11:27

Vincent Moens on TorchRL

4/8/2024
Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in PyTorch. Featured References: TorchRL: A data-driven decision-making library for PyTorch (Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens). Additional References: TorchRL on GitHub; TensorDict Documentation

Duration:00:40:14

Arash Ahmadian on Rethinking RLHF

3/25/2024
Arash Ahmadian is a Researcher at Cohere and Cohere For AI focussed on Preference Training of large language models. He’s also a researcher at the Vector Institute of AI. Featured Reference: Back to Basics: Revisiting REINFORCE Style Optimization for Learning from Human Feedback in LLMs (Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker). Additional References: Self-Rewarding Language Models; Reinforcement Learning: An Introduction; Learning from Delayed Rewards; Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Duration:00:33:30

Glen Berseth on RL Conference

3/11/2024
Glen Berseth is an assistant professor at the Université de Montréal, a core academic member of the Mila - Quebec AI Institute, a Canada CIFAR AI Chair, a member of l'Institut Courtois, and co-director of the Robotics and Embodied AI Lab (REAL). Featured Links: Reinforcement Learning Conference; Closing the Gap between TD Learning and Supervised Learning--A Generalisation Point of View (Raj Ghugare, Matthieu Geist, Glen Berseth, Benjamin Eysenbach)

Duration:00:21:38

Ian Osband

3/7/2024
Ian Osband is a research scientist at OpenAI (ex DeepMind, Stanford) working on decision making under uncertainty. We spoke about: information theory and RL; exploration, epistemic uncertainty and joint predictions; Epistemic Neural Networks and scaling to LLMs. Featured References: Reinforcement Learning, Bit by Bit (Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi, Ian Osband, Zheng Wen); From Predictions to Decisions: The Importance of Joint Predictive Distributions (Zheng Wen, Ian Osband, Chao Qin, Xiuyuan Lu, Morteza Ibrahimi, Vikranth Dwaracherla, Mohammad Asghari, Benjamin Van Roy); Epistemic Neural Networks (Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy); Approximate Thompson Sampling via Epistemic Neural Networks (Ian Osband, Zheng Wen, Seyed Mohammad Asghari, Vikranth Dwaracherla, Morteza Ibrahimi, Xiuyuan Lu, Benjamin Van Roy). Additional References: Thesis defence; Homepage; Epistemic Neural Networks; Behaviour Suite for Reinforcement Learning; Efficient Exploration for LLMs

Duration:01:08:26

Sharath Chandra Raparthy

2/11/2024
Sharath Chandra Raparthy on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more! Sharath Chandra Raparthy is an AI Resident at FAIR at Meta, and did his Master's at Mila. Featured Reference: Generalization to New Sequential Decision Making Tasks with In-Context Learning (Sharath Chandra Raparthy, Eric Hambro, Robert Kirk, Mikael Henaff, Roberta Raileanu). Additional References: Sharath Chandra Raparthy; Human-Timescale Adaptation in an Open-Ended Task Space; Data Distributional Properties Drive Emergent In-Context Learning in Transformers; Decision Transformer: Reinforcement Learning via Sequence Modeling

Duration:00:40:41

Pierluca D'Oro and Martin Klissarov

11/13/2023
Pierluca D'Oro and Martin Klissarov on Motif and RLAIF, Noisy Neighborhoods and Return Landscapes, and more! Pierluca D'Oro is a PhD student at Mila and a visiting researcher at Meta. Martin Klissarov is a PhD student at Mila and McGill and a research scientist intern at Meta. Featured References: Motif: Intrinsic Motivation from Artificial Intelligence Feedback (Martin Klissarov*, Pierluca D'Oro*, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff); Policy Optimization in a Noisy Neighborhood: On Return Landscapes in Continuous Control (Nate Rahn*, Pierluca D'Oro*, Harley Wiltzer, Pierre-Luc Bacon, Marc G. Bellemare); To keep doing RL research, stop calling yourself an RL researcher (Pierluca D'Oro)

Duration:00:57:24

Martin Riedmiller

8/22/2023
Martin Riedmiller of Google DeepMind on controlling nuclear fusion plasma in a tokamak with RL, the original Deep Q-Network, Neural Fitted Q-Iteration, Collect and Infer, AGI for control systems, and tons more! Martin Riedmiller is a research scientist and team lead at DeepMind. Featured References: Magnetic control of tokamak plasmas through deep reinforcement learning (Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de las Casas, Craig Donner, Leslie Fritz, Cristian Galperti, Andrea Huber, James Keeling, Maria Tsimpoukelli, Jackie Kay, Antoine Merle, Jean-Marc Moret, Seb Noury, Federico Pesamosca, David Pfau, Olivier Sauter, Cristian Sommariva, Stefano Coda, Basil Duval, Ambrogio Fasoli, Pushmeet Kohli, Koray Kavukcuoglu, Demis Hassabis & Martin Riedmiller); Human-level control through deep reinforcement learning (Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis); Neural fitted Q iteration–first experiences with a data efficient neural reinforcement learning method (Martin Riedmiller)

Duration:01:13:56