Arxiv Papers

Large language models (LLMs) can self-improve at navigating web environments using synthetic data, achieving a 31% improvement in task completion rate on the WebArena benchmark. The paper also introduces new evaluation metrics. https://arxiv.org/abs//2405.20309 YouTube: https://www.youtube.com/@ArxivPapers TikTok: https://www.tiktok.com/@arxiv_papers Apple Podcasts: https://podcasts.apple.com/us/podcast/arxiv-papers/id1692476016 Spotify: https://podcasters.spotify.com/pod/show/arxiv-papers --- Support this podcast: https://podcasters.spotify.com/pod/show/arxiv-papers/support

Duration:00:08:42

Large Language Models Can Self-Improve At Web Agent Tasks

Duration:00:13:55

[QA] Is In-Context Learning Sufficient for Instruction Following in LLMs?

In-context learning (ICL) with URIAL aligns base LLMs using few examples but underperforms compared to instruction fine-tuning, with a proposed greedy selection approach improving performance. https://arxiv.org/abs//2405.19874

Duration:00:09:33

Is In-Context Learning Sufficient for Instruction Following in LLMs?

Duration:00:05:39

[QA] Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities

Kernel Language Entropy (KLE) method improves uncertainty quantification in Large Language Models (LLMs) by capturing semantic uncertainty, enhancing trustworthiness by detecting incorrect responses. https://arxiv.org/abs//2405.20003

Duration:00:11:08

Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities

Duration:00:12:19

[QA] COSY: Evaluating Textual Explanations of Neurons

The paper introduces COSY, a framework to evaluate textual explanations for neural network concepts. It uses generative models to assess explanation quality, revealing differences in existing methods. https://arxiv.org/abs//2405.20331

Duration:00:09:48

COSY: Evaluating Textual Explanations of Neurons

Duration:00:12:25

[QA] Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

https://arxiv.org/abs//2405.19325

Duration:00:08:07

Nearest Neighbor Speculative Decoding for LLM Generation and Attribution

Duration:00:17:59

[QA] Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Reinforcement Learning from Human Feedback improves the alignment of Large Language Models with human intentions. SELM optimizes reward models for diverse responses, enhancing exploration efficiency and model performance. https://arxiv.org/abs//2405.19332

Duration:00:08:51

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment

Duration:00:15:13

[QA] Phased Consistency Model

The paper introduces the Phased Consistency Model (PCM) to improve text-conditioned image generation in the latent space, outperforming existing models across multiple generation steps. https://arxiv.org/abs//2405.18407

Duration:00:11:54

Phased Consistency Model

Duration:00:12:00

[QA] Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Understanding model scaling is crucial for designing effective training setups and architectures. This paper challenges the complexity of cosine schedules, proposing a simpler alternative with predictable scaling behavior and improved performance. https://arxiv.org/abs//2405.18392

Duration:00:08:30

Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations

Duration:00:11:06

[QA] On the Origin of Llamas: Model Tree Heritage Recovery

The paper introduces Model Tree Heritage Recovery (MoTHer Recovery) to decode model relationships using weights, reconstructing model hierarchies like Llama 2 and Stable Diffusion. https://arxiv.org/abs//2405.18432

Duration:00:07:49

On the Origin of Llamas: Model Tree Heritage Recovery

Duration:00:15:01

[QA] Transformers Can Do Arithmetic with the Right Embeddings

Adding position embeddings to digits in transformers improves performance on arithmetic tasks, enabling solving larger problems and enhancing multi-step reasoning abilities like sorting and multiplication. https://arxiv.org/abs//2405.17399

Duration:00:07:51

Transformers Can Do Arithmetic with the Right Embeddings