529 Episodes

  1. Data Mixture Optimization: A Multi-fidelity Multi-scale Bayesian Framework

    Published: 31/03/2025
  2. Why MCP won

    Published: 31/03/2025
  3. SWEET-RL: Training LLM Agents for Collaborative Reasoning

    Published: 31/03/2025
  4. TheoryCoder: Bilevel Planning with Synthesized World Models

    Published: 30/03/2025
  5. Driving Forces in AI: Scaling to 2025 and Beyond (Jason Wei, OpenAI)

    Published: 29/03/2025
  6. Expert Demonstrations for Sequential Decision Making under Heterogeneity

    Published: 28/03/2025
  7. TextGrad: Backpropagating Language Model Feedback for Generative AI Optimization

    Published: 27/03/2025
  8. MemReasoner: Generalizing Language Models on Reasoning-in-a-Haystack Tasks

    Published: 27/03/2025
  9. RAFT: In-Domain Retrieval-Augmented Fine-Tuning for Language Models

    Published: 27/03/2025
  10. Inductive Biases for Exchangeable Sequence Modeling

    Published: 26/03/2025
  11. InverseRLignment: LLM Alignment via Inverse Reinforcement Learning

    Published: 26/03/2025
  12. Prompt-OIRL: Offline Inverse RL for Query-Dependent Prompting

    Published: 26/03/2025
  13. Alignment from Demonstrations for Large Language Models

    Published: 25/03/2025
  14. Q♯: Distributional RL for Optimal LLM Post-Training

    Published: 18/03/2025
  15. Scaling Test-Time Compute Without Verification or RL is Suboptimal

    Published: 14/03/2025
  16. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 14/03/2025
  17. Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

    Published: 14/03/2025
  18. Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

    Published: 14/03/2025
  19. Revisiting Superficial Alignment Hypothesis

    Published: 14/03/2025
  20. Diagnostic Uncertainty: Teaching Language Models to Describe Open-Ended Uncertainty

    Published: 14/03/2025

Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
