items/1216595/related \ stacker news

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning arxiv.org/abs/2509.07980

248 sats \ 0 comments \ @optimism 10 Sep 2025 AI

related

Researchers discover impressive learning capabilities in long-context LLMs venturebeat.com/ai/deepmind-researchers-discover-impressive-learning-capabilities-in-long-context-llms/

397 sats \ 0 comments \ @ch0k1 25 Apr 2024 tech

AgentFly: Fine-tuning LLM Agents without Fine-tuning LLMs arxiv.org/abs/2508.16153

182 sats \ 0 comments \ @optimism 25 Aug 2025 AI

Agentic Reinforced Policy Optimization arxiv.org/abs/2507.19849

171 sats \ 0 comments \ @optimism 29 Jul 2025 AI

DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning www.nature.com/articles/s41586-025-09422-z

151 sats \ 0 comments \ @carter 19 Sep 2025 AI

Inverse IFEval: Unlearn Training Conventions to Follow Real Instructions? arxiv.org/abs/2509.04292

120 sats \ 0 comments \ @optimism 5 Sep 2025 AI

RLNVR: Reinforcement Learning from Non-Verified Real-World Rewards arxiv.org/abs/2508.12165

130 sats \ 0 comments \ @Tony 25 Aug 2025 AI

LLM Daydreaming gwern.net/ai-daydreaming

349 sats \ 2 comments \ @k00b 16 Jul 2025 AI

Deep Dive into LLMs like ChatGPT www.youtube.com/watch?v=7xTGNNLPyMI

630 sats \ 1 comment \ @k00b 8 Feb 2025 AI

LLMs Can Get Brain Rot llm-brain-rot.github.io/

287 sats \ 0 comments \ @Scoresby 21 Oct 2025 AI

In a First, AI Models Analyze Language As Well As a Human Expert www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/

274 sats \ 0 comments \ @0xbitcoiner 31 Oct 2025 AI

Diffusion Language Models Know the Answer Before Decoding arxiv.org/abs/2508.19982

274 sats \ 0 comments \ @optimism 28 Aug 2025 AI

Is Chain-of-Thought Reasoning of LLMs a Mirage? arxiv.org/abs/2508.01191

427 sats \ 9 comments \ @optimism 7 Aug 2025 AI

Train your own R1 reasoning model locally unsloth.ai/blog/r1-reasoning

214 sats \ 1 comment \ @aljaz 7 Feb 2025 AI

Experimental evidence of the effects of LLMs vs web search on depth of learning academic.oup.com/pnasnexus/article/4/10/pgaf316/8303888

176 sats \ 1 comment \ @0xbitcoiner 20 Jan AI

Olympiad-level formal mathematical reasoning with reinforcement learning www.nature.com/articles/s41586-025-09833-y

205 sats \ 2 comments \ @0xbitcoiner 20 Nov 2025 AI

Why Do Researchers Care About Small Language Models? www.quantamagazine.org/why-do-researchers-care-about-small-language-models-20250310/

40 sats \ 3 comments \ @0xbitcoiner 10 Mar 2025 AI

DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning arxiv.org/abs/2508.05405

212 sats \ 0 comments \ @optimism 10 Aug 2025 AI

What Are Large Reasoning Models (LRMs)? Smarter AI Beyond LLMs youtu.be/enLbj0igyx4

188 sats \ 0 comments \ @jakoyoh629 8 Nov 2025 AI

Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples arxiv.org/abs/2510.07192

130 sats \ 0 comments \ @0xbitcoiner 9 Oct 2025 AI

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments arxiv.org/abs/2509.14233

324 sats \ 1 comment \ @optimism 21 Sep 2025 AI

Large Language Models Pass the Turing Test arxiv.org/pdf/2503.23674

374 sats \ 11 comments \ @south_korea_ln 15 Apr 2025 AI