items/1077756/related \ stacker news

Evaluating LLMs Playing Text Adventures entropicthoughts.com/evaluating-llms-playing-text-adventures

141 sats \ 0 comments \ @carter 12 Aug 2025 AI

related

Graham King - Evaluating LLMs for my personal use case darkcoding.net/software/personal-ai-evals-aug-2025/

278 sats \ 1 comment \ @carter 25 Aug 2025 AI

Adventures in Extreme Vibecoding techbroiler.net/adventures-in-extreme-vibecoding/

568 sats \ 11 comments \ @StillStackinAfterAllTheseYears 16 Dec 2025 AI

The simulation of judgment in LLMs - PNAS www.pnas.org/doi/10.1073/pnas.2518443122

244 sats \ 5 comments \ @Scoresby 15 Oct 2025 AI

LLMs generate slop because they avoid surprises by design - Dan Fabulich danfabulich.medium.com/llms-tell-bad-jokes-because-they-avoid-surprises-7f111aac4f96

373 sats \ 2 comments \ @Scoresby 19 Aug 2025 AI

Writing for LLMs So They Listen - Gwern gwern.net/llm-writing

290 sats \ 8 comments \ @Scoresby 19 Jul 2025 AI

LLMs Can Get Brain Rot llm-brain-rot.github.io/

287 sats \ 0 comments \ @Scoresby 21 Oct 2025 AI

LLMs are getting better at character-level text manipulation blog.burkert.me/posts/llm_evolution_character_manipulation/

187 sats \ 0 comments \ @carter 14 Oct 2025 AI

How do you use LLMs?

901 sats \ 8 comments \ @gmd 21 Mar 2025 AI

LLM evaluation at scale with the NeurIPS Efficiency Challenge blog.mozilla.ai/exploring-llm-evaluation-at-scale-with-the-neurips-large-language-model-efficiency-challenge/

210 sats \ 0 comments \ @localhost 22 Feb 2024 tech

More Artificial than Intelligent, it is only getting worse - Mathijs Lagerberg mlagerberg.com/much-a-little-i-and-it-is-not-getting-better/

247 sats \ 4 comments \ @Scoresby 15 Jul 2025 AI

Andrej Karpathy: How I use LLMs www.youtube.com/watch?v=EWvNQjAaOHw

1278 sats \ 1 comment \ @k00b 28 Feb 2025 AI

Writing is thinking - the value of human scientific writing in the age of LLMs www.nature.com/articles/s44222-025-00323-4

555 sats \ 1 comment \ @k00b 24 Jul 2025 science

LLM can “see” because text has “spatial” qualities x.com/wesg52/status/1980680563582538099

130 sats \ 1 comment \ @carter 22 Oct 2025 AI

Elites, the curse of recursion, and the half-life of policy

5779 sats \ 11 comments \ @elvismercury 29 Mar 2024 mostly_harmless

Post-Chat UI: How LLMs are making traditional apps feel broken allenpike.com/2025/post-chat-llm-ui

223 sats \ 3 comments \ @deSign_r 25 Jun 2025 Design

Things we learned about LLMs in 2024 simonwillison.net/2024/Dec/31/llms-in-2024/

470 sats \ 0 comments \ @Rsync25 31 Dec 2024 tech

Open Source AI Spam: Further Commentary

1570 sats \ 6 comments \ @halleck 10 May 2024 devs freebie

Context Rot: How Increasing Input Tokens Impacts LLM Performance research.trychroma.com/context-rot

334 sats \ 2 comments \ @Scoresby 14 Jul 2025 AI

33-46% of workers on MTurk used LLMs in a text production task arxiv.org/abs/2306.07899

36 sats \ 0 comments \ @shadowymartian 14 Jun 2023 tech

Defining and evaluating political bias in LLMs openai.com/index/defining-and-evaluating-political-bias-in-llms/

387 sats \ 2 comments \ @0xbitcoiner 14 Oct 2025 AI

LLMs and Programming in the first days of 2024 antirez.com/news/140

2759 sats \ 20 comments \ @hn 2 Jan 2024 tech