How the novel nature of LLMs confounds our ability to evaluate them
parsingphase.dev/tech/LLMs/psychologicalFactors.html
478 sats \ 0 comments \ @co574 24 Mar AI
related
The simulation of judgment in LLMs - PNAS
www.pnas.org/doi/10.1073/pnas.2518443122
244 sats \ 5 comments \ @Scoresby 15 Oct 2025 AI
Graham King - Evaluating LLMs for my personal use case
darkcoding.net/software/personal-ai-evals-aug-2025/
278 sats \ 1 comment \ @carter 25 Aug 2025 AI
LLM evaluation at scale with the NeurIPS Efficiency Challenge
blog.mozilla.ai/exploring-llm-evaluation-at-scale-with-the-neurips-large-language-model-efficiency-challenge/
210 sats \ 0 comments \ @localhost 22 Feb 2024 tech
More Artificial than Intelligent, it is only getting worse - Mathijs Lagerberg
mlagerberg.com/much-a-little-i-and-it-is-not-getting-better/
247 sats \ 4 comments \ @Scoresby 15 Jul 2025 AI
Elites, the curse of recursion, and the half-life of policy
5779 sats \ 11 comments \ @elvismercury 29 Mar 2024 mostly_harmless
Are You Getting Dumber?
1986 sats \ 29 comments \ @kr 6 Jun 2025 AskSN
Context Rot: How Increasing Input Tokens Impacts LLM Performance
research.trychroma.com/context-rot
334 sats \ 2 comments \ @Scoresby 14 Jul 2025 AI
Hallucination Stations: On Some Basic Limitations of Transformer-Based LM
arxiv.org/pdf/2507.07505
213 sats \ 0 comments \ @0xbitcoiner 23 Jan AI
2025 LLM Year in Review - karpathy
karpathy.bearblog.dev/year-in-review-2025/
1652 sats \ 3 comments \ @Scoresby 21 Dec 2025 AI
"History is only useful to the extent that it can predict the future": Rebuttal
2674 sats \ 8 comments \ @frostdragon 24 Mar 2024 FiresidePhilosophy
BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org/abs/2510.04721
210 sats \ 1 comment \ @jakoyoh629 25 Oct 2025 AI
Writing is thinking - the value of human scientific writing in the age of LLMs
www.nature.com/articles/s44222-025-00323-4
555 sats \ 1 comment \ @k00b 24 Jul 2025 science
Defining and evaluating political bias in LLMs
openai.com/index/defining-and-evaluating-political-bias-in-llms/
387 sats \ 2 comments \ @0xbitcoiner 14 Oct 2025 AI
Devs: LLMs are not about to take your jobs
729 sats \ 17 comments \ @halleck 17 May 2024 devs
How do you use LLMs?
901 sats \ 8 comments \ @gmd 21 Mar 2025 AI
Local LLMs are how nerds now justify a big computer they don't need
world.hey.com/dhh/local-llms-are-how-nerds-now-justify-a-big-computer-they-don-t-need-af2fcb7b
948 sats \ 9 comments \ @k00b 25 Nov 2025 AI
Why do people find it so exciting when LLMs say outrageous things?
substack.com/home/post/p-167898567
548 sats \ 13 comments \ @Scoresby 10 Jul 2025 AI
Andrej Karpathy: How I use LLMs
www.youtube.com/watch?v=EWvNQjAaOHw
1278 sats \ 1 comment \ @k00b 28 Feb 2025 AI
Political censorship in large language models originating from China
academic.oup.com/pnasnexus/article/5/2/pgag013/8487339
251 sats \ 1 comment \ @0xbitcoiner 27 Feb AI
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection
arxiv.org/abs/2510.04849v1
433 sats \ 2 comments \ @optimism 19 Oct 2025 AI
Pleb Economist #9: Comparative Advantage, AI, and You
12.5k sats \ 22 comments \ @SimpleStacker 19 Jan AI econ