BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
arxiv.org/abs/2510.04721
210 sats
\
1 comment
\
@jakoyoh629
25 Oct 2025
AI
related
Hallucination Stations On Some Basic Limitations of Transformer-Based LM
arxiv.org/pdf/2507.07505
213 sats
\
0 comments
\
@0xbitcoiner
23 Jan
AI
To Make Language Models Work Better, Researchers Sidestep Language
www.quantamagazine.org/to-make-language-models-work-better-researchers-sidestep-language-20250414/
210 sats
\
0 comments
\
@0xbitcoiner
15 Apr 2025
AI
Large Language Models Pass the Turing Test
arxiv.org/pdf/2503.23674
374 sats
\
11 comments
\
@south_korea_ln
15 Apr 2025
AI
The AI Revolution in Math Has Arrived
www.quantamagazine.org/the-ai-revolution-in-math-has-arrived-20260413/
355 sats
\
1 comment
\
@0xbitcoiner
13 Apr
math
AI
Mathematicians issue a major challenge to AI—show us your work
www.scientificamerican.com/article/mathematicians-launch-first-proof-a-first-of-its-kind-math-exam-for-ai/
1145 sats
\
4 comments
\
@south_korea_ln
14 Feb
AI
science
Why language models hallucinate - OpenAI
openai.com/index/why-language-models-hallucinate/
438 sats
\
4 comments
\
@Scoresby
6 Sep 2025
AI
The ORCA Benchmark Evaluates How Well AIs Deal with Everyday Math
www.omnicalculator.com/reports/omni-research-on-calculation-in-ai-benchmark
260 sats
\
0 comments
\
@0xbitcoiner
27 Feb
AI
Meet the new biologists treating LLMs like aliens
www.technologyreview.com/2026/01/12/1129782/ai-large-language-models-biology-alien-autopsy/
580 sats
\
1 comment
\
@winteryeti
14 Jan
AI
Is Chain-of-Thought Reasoning of LLMs a Mirage?
arxiv.org/abs/2508.01191
427 sats
\
9 comments
\
@optimism
7 Aug 2025
AI
Debate May Help AI Models Converge on Truth
www.quantamagazine.org/debate-may-help-ai-models-converge-on-truth-20241108/
258 sats
\
0 comments
\
@0xbitcoiner
8 Nov 2024
science
LLMs Can Get Brain Rot
llm-brain-rot.github.io/
287 sats
\
0 comments
\
@Scoresby
21 Oct 2025
AI
To Have Machines Make Math Proofs, Turn Them Into a Puzzle
www.quantamagazine.org/to-have-machines-make-math-proofs-turn-them-into-a-puzzle-20251110/
268 sats
\
0 comments
\
@0xbitcoiner
11 Nov 2025
AI
In a First, AI Models Analyze Language As Well As a Human Expert
www.quantamagazine.org/in-a-first-ai-models-analyze-language-as-well-as-a-human-expert-20251031/
274 sats
\
0 comments
\
@0xbitcoiner
31 Oct 2025
AI
Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians
arxiv.org/abs/2602.19141
978 sats
\
9 comments
\
@k00b
31 Mar
AI
science
HealthAndFitness
Vibe physics
www.math.columbia.edu/~woit/wordpress/?p=15012
2355 sats
\
4 comments
\
@south_korea_ln
1 Aug 2025
science
LLMs and the Specter of the Cognitive Black Hole
www.psychologytoday.com/us/blog/the-digital-self/202403/llms-and-the-specter-of-the-cognitive-black-hole
200 sats
\
0 comments
\
@ch0k1
22 Mar 2024
science
Financial Statement Analysis with Large Language Models
papers.ssrn.com/sol3/papers.cfm?abstract_id=4835311&fbclid=IwY2xjawIJNupleHRuA2FlbQIxMAABHWJxn71ESvZCS0FxEF_31oro1rwtk4rlgOst5Q4A6tuxDhxB9cgZBPizAg_aem_OAMNHiz7Vyv2bb2vt2yM0Q
222 sats
\
2 comments
\
@scatman
31 Jan 2025
AI
How to turn LLM Pinocchio into a real boy
12.7k sats
\
10 comments
\
@Scoresby
7 Oct 2025
AI
AI is actually bad at math, ORCA shows
www.theregister.com/2025/11/17/ai_bad_math_orca/
197 sats
\
4 comments
\
@0xbitcoiner
18 Nov 2025
AI
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection
arxiv.org/abs/2510.04849v1
433 sats
\
2 comments
\
@optimism
19 Oct 2025
AI
How large are large language models?
gist.github.com/rain-1/cf0419958250d15893d8873682492c3e
231 sats
\
0 comments
\
@carter
14 Jul 2025
AI