
50 sats \ 4 replies \ @optimism 16h
But in reality, longer contexts do not generate better responses. Overloading your context can cause your agents and applications to fail in surprising ways.
I've seen this time after time: earlier text in a prompt loses significance, especially with reasoning on, where the reasoning picks up on a few select things and then hyper-focuses sequentially, giving the highest weight to the last keywords it touched on.
Autocorrect running into writer's block? Perhaps this is what we can improve with memory (maybe not exactly RAG (#1026495), but something lean? I still like the memory graphing... but can't get it to perform as I'd like)
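Roughly the shape I have in mind for "something lean", as a toy sketch only (hypothetical code, not any existing memory/RAG library): a keyword-linked note store that surfaces just the top few relevant notes per turn, so the prompt stays short instead of dragging the whole history along.

```python
# Toy sketch: keyword-linked note store, returns only the top-k relevant notes.
# Hypothetical, not an existing library.
from collections import defaultdict
import time


class LeanMemory:
    def __init__(self):
        self.notes = []                # (timestamp, text, keywords)
        self.index = defaultdict(set)  # keyword -> note ids

    @staticmethod
    def _keywords(text):
        return {w.lower().strip(".,!?") for w in text.split() if len(w) > 3}

    def add(self, text):
        note_id = len(self.notes)
        kws = self._keywords(text)
        self.notes.append((time.time(), text, kws))
        for kw in kws:
            self.index[kw].add(note_id)

    def recall(self, query, k=3):
        scores = defaultdict(float)
        for kw in self._keywords(query):
            for note_id in self.index.get(kw, ()):
                scores[note_id] += 1.0
        # recency breaks ties so stale notes don't win forever
        ranked = sorted(scores, key=lambda i: (scores[i], self.notes[i][0]), reverse=True)
        return [self.notes[i][1] for i in ranked[:k]]


mem = LeanMemory()
mem.add("User prefers short answers with code examples.")
mem.add("Project uses sqlite for persistence, not postgres.")
mem.add("Deadline for the billing feature is Friday.")
print(mem.recall("which database does the project use?"))
# -> ['Project uses sqlite for persistence, not postgres.']
```

Graph edges between notes could sit on top of this, but even the flat version keeps the context small, which is the point.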
210 sats \ 1 reply \ @freetx 16h
> earlier text in a prompt loses significance, especially with reasoning on, where the reasoning picks up on a few select things and then hyper-focuses sequentially
Pretty much the "teapot test" that autocorrect fails.
The teapot test gave a group of young kids 3 items: a ruler, a teapot, and an office desk, and asked them to draw a circle using only those items. The kids pretty much instantly realized that the bottom of the teapot was a circle, so they simply traced it (the office desk was intentionally chosen to be useless for the task).
Autocorrect, however, gets "fixated" on the ruler, because the corpus of data linking "rulers => drawing" is multiple orders of magnitude larger than the other connections... so it spends an inordinate amount of time trying to calculate how to draw a circle with a ruler. It eventually does succeed, but obviously it's doing it "the stupid way".
This highlights a bigger problem with AI going forward: as more and more AI-generated data winds up online, more and more of it will wind up in training data... it's a pretty big problem that sorta threatens the entire premise. I suppose careful curation of training data will be the only solution.
108 sats \ 1 reply \ @carter OP 16h
I saw a paper that said the models are effectively cheating by learning the exact test questions: if you add extraneous information to a question the model previously answered correctly, it gets confused by the extra detail and answers wrong.
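The check is easy to reproduce in spirit, something like this rough sketch (`ask_model` is just a placeholder for whatever chat API you use, not a real call):

```python
# Rough sketch of a distractor probe: take a question the model answers
# correctly, splice in an irrelevant detail, and see whether the answer flips.
# `ask_model` is a placeholder, not a real library function.
def ask_model(prompt: str) -> str:
    raise NotImplementedError("wire this up to your model of choice")


def distractor_probe(question: str, expected: str, distractor: str) -> dict:
    clean = ask_model(question)
    noisy = ask_model(f"{question} {distractor}")
    return {
        "clean_correct": expected in clean,
        "noisy_correct": expected in noisy,
        "confused_by_noise": expected in clean and expected not in noisy,
    }


# e.g. distractor_probe(
#     "Alice has 12 apples and gives 5 to Bob. How many does she have left?",
#     "7",
#     "Five of the apples are slightly smaller than the rest.",
# )
```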
10 sats \ 0 replies \ @optimism 16h
That's what I was thinking the other day when looking at the performance against benchmarks. Are model trainers pulling a VW on the bench?