
OpenAI published a paper on hallucinations in LLMs, theorizing that current training practices reward guessing over simply saying "I don't know."
Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero. In the same way, when models are graded only on accuracy, the percentage of questions they get exactly right, they are encouraged to guess rather than say “I don’t know.”
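A toy expected-value calculation makes the incentive concrete (the numbers here are illustrative, not from the paper): under accuracy-only grading, any nonzero chance of guessing right beats abstaining.

```python
# Toy illustration (not from the paper): expected score under accuracy-only grading.
# Suppose the model is unsure and thinks each of 4 options is equally likely.
p_correct_if_guessing = 1 / 4

# Accuracy-only grading: 1 point for a right answer, 0 otherwise.
expected_score_guess = p_correct_if_guessing * 1 + (1 - p_correct_if_guessing) * 0
expected_score_abstain = 0  # "I don't know" is scored the same as a wrong answer

print(expected_score_guess)    # 0.25
print(expected_score_abstain)  # 0.0 -> guessing always wins, however unsure the model is
```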
They include a really interesting chart showing that guessing yields slightly higher accuracy but dramatically more hallucinations.
There is a straightforward fix. Penalize confident errors more than you penalize uncertainty, and give partial credit for appropriate expressions of uncertainty. This idea is not new. Some standardized tests have long used versions of negative marking for wrong answers or partial credit for leaving questions blank to discourage blind guessing. Several research groups have also explored evaluations that account for uncertainty and calibration.
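Here's a minimal sketch of what such a scoring rule could look like; the penalty and partial-credit values are my own assumptions, not the paper's. The point is that once wrong answers cost more than abstaining, guessing only pays off when the model is actually confident.

```python
# Illustrative scoring rule (values are assumptions, not from the paper):
# right answer = 1, "I don't know" = 0.25 partial credit, confident wrong answer = -1.
def expected_score(p_correct: float, answer: bool,
                   reward=1.0, idk_credit=0.25, wrong_penalty=-1.0) -> float:
    """Expected score for answering vs. abstaining, given the model's own confidence."""
    if not answer:
        return idk_credit
    return p_correct * reward + (1 - p_correct) * wrong_penalty

# With these numbers, answering only beats "I don't know" above ~62% confidence:
for p in (0.25, 0.5, 0.7, 0.9):
    print(p, expected_score(p, answer=True), "vs IDK:", expected_score(p, answer=False))
```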
They also go into why next-word generation is prone to hallucinations:
Spelling and parentheses follow consistent patterns, so errors there disappear with scale. But arbitrary low-frequency facts, like a pet’s birthday, cannot be predicted from patterns alone and hence lead to hallucinations. Our analysis explains which kinds of hallucinations should arise from next-word prediction. Ideally, further stages after pretraining should remove them, but this is not fully successful for reasons described in the previous section.
The paper isn't long. Interesting read.
202 sats \ 1 reply \ @optimism 17h
"Why do models hallucinate?"
Because we made it so
"Thank you for your honesty"
202 sats \ 0 replies \ @kepford 11h
This is why my attitude about AI changed as I began to understand it better. The issues shouldn't surprise people. We have been set up to expect something different from what the technology actually is.
0 sats \ 0 replies \ @brave 8h
The idea of penalizing confident errors feels like a game changer for reducing hallucinations. Uncertainty can also be a win: it covers the little errors made along the way to the perfect output.