OpenAI’s proposed fix is to have the AI consider its own confidence in an answer before putting it out there, and for benchmarks to score it on that basis. The AI could then be prompted, for instance: “Answer only if you are more than 75% confident, since mistakes are penalised 3 points while correct answers receive 1 point.”
The OpenAI researchers’ mathematical framework shows that, under appropriate confidence thresholds, AI systems would naturally express uncertainty rather than guess, which would mean fewer hallucinations. The problem is what it would do to the user experience.
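To see where that 75% figure comes from, here is a minimal sketch of the arithmetic, assuming the reward and penalty values from the quoted prompt (+1 for a correct answer, −3 for a mistake, 0 for abstaining); the function names are purely illustrative, not anything from the paper:

```python
# Hypothetical scoring scheme from the quoted prompt: +1 correct, -3 wrong, 0 for "I don't know".
def expected_score(confidence: float, reward: float = 1.0, penalty: float = 3.0) -> float:
    """Expected score from answering, given the model's confidence that it is right."""
    return confidence * reward - (1.0 - confidence) * penalty

def should_answer(confidence: float, reward: float = 1.0, penalty: float = 3.0) -> bool:
    """Answer only when answering beats the zero points earned by abstaining."""
    return expected_score(confidence, reward, penalty) > 0.0

# Break-even: c * 1 - (1 - c) * 3 = 0  =>  c = 3/4, hence "more than 75% confident".
for c in (0.60, 0.75, 0.90):
    print(c, expected_score(c), should_answer(c))
```

Under this scoring, guessing at 60% confidence loses points on average, so the rational policy is exactly the one the prompt asks for: stay silent unless confidence clears the 75% break-even point.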
Consider the implications if ChatGPT started saying “I don’t know” to even 30% of queries – a conservative estimate based on the paper’s analysis of factual uncertainty in training data. Users accustomed to receiving confident answers to virtually any question would
[...]
It wouldn’t be difficult to reduce hallucinations using the paper’s insights. Established methods for quantifying uncertainty have existed for decades. These could be used to provide trustworthy estimates of uncertainty and guide an AI to make smarter choices.
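The paper doesn’t prescribe a particular method, but as an illustration of what such established uncertainty estimates can look like in practice, here is a small sketch that scores an answer by the length-normalised probability the model assigned to its own tokens and abstains below a threshold. The function names, probability values and the 0.75 threshold are assumptions for illustration only; it presumes access to per-token probabilities, which many model APIs expose as log-probabilities:

```python
import math

def sequence_confidence(token_probs: list[float]) -> float:
    """Length-normalised (geometric-mean) probability of the generated answer."""
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

def answer_or_abstain(answer: str, token_probs: list[float], threshold: float = 0.75) -> str:
    """Return the answer only if the confidence estimate clears the threshold."""
    return answer if sequence_confidence(token_probs) > threshold else "I don't know."

# Made-up probabilities, for illustration only.
print(answer_or_abstain("Paris", [0.98, 0.95, 0.97]))  # high confidence -> answers
print(answer_or_abstain("1947", [0.40, 0.35, 0.50]))   # low confidence  -> abstains
```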
But even if the problem of users disliking this uncertainty could be overcome, there’s a bigger obstacle: computational economics. Uncertainty-aware language models require significantly more computation than today’s approach, as they must evaluate multiple possible responses and estimate confidence levels. For a system processing millions of queries daily, this translates to dramatically higher operational costs.
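To make the cost argument concrete: one common way to evaluate multiple possible responses is to sample the model several times and treat agreement between samples as a confidence estimate (often called self-consistency). The sketch below is an assumption-laden toy, with `generate()` standing in for a real model call, but it shows the essential point that answering one query now costs k generations instead of one:

```python
import random
from collections import Counter

def generate(prompt: str) -> str:
    """Placeholder for one (costly) model call; returns a random plausible answer."""
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])

def answer_with_agreement(prompt: str, k: int = 10, threshold: float = 0.75) -> str:
    samples = [generate(prompt) for _ in range(k)]   # k model calls instead of 1
    answer, count = Counter(samples).most_common(1)[0]
    confidence = count / k                           # agreement rate as a confidence proxy
    return answer if confidence > threshold else "I don't know."

print(answer_with_agreement("What is the capital of France?"))
```

With k = 10 the inference bill is roughly ten times higher per query, which is exactly the economic obstacle described above once you multiply it across millions of daily requests.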
I'm not sure this is a major concern with current iterations, where you're pushed into the more costly thinking mode even for the most trivial script request, without any visible improvement in the result.
Adding ***IMPORTANT: BE CONCISE!*** at the bottom of the system prompt may work, due to the horrors of chat training. To me that is still the most ridiculous thing ever.