
OpenAI released a new model called "o3" as part of their shipmas product release cycle

87.5% on ARC-AGI (humans are roughly 85%), while the last-generation o1 was stuck in the 30%-ish area


Particularly impressive to me

it solves 1/4 of research-level math questions


Scary that just 1 month ago, after evaluating o1, the great Terence Tao anticipated that the benchmark would "resist AIs for several years at least," noting that the problems require substantial domain expertise and that we currently lack sufficient relevant training data.

(https://arxiv.org/html/2411.04872v1)


AGI is coming.
