
OpenAI released a new model called "o3" as part of their shipmas product release cycle
87.5% on ARC-AGI (humans are roughly 85%), while the last-generation o1 was stuck in the 30%-ish area
Particularly impressive to me
it solves 1/4 of research-level math questions
5 sats \ 0 replies \ @gmd 21 Dec
Scary that just 1 month ago, after evaluating o1, the great Terence Tao anticipated that the benchmark would "resist AIs for several years at least," noting that the problems require substantial domain expertise and that we currently lack sufficient relevant training data.
AGI is coming.