Scary that just 1 month ago, after evaluating o1, the great Terence Tao anticipated that the benchmark would "resist AIs for several years at least," noting that the problems require substantial domain expertise and that we currently lack sufficient relevant training data.
Particularly impressive to me
it solves 1/4 of research-level math questions