Scary that just 1 month ago, after evaluating o1, the great Terence Tao anticipated that the benchmark would "resist AIs for several years at least," noting that the problems require substantial domain expertise and that we currently lack sufficient relevant training data.
(https://arxiv.org/html/2411.04872v1)