I just tried out Claude 3.5. It's impressively good, and it got me thinking about the lmsys voting, Elo scores and leaderboard. Something important changed in LLM benchmarking and nobody seems to have talked about the implications yet. Maybe I'm overthinking it.
I watched some of this video anon shared in saloon yesterday, and it gave me a similar "alarm":
The guest wrote a series of essays that have been circulating a lot, which predict, among other things, AI that is more intelligent than the average college graduate by 2026.