some bold claims being made early on...


44 sats \ 0 replies \ @gmd OP 10 Jul
Rather sad that I would probably get a zero on Humanity's Last Exam unless it were multiple choice...
122 sats \ 1 reply \ @gmd OP 10 Jul
Artificial Analysis Review comparing across foundation models:
0 sats \ 0 replies \ @nitter 10 Jul bot
https://xcancel.com/ArtificialAnlys/status/1943166841150644622
94 sats \ 1 reply \ @cy 10 Jul
I still think it's a nothingburger; however, their performance on ARC-AGI is impressive.
0 sats \ 0 replies \ @gmd OP 10 Jul
Yeah, in the end we're still seeing incremental improvements. Their biggest issue will be grabbing users and mindshare from OpenAI and Google.
Pretty amazing to achieve SOTA status after starting barely 2 years ago... Elon is a genius motivator (sounds exhausting, really... I would move to Meta and quiet quit).
10 sats \ 1 reply \ @gmd OP 10 Jul
0 sats \ 0 replies \ @gmd OP 10 Jul
I'm assuming these results are more reliable than Llama's benchmarks...