
What tasks are they being scored on? I've found ChatGPT to be a bit more useful for general-purpose queries, while Claude is better at technical stuff.

This is the overall "text chatbot" ranking. There are categories; see https://arena.ai/leaderboard/

For me, the only thing GPT currently does well is deep research: it's as good as Grok and better than Claude and Gemini (though, funnily, the arena disagrees with me).

But since Grok responses are much less annoying than GPT's emoji flood, I use the former.
