Yes, in code review it works great. I throw most of my work through it before I bother a human. It works on C++ code too: in some cases it finds issues better than clang does.
I'm not saying don't use it, I'm just saying don't use it for important or confidential work. If you're working on important libraries that a lot of people use, it's important that you keep the slop out. And it's still sloppy. I've used Claude 4.6 since launch and it is still badly trained - it needs lots of instruction to make it do sane things. It's pretty good at Python and JavaScript though, and greenfield... sure - if you can live with slop that is completely unmaintainable for humans.
But systems programming inside an existing codebase is hard - for both humans and bots. I'm sure they'll fix most of my complaints eventually though; they're competent at Anthropic, that is true. But the notion fanbois are spreading, that it's exponentially better instead of linearly better, is just not what I'm experiencing, and I'm using and improving it every day.
My biggest problem is that since no one wants to do, or can do, human-in-the-loop (HitL) anymore, I am building my own HitL frameworks. This eats up shittons of tokens but I'll get it done.
(PS: I currently disagree that Codex is smarter, but I've only touched it lightly)
I am, every day. I'm also manually reviewing every pull request, every day, including all the slop Claude writes, also every day unless I'm out of my weekly token allocation, lol.
There are 2 issues with Claude: