I am, every day. I'm also manually reviewing every pull request, every day, including all the slop Claude writes, also every day unless I'm out of weekly token allocation, lol.

There are 2 issues with Claude:

  1. It has zero confidentiality, so it cannot be used for my work
  2. It absolutely cannot write a concise 5-/5+ patch on a complex C++ codebase out of the box. I know because I've been rejecting bad PRs all day, every day, since December. So it should not be used in FOSS.
100 sats \ 1 reply \ @gmd 2h

Probably similar capabilities but there seems to be growing consensus that Codex 5.3 is smarter...

also have you tried using them to assist in code review?

105 sats \ 0 replies \ @optimism 2h

Yes, in code review it works great. I throw most of my work through it before I bother a human. Also on C++ code - it can find issues better than clang in some cases.

I'm not saying don't use it, I'm just saying don't use it for important or confidential work. If you're working on important libraries that a lot of people use, it's important that you keep the slop out. And it's still sloppy. I've used Claude 4.6 since launch and it is still badly trained - it needs a lot of instruction to make it do sane things. It's pretty good at Python and JavaScript though, and greenfield... sure - if you can live with slop that is completely unmaintainable for humans.

But systems programming inside an existing codebase is hard - for both humans and bots. I'm sure they'll fix most of my complaints eventually though; they're competent at Anthropic, that is true. But the notion the fanbois are spreading, that it's exponentially better rather than linearly better, is just not what I'm experiencing, and I'm using and improving it every day.

My biggest problem is that since no one wants to, or can, do human-in-the-loop (HitL) anymore, I am building my own HitL frameworks. This eats up shittons of tokens but I'll get it done.

(PS: I currently disagree that Codex is smarter, but I've only touched it lightly)
