
"To his credit, Shumer is actually correct that something has changed recently. You really can let things rip more in the most recent systems (AI). But he misses something subtle but essential, captured well by a friend who very skilled in the art of coding pointed out to me. After reading a draft of this essay, he texted me

“[Shumer’s blog] is representing my experience as well… Something happened a couple of months ago where you can truly give it a description and let it go and - sometimes! - will come out with the right answer. … Sometimes… [But] Ultimately, I think this makes it more dangerous …. Generally, the closer these systems are to appearing right, the more dangerous they become because people become increasingly at ease just trusting them [when they shouldn’t]”."

100 sats \ 7 replies \ @gmd 5h

I really miss the days when GPT5 was released to an underwhelming reception and people were saying we were hitting a wall... those were comforting times.

reply
63 sats \ 6 replies \ @optimism 4h

The walls are still there, just people stopped reading the slop 😂

reply
100 sats \ 5 replies \ @gmd 3h

hmmm dunno, seems like we're just accelerating…

reply
63 sats \ 4 replies \ @optimism 3h

Maybe adoption is accelerating, but honestly in terms of breakthroughs, I'm with Mr. LeCun - there are too few. (Even though I think that Anthropic is hot shit rn)

reply
100 sats \ 3 replies \ @gmd 1h

Adoption and capability... who is coding by hand anymore?

reply
205 sats \ 2 replies \ @optimism 1h

I am, every day. I'm also manually reviewing every pull request, every day, including all the slop Claude writes, also every day unless I'm out of weekly token allocation, lol.

There are 2 issues with Claude:

  1. It has zero confidentiality, so it cannot be used for my work
  2. It absolutely cannot write a concise 5-/5+ patch on a complex c++ codebase out of the box. I know because I'm rejecting bad PRs from retards all day every day since December. So it should not be used in FOSS.
reply
100 sats \ 1 reply \ @gmd 1h

Probably similar capabilities but there seems to be growing consensus that Codex 5.3 is smarter...

also have you tried using them to assist in code review?

reply
105 sats \ 0 replies \ @optimism 58m

Yes, in code review it works great. I throw most of my work through it before I bother a human. Also on C++ code - it can find issues better than clang in some cases.

I'm not saying don't use it, I'm just saying don't use it for important or confidential work. If you're working on important libraries that a lot of people use, it's important that you keep the slop out. And it's still sloppy. I've used Claude 4.6 since launch and it is still badly trained - it needs lots of instruction to make it do sane things. It's pretty good at Python and JavaScript though, and greenfield... sure - if you can live with slop that is completely unmaintainable for humans.

But systems programming inside an existing codebase is hard - for both humans and bots. I'm sure they'll fix most of my complaints eventually though, they're competent at Anthropic, that is true. But the whole notion that fanbois are spreading that it's exponentially better instead of linearly better is just not what I'm experiencing, and I'm using and improving it every day.

My biggest problem is that since no one wants to / can do HitL (human-in-the-loop) anymore, I am building my own HitL frameworks. This eats up shittons of tokens but I'll get it done.

(PS: I currently disagree that Codex is smarter but I've only touched it lightly)

That Shumer post must really be making the rounds. My wife sent it to me saying that a friend had sent it to her. They're not normally the type to talk about this stuff.

Personally, I thought it sounded like yet another "this time is different" post, the kind we've been hearing repeatedly forever.

reply
0 sats \ 1 reply \ @Scoresby 1h

My mom sent it to me.

reply

Being hysterical and hyperbolic while also reflecting empathy (which is how I'd characterize his post) seems to really speak to people.

reply
0 sats \ 0 replies \ @gmd 1h

Forever since when? This time is different...

reply
21 sats \ 0 replies \ @optimism 5h

EXACTLY!

reply