You’re still building the infrastructure: plan files (that’s fancy talk for ‘todo lists’), skills, and rules. The machine works very poorly without being given a framework.
So pure vibe coding is a myth. But they’re still trying to do it, and this leads to some very ridiculous outcomes. For example, a human actually looked and saw a lot of duplication between the agents and the tools. Now, you might ask: why didn’t any of the developers just go look for themselves? Again, it’s vibe coding. Looking under the hood is cheating.
In this particular case, a human could have told the machine: “There’s a lot of things that are both agents and tools. Let’s go through and make a list of all of them, look at some examples, and I’ll tell you which should be agents and which should be tools. We’ll have a discussion and figure out the general guidelines. Then we’ll audit the entire set, figure out which category each one belongs in, port the ones that are in the wrong type, and for the ones that are both, read through both versions and consolidate them into one document with the best of both.”
As someone who's spent his last 16 working hours fighting bots to make client-side wallet vaults less awkward and bug-prone, I can sympathize with folks skipping that step. If it weren't important to me that humans can understand the code for themselves, I'd be tempted to let the slop win too.
IMO the real problem is that LLMs are generative and great code is compressive. Meat agents bias toward generation too, because compressing something well requires understanding it very well, but clankers are much worse. Meat agents need to compress code as they go because their token throughput and context windows are relatively limited.
I'm sure LLMs will fix this bias at some point. In the meantime, vibemaxx'd codebases are incompatible with human oversight. That might not matter to you, but we can't pretend this paradigm is strictly better either.
I haven't prompted much with it yet, but I've added the following rule to Cursor hoping it might fight the generation bias:

1. Prefer deletion to addition.
2. Do not introduce new abstractions unless at least 3 concrete call sites clearly need them.
3. Inline one-off helpers.
4. Reduce files, layers, and indirection.
5. Optimize for minimum code surface area that preserves clarity.
6. Show the simplest working version first.

5., 1., and 4. are contextual / specific to the job at hand and could objective-poison generic functioning and degrade it. 2. and 3. are subjective, but valid. 6. I would personally not do because rework is expensive. I'd rather "1-shot" (after planning / analysis / exploration) an implementation and throw it away than have a bot rework stuff. Arguably, the latest releases did get better at rework, but I still feel it's more costly.

I agree with your assessment of the rules. They are my personal generic rules absent bots - my context is usually something relatively frivolous where human readability takes precedence over nearly everything. I find rework from a checkpoint created by these rules easier than going in the other direction.
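For what it's worth, rule 3 ("inline one-off helpers") is the easiest one to illustrate concretely. A hypothetical before/after in Python (all names made up for illustration):

```python
# Before: the kind of indirection bots tend to generate.
# A helper with exactly one call site adds a layer without adding meaning.
def _normalize_name(name: str) -> str:
    return name.strip().lower()

def greet_verbose(name: str) -> str:
    normalized = _normalize_name(name)
    return f"hello, {normalized}"

# After: rule 3 applied. The one-off helper is inlined,
# so the reader sees the whole transformation in one place.
def greet(name: str) -> str:
    return f"hello, {name.strip().lower()}"
```

Both versions behave identically; the second just has less surface area to read.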
I guess rework from a checkpoint also means redo, rather than rework. (The checkpoints were broken when combined with full-control command-line git in the early days, so I hated that feature most of all.)
I don't run into that problem with GPT 5.4. It's subjective, but I also prefer larger functions all else being equal, hence "inline one-off helpers." It's easy to go overboard with function shortening IME.

I feel like LLMs can already do the compression step, you just have to tell them to. I think asking LLMs to clean up my code is actually a task they're pretty well suited for.
The reason they don't do this by default (vs humans) is that humans can maintain longer context and read between the lines of the specific task instructions, whereas LLMs take your task instructions quite literally and do not consider wider context than that.
IME they struggle with the compression step a lot more than they do generation.
It took me 1 hour of prompting to generate the code I've spent 16 hours compressing.
another thought is whether compression is also harder than generation for humans as well
I think it probably is
It definitely is. I call that out in the post. But bots don't need compression to make progress as much as humans do.
I wonder how much their performance depends on the need to reason about state vs reasoning about the code itself?
Statelessness is easier for humans and bots alike, but I do think the generation bias is legit independent of the context.
Anytime I’ve prompted “make this clearer” or “clean this up,” they tend to increase lines of code. Even “reduce lines of code” results in, at best, negligible reductions. I have to point out excessive abstraction and overengineering repeatedly.
Meat agents. I like that.
Holy hell though, vibe coded PRs are always massive and way too much to review as a meat agent. But moar code is better, right?
I've been trying to figure out why, but it's usually from too many layers/abstractions. I'm not sure what objective they're trained on, but it's in conflict with readability.
producing output only consumable by other agents, maybe? we’re being replaced
I'd guess their success criteria isn't very sophisticated yet and is mostly "did it output something that gets the job done?"
I should probably go browse the SWE benchmarks they all use. I'd guess those track SOTA success criteria pretty well.
Yeah man. That's very true
I'm very much torn on the vibe coding issues, I think I'm so torn because "vibe coded throw away code" is actually fine for some things, but horrible for others.
SN would devolve into a total mess trying to vibe code it, but that's because building each aspect of it requires understanding lots of intent - not just of the individual elements, but big-picture intent as well (which LLMs generally are not good at).
However, for other things, like throwaway GUIs (i.e. you need a CRUD form to manage some data), it really doesn't matter if humans completely understand the code or not.
A huge, huge amount of web programming is already in the latter camp. How many web developers actually understand what React+Tailwind are doing? Probably the majority are just copying and pasting boilerplate they find on Stack Overflow until "it works".
people use stack overflow?
I thought the same thing. Ha.
last? I'm rooting for "latest" 🤣 I truly hope to one day be able to just finetune a working coding agent and make it code exactly the way I want it to.
lol
k00b nailed the core issue here. Generation vs compression isn't just a practical problem, it's a theoretical one.
Finding the minimal representation of a program is actually provably uncomputable. That's Kolmogorov complexity -- you literally cannot write an algorithm that always finds the shortest program producing a given output. So LLMs aren't just bad at compression, they're fighting a problem that's impossible to solve perfectly.
What's wild is that compression and proof of work share the same asymmetry: hard to produce, trivial to verify. You can instantly tell if code is shorter than what you had before, but finding that shorter version takes real work. Same structure as finding a valid hash.
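Kolmogorov complexity itself is uncomputable, but real compressors give a computable upper bound on it, which is roughly the proxy people reach for in practice. A minimal sketch using Python's `zlib` (the repetitive "code" string is a made-up example):

```python
import zlib

def description_length(s: str) -> int:
    """Crude upper bound on the Kolmogorov complexity of s:
    the size of its zlib-compressed form, in bytes."""
    return len(zlib.compress(s.encode("utf-8"), level=9))

# Highly repetitive source compresses far below its raw size,
# which is the "entropy bloat" a raw line count doesn't show you.
bloated = "def f():\n    pass\n" * 50
print(len(bloated), description_length(bloated))

# The verification side of the asymmetry really is trivial:
# checking which of two versions is shorter is a single comparison,
# while actually producing the shorter version is the hard part.
assert description_length(bloated) < len(bloated)
```

This only bounds the true complexity from above - no program can compute the exact minimum, which is the whole point of the uncomputability result.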
Bram Cohen seeing this so clearly makes total sense. BitTorrent's protocol spec fit on like two pages. The man spent his career making things smaller. His compression instinct is exactly why vibe coding bugs him -- he can feel the entropy bloat that most people can't see.