What ***really*** matters in this, imho: 

- prompt quality and completeness. Anything you don't specify will be hallucinated.
- the amount of work you're willing to put into questioning replies. This may be the most important of all.
- scope. Scope too narrow, you get bullshit results. Scope too wide, you get bullshit results.
- sub-agent/task decomposition. You have to spell this out correctly or the sub-agents will re-scope the task on their own.
- error correction. Any "reasoning" error that doesn't get fixed compounds. You have to get your errors out quickly.
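
For what it's worth, the scope and decomposition points can be sketched roughly like this. It's a minimal sketch under my own assumptions, not how any particular agent framework does it: `SubTask`, `plan_subtasks`, the 1500-line budget, and the prompt wording are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class SubTask:
    files: list[str]
    prompt: str


def plan_subtasks(file_sizes: dict[str, int], max_lines: int = 1500) -> list[SubTask]:
    """Greedily group files into batches that stay under a line budget.

    file_sizes maps a path to its line count. The budget is an assumption:
    too small and the agent lacks context, too large and it loses focus.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for path in sorted(file_sizes):  # sorted so files in one directory stay together
        size = file_sizes[path]
        if current and used + size > max_lines:
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += size
    if current:
        batches.append(current)

    tasks = []
    for batch in batches:
        # Hypothetical prompt: the scope is stated explicitly so the
        # sub-agent has no room to redefine it.
        prompt = (
            "Audit ONLY these files for memory-safety bugs: "
            + ", ".join(batch)
            + ". Do not widen the scope; list anything outside it as a note."
        )
        tasks.append(SubTask(files=batch, prompt=prompt))
    return tasks
```

Each sub-agent gets its file list and its limits in the prompt itself, so it has nothing to guess about scope.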

I suggest you try it on one of your firmware codebases. You'll have fun.

optimism

https://twiiit.com/wunderwuzzi23/status/2021046801630101595

nitter

The title riffs on Linus's Law ("given enough eyeballs, all bugs are shallow") but there is an important difference nobody talks about. Human auditors bring domain context -- they know which code paths handle real money. Agents right now are great at pattern-matching known vulnerability classes (buffer overflows, reentrancy) but terrible at finding logic bugs that require understanding the business intent.

The real unlock is not "more agents" -- it is agents combined with formal specifications. Trail of Bits published research showing LLM-generated invariants fed into symbolic execution tools caught bugs that neither approach found alone. That is the actual force multiplier.
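
To make "invariant" concrete, here is a toy sketch. It is not the Trail of Bits setup and it doesn't use a real symbolic execution tool; it brute-forces a small input space as a stand-in. The `transfer` function, its bug, and the invariant are all invented for illustration.

```python
from itertools import product


def transfer(balances: dict[str, int], src: str, dst: str, amount: int) -> dict[str, int]:
    """Toy transfer with a logic bug: negative amounts are not rejected."""
    new = dict(balances)
    if new[src] >= amount:  # also passes for any negative amount
        new[src] -= amount
        new[dst] += amount
    return new


def no_negative_balances(after: dict[str, int]) -> bool:
    """Business invariant (assumed): no account balance drops below zero."""
    return all(v >= 0 for v in after.values())


def find_counterexample(max_balance: int = 3, amounts=range(-3, 4)):
    """Exhaustively search a small state space for an invariant violation."""
    starts = range(max_balance + 1)
    for a, b, amount in product(starts, starts, amounts):
        before = {"alice": a, "bob": b}
        after = transfer(before, "alice", "bob", amount)
        if not no_negative_balances(after):
            return before, amount, after
    return None
```

Nothing in `transfer` matches a known vulnerability pattern, and the total balance is even conserved. The bug only shows up once the intent ("balances never go negative") is written down and checked.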

Firmware is a great call though. The attack surface is massive and the auditor pool is tiny.

zeke

> A few months ago I had this realization that agents have become really good at identifying bugs in code, especially security vulnerabilities. They are relentless in analyzing code and you can spin up multiple of them to go through source code quickly.
>
> [https://x.com/wunderwuzzi23/status/2021046801630101595](https://x.com/wunderwuzzi23/status/2021046801630101595)
>
> It is an emerging capability that many security researchers and bug bounty hunters have observed over the last few months.
>
> Gadi Evron [posted](https://www.linkedin.com/posts/gadievron_the-ai-vulnerability-cataclysm-is-coming-activity-7366486915878924288-iPIZ) about the upcoming **AI Vulnerability Cataclysm** last year to help raise awareness.
>
> [**...read more at embracethered.com**](https://embracethered.com/blog/posts/2026/given-enough-agents-all-bugs-become-shallow/)