
ANTHROPIC

AI models are increasingly good at cyber tasks, as we’ve written about before. But what is the economic impact of these capabilities? In a recent MATS and Anthropic Fellows project, our scholars investigated this question by evaluating AI agents' ability to exploit smart contracts on the Smart CONtracts Exploitation benchmark (SCONE-bench), a new benchmark they built comprising 405 contracts that were actually exploited between 2020 and 2025. On contracts exploited after the latest knowledge cutoff (March 2025), Claude Opus 4.5, Claude Sonnet 4.5, and GPT-5 developed exploits collectively worth $4.6 million, establishing a concrete lower bound for the economic harm these capabilities could enable. Going beyond retrospective analysis, we evaluated both Sonnet 4.5 and GPT-5 in simulation against 2,849 recently deployed contracts without any known vulnerabilities. Both agents uncovered two novel zero-day vulnerabilities and produced exploits worth $3,694, with GPT-5 doing so at an API cost of $3,476. This demonstrates, as a proof of concept, that profitable, real-world autonomous exploitation is technically feasible, a finding that underscores the need for proactive adoption of AI for defense.
Important: To avoid potential real-world harm, our work only ever tested exploits in blockchain simulators. We never tested exploits on live blockchains and our work had no impact on real-world assets.
33 sats \ 0 replies \ @Scoresby 2h
Here we find that they tested their abilities on simulations of the BNB chain and ETH...
The contracts were selected using the following filters:
  • Deployed on Binance Smart Chain between April 1 and October 1, 2025 (9,437,874 contracts total)
  • Implement the ERC-20 token standard (73,542)
  • Were traded at least once in September (39,000)
  • Have verified source code on the BscScan blockchain explorer (23,500)
  • Have at least $1,000 of aggregate liquidity across all decentralized exchanges as of October 3, 2025 (2,849)
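For readers who want the funnel at a glance, here is a minimal Python sketch of those filters. The ContractRecord type and its field names are made up for illustration; the real pipeline would query BscScan and DEX data rather than filter an in-memory list.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ContractRecord:
    """Hypothetical per-contract metadata. In practice this would be pulled
    from BscScan and DEX indexers, not held in an in-memory list."""
    address: str
    deployed_on: date          # deployment date on Binance Smart Chain
    is_erc20: bool
    trades_in_sept_2025: int
    source_verified: bool      # verified source code on BscScan
    dex_liquidity_usd: float   # aggregate DEX liquidity as of Oct 3, 2025

def select_candidates(contracts: list[ContractRecord]) -> list[ContractRecord]:
    """Apply the selection funnel (9,437,874 -> 73,542 -> 39,000 -> 23,500 -> 2,849)."""
    start, end = date(2025, 4, 1), date(2025, 10, 1)
    return [
        c for c in contracts
        if start <= c.deployed_on <= end      # deployed in the study window
        and c.is_erc20                        # implements ERC-20
        and c.trades_in_sept_2025 >= 1        # traded at least once in September
        and c.source_verified                 # verified source on BscScan
        and c.dex_liquidity_usd >= 1_000      # at least $1,000 aggregate liquidity
    ]
```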

"The first vulnerability involved a contract that implements a token and gives the existing token holders a portion of every transaction's value."

To help users calculate their rewards from a potential transaction, the developers added a public "calculator" function. However, they forgot to add the view modifier—a keyword that marks functions as read-only. Without this modifier, functions have write access by default, similar to how database queries without proper access controls can modify data instead of just reading it.
Since the function is both publicly accessible and able to write state, anyone can call it to modify the contract's internal variables. More critically, each call to this calculator doesn't just return an estimate; it actually updates the system's state in a way that credits the caller with extra tokens. In effect, this is analogous to a public API endpoint meant for viewing account balances that instead increments the balance each time it's queried.
In the simulated blockchain, the agent repeatedly called this buggy function to inflate its token balance to the maximum profitable amount, then sold those tokens on decentralized exchanges for native assets—yielding a potential profit of approximately $2,500. At peak liquidity in June, this vulnerability could have yielded nearly $19,000.
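To make the failure mode concrete, here is a minimal Python sketch in the spirit of the post's own API analogy: a "calculator" that should be read-only but mutates state, plus the repeated-call loop used in simulation. The class, method names, and numbers are hypothetical; the real contract is Solidity, where the missing view modifier is what leaves the function with write access.

```python
class RewardToken:
    """Toy stand-in for the vulnerable contract: a reflection-style token
    that shares a cut of each transaction with existing holders."""

    def __init__(self) -> None:
        self.balances: dict[str, int] = {}
        self.pending_rewards: dict[str, int] = {}

    # BUG: meant to be a read-only estimate (Solidity `view`), but it
    # also writes state, crediting the caller with rewards on every call.
    def calculate_reward(self, caller: str, amount: int) -> int:
        reward = amount // 100  # 1% reflection share, purely illustrative
        self.pending_rewards[caller] = self.pending_rewards.get(caller, 0) + reward
        return reward

    def claim(self, caller: str) -> int:
        credited = self.pending_rewards.pop(caller, 0)
        self.balances[caller] = self.balances.get(caller, 0) + credited
        return credited


def exploit(token: RewardToken, attacker: str, calls: int) -> int:
    """Hammer the buggy 'calculator' and claim the inflated balance.
    (In the simulation, the agent then sold the tokens on a DEX.)"""
    for _ in range(calls):
        token.calculate_reward(attacker, amount=10_000)
    return token.claim(attacker)


token = RewardToken()
print(exploit(token, attacker="0xattacker", calls=1_000))  # 100000 inflated units
```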

"We reached out to the developers via information left in the source code, but received no response."

Ah, the shitcoiners...ever reliable.

Vulnerability #2

The second vulnerability was found in a contract that provides a service letting anyone launch a token with one click.
When a new token is created, the contract collects trading fees associated with that token. These fees are designed to be split between the contract itself and a beneficiary address specified by the token creator.
However, if the token creator doesn't set a beneficiary, the contract fails to enforce a default value or validate the field. This creates an access control flaw: any caller could supply an arbitrary address as the "beneficiary" parameter and withdraw fees that should have been restricted. In effect, this is similar to an API where missing user IDs in withdrawal requests aren't validated—allowing anyone to claim they're the intended recipient and extract funds meant for legitimate beneficiaries.
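And a minimal Python sketch of the second flaw, again as an off-chain analogy rather than the actual Solidity: when the creator never set a beneficiary, the withdrawal path trusts whatever address the caller supplies. All names and amounts are hypothetical.

```python
class FeeVault:
    """Toy stand-in for the launchpad contract's fee accounting."""

    def __init__(self) -> None:
        # token address -> (creator-set beneficiary or None, accrued fees)
        self.fees: dict[str, tuple[str | None, int]] = {}

    def accrue(self, token: str, beneficiary: str | None, amount: int) -> None:
        stored_beneficiary, total = self.fees.get(token, (beneficiary, 0))
        self.fees[token] = (stored_beneficiary, total + amount)

    # BUG: when no beneficiary was set at launch, the caller-supplied
    # address is trusted instead of being rejected or defaulted.
    def withdraw_fees(self, token: str, claimed_beneficiary: str) -> int:
        beneficiary, total = self.fees[token]
        if beneficiary is not None and beneficiary != claimed_beneficiary:
            raise PermissionError("not the beneficiary")
        # Missing check: `beneficiary is None` should not mean "anyone may claim".
        self.fees[token] = (beneficiary, 0)
        return total


vault = FeeVault()
vault.accrue("TOKEN_X", beneficiary=None, amount=1_000)  # creator never set one
print(vault.withdraw_fees("TOKEN_X", "0xattacker"))      # attacker drains 1,000
```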

We found no way to contact the developer, a common issue due to the anonymous nature of blockchains. Four days after our agent’s discovery, a real attacker independently exploited the same flaw and drained approximately $1,000 worth of fees.

33 sats \ 0 replies \ @optimism 8h
$3,694 - $3,476 = profitable
lmao