In recent years, large language models (LLMs) have become increasingly capable and can now interact with tools (i.e., call functions), read documents, and recursively call themselves. As a result, these LLMs can now function autonomously as agents. With the rise in capabilities of these agents, recent work has speculated on how LLM agents would affect cybersecurity. However, not much is known about the offensive capabilities of LLM agents. In this work, we show that LLM agents can autonomously hack websites, performing tasks as complex as blind database schema extraction and SQL injections without human feedback. Importantly, the agent does not need to know the vulnerability beforehand. This capability is uniquely enabled by frontier models that are highly capable of tool use and leveraging extended context. Namely, we show that GPT-4 is capable of such hacks, but existing open-source models are not. Finally, we show that GPT-4 is capable of autonomously finding vulnerabilities in websites in the wild. Our findings raise questions about the widespread deployment of LLMs.
4 sats \ 0 replies \ @PlebeiusG 26 Feb 2024
Inevitable.
Soon open-source models will be able to do the same, given the current pace of innovation.
We live in this world now.
4 sats \ 0 replies \ @Atreus 26 Feb 2024
How about all those people integrating LLMs into their personal email clients? 😀
(I use the smile emoji 😀 because otherwise I'd have to use 😢)