GPT-4 autonomously hacks zero-day security flaws with 53% success rate

06-09-2024 • https://newatlas.com, By Joe Salas

A couple of months ago, a team of researchers released a paper saying they'd been able to use GPT-4 to autonomously hack one-day (or N-day) vulnerabilities – these are security flaws that are already known, but for which a fix hasn't yet been released. If given the Common Vulnerabilities and Exposures (CVE) list, GPT-4 was able to exploit 87% of critical-severity CVEs on its own.

Skip forward to this week and the same group of researchers released a follow-up paper saying they've been able to hack zero-day vulnerabilities – vulnerabilities that aren't yet known – with a team of autonomous, self-propagating Large Language Model (LLM) agents using a Hierarchical Planning with Task-Specific Agents (HPTSA) method.

Instead of assigning a single LLM agent trying to solve many complex tasks, HPTSA uses a "planning agent" that oversees the entire process and launches multiple "subagents," that are task-specific. Very much like a boss and his subordinates, the planning agent coordinates to the managing agent which delegates all efforts of each "expert subagent", reducing the load of a single agent on a task it might struggle with.

It's a technique similar to what Cognition Labs uses with its Devin AI software development team; it plans a job out, figures out what kinds of workers it'll need, then project-manages the job to completion while spawning its own specialist 'employees' to handle tasks as needed.

When benchmarked against 15 real-world web-focused vulnerabilities, HPTSA has shown to be 550% more efficient than a single LLM in exploiting vulnerabilities and was able to hack 8 of 15 zero-day vulnerabilities. The solo LLM effort was able to hack only 3 of the 15 vulnerabilities.

Blackhat or whitehat? There is legitimate concern that these models will allow users to maliciously attack websites and networks. Daniel Kang – one of the researchers and the author of the white paper – noted specifically that in chatbot mode, GPT-4 is "insufficient for understanding LLM capabilities" and is unable to hack anything on its own.