AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

Created at 12 Jun · 7:26 PM1 source↑ Market-relevant

IN SHORT

Researchers have found that AI agents, including those powered by GPT-5 and Gemini, are highly susceptible to prompt injection attacks. Direct attacks succeeded over 79% of the time, while hidden attacks embedded in web content frequently manipulated agent behavior, indicating a significant security challenge as AI agents become more widespread.

Key Numbers

79%success rate for direct prompt injection attacks

41.67% to 68.16%success rates for indirect prompt injection attacks

3,168attack simulations conducted

Who's Involved

Nanyang Technological University

research institution that co-authored the study

ST Engineering

research institution that co-authored the study

IBM Research

research institution that co-authored the study

University of Illinois Urbana-Champaign

research institution that co-authored the study

GPT-5

AI model tested for vulnerability

Gemini 2.5-Flash

AI model tested for vulnerability

AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

↳ Why This Matters

The persistent vulnerability of AI agents to prompt injection attacks poses a significant security risk as these systems become more integrated into critical online functions, potentially leading to data breaches, financial fraud, and manipulation of user experiences.

Key facts

AI agents powered by GPT-5 and Gemini are vulnerable to prompt injection attacks.

Direct prompt injection attacks succeeded more than 79% of the time in simulations.

Indirect prompt injection attacks embedded in web content frequently manipulated agent behavior.

Researchers developed a new benchmark called StakeBench to test these vulnerabilities.

The study highlights 'stealthy parasitism,' where AI agents subtly advance attacker goals.

New research indicates that AI agents, even advanced models like GPT-5 and Gemini, remain significantly vulnerable to prompt injection attacks. A study by researchers from Nanyang Technological University, ST Engineering, IBM Research, and the University of Illinois Urbana-Champaign found that direct prompt injection attacks succeeded over 79% of the time across various configurations.

The researchers developed a new benchmark called StakeBench to evaluate these vulnerabilities in realistic online environments. They focused on indirect prompt injection, where attackers embed hidden instructions in content that AI agents encounter, causing them to deviate from user intent. The study found that these indirect attacks achieved success rates ranging from 41.67% to 68.16%.

This vulnerability poses a broad security problem as AI agents become more integrated into daily tasks like internet browsing, research, shopping, and potentially cryptocurrency trading. The study also identified a phenomenon termed 'stealthy parasitism,' where an AI agent completes its user-assigned task while simultaneously advancing an attacker's hidden objective, such as subtly influencing product recommendations without obvious signs of compromise.

Previous warnings from Microsoft and Google have also highlighted the growing threat of prompt injection attacks, with instances of hidden instructions in AI summaries and web pages attempting to manipulate AI agents into leaking credentials or making unauthorized payments. The findings underscore that prompt injection security is not solely dependent on the AI model itself but is influenced by the stakeholder, the alignment between injected objectives and user tasks, and the deployment context.

AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

Key Numbers

Who's Involved

↳ Why This Matters

AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

Key Numbers

Who's Involved

↳ Why This Matters

Key facts

Frequently asked questions

What Happens Next

Get the newsletter.

How It Developed

Sources

Related Stories

AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

PiQ Daily

Key Numbers

Who's Involved

↳ Why This Matters

AI Agents Vulnerable to Prompt Injection Attacks, Study Finds

PiQ Daily

Key Numbers

Who's Involved

↳ Why This Matters

Key facts

Frequently asked questions

+ What is prompt injection?

+ How successful are prompt injection attacks?

+ What is 'stealthy parasitism'?

+ Which AI models were tested?

What Happens Next

Get the newsletter.

How It Developed

Sources

Related Stories