high severityAutoGen (multi-agent web agents)

Agents perform unauthorized actions (data exfil, malicious code exec) after processing malicious webpages/emails, despite safe system prompts.

Root cause

LLMs in AutoGen agents treat untrusted external content (e.g., scraped webpages) as instructions, allowing override of system prompts via hidden malicious text, especially in web agents like Magentic-One WebSurfer.

autogenprompt-injectionmagentic-onellm-agentweb-surfer

Citations

https://microsoft.github.io/autogen/stable/user-guide/agentchat-user-guide/magentic-one.html