Experimental Tool Poisoning Attacks via MCP Injection
The “mcp-injection-experiments” repository demonstrates proof-of-concept attack code for exploiting vulnerabilities in the Model Context Protocol (MCP). It includes Python scripts showing three types of tool poisoning: a direct poisoning attack that coerces an agent into leaking sensitive files, a shadowing attack that intercepts an existing trusted tool like email, and a sleeper attack that alters a tool interface mid-session (e.g. WhatsApp takeover). These experiments highlight how untrusted input or tool definitions can manipulate agent behavior without altering the agent or server code, exposing critical risks in AI agent workflows.
https://github.com/invariantlabs-ai/mcp-injection-experiments
Comments
Post a Comment