ToolJack: Hijacking AI Agent Perception via Bridge Exploitation

April 15, 2026

ToolJack is an attack methodology that manipulates the trust boundary between AI agents and their tools. After achieving local compromise, an attacker can extract session credentials, pivot across devices, and intercept the bridge protocol between Claude Desktop and its browser extension. This enables Phantom Tab Injection (fabricating tabs only the agent sees) and Tool Relay Spoofing (replacing legitimate tool responses with attacker-controlled data), leading to Remote Listener Indirect Prompt Injection—actively constructing a poisoned environment around the agent. Testing showed complete control over the agent's perceived context, but Anthropic's model-level safety alignment consistently blocked autonomous code execution. The research concludes that infrastructure requires cryptographic tool attestation and device-bound tokens, while model alignment serves as a critical last line of defense.

https://www.preamble.com/blogs/tooljack-hijacking-an-ai-agents-perception-through-bridge-protocol-exploitation

Search This Blog

Appsec adventures

ToolJack: Hijacking AI Agent Perception via Bridge Exploitation

Comments

Post a Comment

Popular posts from this blog

Prompt Engineering Demands Rigorous Evaluation

SecObserve: Simplified Vulnerability and License Management for CI/CD Pipelines

OWASP ZAP 2.16.0 Introduces Key Updates and Enhancements