Posts

CLIProxyAPI Turns AI CLIs into a Unified API

CLIProxyAPI is an open-source proxy that converts tools like Claude Code, Gemini CLI, and Codex into OpenAI-compatible APIs. Built in Go, it provides unified routing, streaming, multi-account management, and provider abstraction for AI agent workflows. The project is gaining attention as a way to centralize access to multiple LLM ecosystems while simplifying orchestration and rate-limit management.  https://github.com/router-for-me/CLIProxyAPI

Microsoft Open-Sources RAMPART and Clarity for AI Agent Safety

Microsoft introduced two open-source tools, RAMPART and Clarity, aimed at embedding safety and security into the AI agent development lifecycle. RAMPART is a pytest-native framework that converts red-team findings into repeatable CI/CD safety tests, helping developers continuously evaluate agent behavior against adversarial and benign scenarios. Clarity focuses earlier in the process, helping teams formalize assumptions, risks, and design intent before implementation. The initiative reflects a broader “shift-left” approach to AI security, where safety becomes part of everyday engineering workflows rather than a post-deployment audit. Microsoft positions the tools as practical defenses for increasingly autonomous AI agents that can execute code, access sensitive systems, and trigger real-world actions.  https://www.microsoft.com/en-us/security/blog/2026/05/20/introducing-rampart-and-clarity-open-source-tools-to-bring-safety-into-agent-development-workflow/

TealTiger: Runtime Guardrails and Governance for AI Agents

The TealTiger project (formerly AgentGuard) positions itself as a security and governance layer for AI agents, focused on runtime policy enforcement, auditability, and compliance. The SDK supports frameworks like LangChain, CrewAI, AutoGen, and MCP-based agents, adding controls such as tool whitelisting, spend limits, human approval gates, PII protection, and signed audit trails. The ecosystem also emphasizes enterprise governance mappings for standards like CPS 230, ISO 42001, and the EU AI Act. The rebrand from AgentGuard to TealTiger preserved APIs while consolidating the Python and TypeScript SDKs under a unified identity.  https://github.com/agentguard-ai/tealtiger

OpenTaint vs Semgrep vs CodeQL: Where SAST Tools Lose the Dataflow

The article compares Semgrep, CodeQL, and OpenTaint across five increasingly complex XSS scenarios in a Java Spring application. It argues that Semgrep struggles once analysis crosses function boundaries, CodeQL weakens on deep object graphs and virtual dispatch, while OpenTaint maintains taint tracking through builders, constructor chains, and interface calls using Semgrep-style rules interpreted semantically rather than syntactically. The piece frames the core challenge of SAST as preserving dataflow visibility as software architecture accumulates abstraction layers. https://opentaint.org/blog/semgrep-vs-codeql-vs-opentaint/

Adversarial Distillation of American AI Models (NSTM-4)

This April 23, 2026 memorandum from the White House Office of Science and Technology Policy (OSTP) addresses the threat of industrial-scale adversarial distillation of U.S. frontier AI models by foreign entities, principally based in China. The document states that these campaigns leverage tens of thousands of proxy accounts and jailbreaking techniques to systematically extract capabilities from American AI models at a fraction of the cost, enabling foreign actors to release models that appear comparable on benchmarks while deliberately stripping security protocols and mechanisms that ensure models are "ideologically neutral and truth-seeking." While the U.S. supports legitimate AI distillation (producing smaller, lighter-weight models from advanced systems), the administration announces four actions: sharing threat information with U.S. AI companies, enabling private sector coordination, developing best practices to identify and mitigate industrial-scale distillation, and ex...

Skill Issues: How We Discovered Supply Chain Attack Vectors in an AI Agent Skills Marketplace

 Orca Security's research team discovered four supply chain attack primitives in a prominent AI agent skills marketplace (where developers install reusable prompt-based extensions for AI coding agents). The primitives include: (1) install count inflation — unauthenticated GET requests can trivially spoof popularity metrics; (2) non-deterministic security scanning — skills are scanned only at creation and again only when they become popular, creating a window for malicious modifications; (3) silent skill override — installing a skill with the same name as an existing one silently replaces it with no warning; and (4) no fine-grained updates — the update command refreshes all installed skills at once with no diff or changelog. The researchers demonstrated three end-to-end attack flows (bait-and-switch, nested skill injection, and delayed weaponization via update) that achieved persistent code execution through malicious skills that passed the platform's security audits. Real-world...

Inside Claude Managed Agents: Reverse-Engineering the Security Boundaries of Anthropic's Hosted Agent Runtime

This Pluto Security blog post reverse-engineers Anthropic's Claude Managed Agents (a hosted runtime where Claude runs autonomously in cloud containers with bash, file I/O, web access, and MCP tools). Key findings include: the sandbox uses gVisor with a three-layer egress control system (the same isolation engine as Claude Cowork); all outbound traffic routes through a JWT-authenticated egress proxy with TLS inspection; the JWT is readable by any process in the sandbox and reveals organization metadata, session ID, and allowed hosts; even in "limited" networking mode, six additional Anthropic infrastructure hosts (including sentry.io and a staging endpoint) are silently injected into the egress JWT beyond user configuration. Three independent layers prevent proxy bypass (no DNS, network firewall, JWT validation). The vault credential proxy is identified as the platform's strongest security property — vault secrets never enter the sandbox, structurally preventing creden...