Posts

Showing posts from June, 2026

Anthropic Cybersecurity Skills: 754 AI Agent Security Skills Mapped to 5 MITRE & NIST Frameworks

This open-source repository provides the largest library of structured cybersecurity skills for AI agents, containing 754 production-grade skills across 26 security domains. Each skill is mapped to six industry frameworks—MITRE ATT&CK v19.1, NIST CSF 2.0, MITRE ATLAS, MITRE D3FEND, NIST AI RMF, and the MITRE Fight Fraud Framework (F3)—making it a unique cross-framework knowledge base. Built on the agentskills.io standard, these skills encode real practitioner workflows, enabling AI agents like Claude Code, GitHub Copilot, and Cursor to perform expert-level tasks such as threat hunting, malware analysis, and incident response by following step-by-step procedures. The library is designed for progressive disclosure, allowing agents to search all skills efficiently and load detailed guidance as needed. It is a community project, Apache 2.0 licensed, and actively maintained with contributions welcome.  https://github.com/mukul975/Anthropic-Cybersecurity-Skills

OWASP AI Security Verification Standard (AISVS)

The OWASP Artificial Intelligence Security Verification Standard (AISVS) is a community-driven catalogue of testable security requirements for AI-enabled systems, modeled after the OWASP ASVS. It provides a structured framework for developers, architects, security engineers, and auditors to design, build, test, and verify AI application security across the lifecycle. Version 1.0 includes 12 requirement chapters covering training data integrity, input validation, model lifecycle, infrastructure, access control, supply chain, model behavior, vector databases, agentic orchestration, MCP security, adversarial robustness, and monitoring. It uses three verification levels (1-3) based on risk and complements other standards like NIST AI RMF and ISO/IEC 42001 by providing technical controls. Each requirement is verifiable, testable, and implementable, with a stable versioning system and community contributions welcome.  https://github.com/OWASP/AISVS

RHC Protocol Core - Randomized Header Channel for CSRF Protection

The RHC (Randomized Header Channel) Protocol is an OWASP project that introduces dynamic entropy into HTTP headers to protect the integrity of the communication channel, addressing a new class of attack called Flow Channel Hijacking (FCHA). Unlike traditional CSRF tokens or session validation, RHC operates at the Communication Integrity Layer (CIL) to verify that the communication flow itself is legitimate and non-replicable, rather than just validating identity or individual requests. It uses randomized header selection, variable-length tokens, and decoy headers across four progressive implementation levels (Basic to Dynamic Adaptive). RHC is designed for programmatic HTTP clients (fetch, APIs, microservices, agent workflows), not standard HTML form submissions, and complements existing security controls like TLS, OAuth, and CSRF tokens. The project includes PoC implementations, an entropy analyzer, academic publications, and aims to mitigate automated attacks, replay attacks, and cha...

Meta AI Agent Account Takeover: The Risk of Missing Authorization in Agentic Workflows

This blog post from AI Village examines how missing authorization in AI agent tool-calling workflows can turn normal support actions (like email changes) into account takeover paths. The core issue is not LLM manipulation, but the absence of an authorization boundary between the user, the agent, and privileged tools. It presents three common design patterns: 1) Agent initiates but does not perform mutations (reduces direct risk but can still be abused for harassment); 2) Privileged tools require separate verification (stronger, but verification flows can still be abused); 3) A policy layer sits between agent and tools (cleanest, with centralized enforcement). The post introduces a "Maze Design" pattern with multiple security gates (intent classification, identity verification, ownership, policy, rate limiting, step-up verification) to force controlled execution paths. It emphasizes that this is a classic IAM problem amplified by agents, and advises assessing where agents can ...
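
The third pattern, a policy layer between the agent and its tools, can be sketched as a chain of gates that every tool call must pass. This is a minimal illustration and not code from the post: `ToolRequest`, `PRIVILEGED_ACTIONS`, and `authorize` are hypothetical names, and only two of the gates mentioned (ownership and step-up verification) are modeled.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToolRequest:
    session_user: str       # user authenticated on the current session
    target_account: str     # account the tool call would mutate
    action: str             # e.g. "change_email"
    step_up_verified: bool  # whether a fresh step-up check has passed


# Hypothetical set of actions treated as privileged.
PRIVILEGED_ACTIONS = {"change_email", "reset_password"}


def authorize(req: ToolRequest) -> Tuple[bool, str]:
    """Run the request through ordered gates; the first failing gate denies."""
    # Ownership gate: the agent may only act on the caller's own account.
    if req.session_user != req.target_account:
        return False, "ownership: caller does not own the target account"
    # Step-up gate: privileged mutations need fresh verification.
    if req.action in PRIVILEGED_ACTIONS and not req.step_up_verified:
        return False, "step-up: privileged action needs fresh verification"
    return True, "allowed"
```

The decision is made outside the model, so a manipulated prompt cannot change which account the tool is allowed to touch.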

AI Security Hub - Comprehensive AI Security Resource

AI Security Hub is a comprehensive, community-driven resource for AI security, structured as the "PayloadsAllTheThings + SecLists + OWASP Cheat Sheets" of the AI security world. It includes: payload collections (prompt injection, jailbreaks, RAG, agent, MCP attacks); cheat sheets (attack taxonomy, detection, hardening); hands-on security labs (including DVAP, a deliberately vulnerable AI platform); security tools (Garak, PyRIT, NeMo Guardrails, etc.); CTF challenges; learning paths from beginner to expert; and a curated research database. It covers AI attack surfaces including prompt injection, RAG security, agent security, and MCP security. The hub is designed for educational and authorized security research, with a focus on practical, actionable resources for red teaming and defense.  https://github.com/sonuoffsec/AI-Security-Hub

CVE Lite CLI - Fast, developer-friendly JS/TS dependency vulnerability scanner

CVE Lite CLI is an OWASP Lab Project that provides a fast, local-first dependency vulnerability scanner for JavaScript and TypeScript projects. It scans lockfiles (npm, pnpm, Yarn, Bun), matches against OSV advisory data, and produces concrete, copy-and-run remediation commands for direct and transitive vulnerabilities. Key features include: parent-aware transitive guidance; --fix mode that applies validated fixes and rescans; an overrides hygiene audit for stale pins; offline advisory DB; usage-aware reachability filtering; and outputs including JSON, SARIF, HTML reports, and SBOMs. It is free, requires no account, runs locally with no code leaving the machine, and has minimal dependencies. The tool focuses on actionable remediation, fitting naturally into developer workflows before code is pushed.  https://github.com/OWASP/cve-lite-cli

Controlled Agency | Issue 10: The Hidden Cost of AI

This essay explores the concern that AI, by automating tasks and removing inefficiency, may inadvertently eliminate the repetitive, mistake-filled apprenticeship that has traditionally built human expertise. The author argues that expertise is formed through accumulated error—debugging broken code, triaging false positives, writing bad drafts—and that AI's ability to bypass this struggle risks creating a generation that can produce output but lacks the judgment to evaluate it. The piece frames this as a critical governance challenge: as AI takes more action, humans may lose the opportunities to develop the intuition needed to supervise it, making the preservation of learning conditions as important as controlling the technology itself.  https://arksher.substack.com/p/controlled-agency-issue-10-the-hidden

The Next Wave of Agentic Security

This research report from MMC Ventures, based on conversations with 30+ CISOs and founders, outlines three distinct approaches to securing AI agents: Runtime AppSec (instrumenting applications to confirm exploitability), Contextual Agentic Security (steering agent behavior in real time with deterministic and probabilistic methods), and Structural Security (embedding hard boundaries at the infrastructure level). The report notes that enterprises are actively deploying agents, incidents are already happening (e.g., Meta data exposure), and the market is fragmented but moving fast. It highlights the importance of GTM strategy (developer-led vs. CISO-led) and looks for startups with depth of visibility, layered security, deliberate GTM, and native architecture for the agentic paradigm.  https://mmc.vc/research/the-next-wave-of-agentic-security/

Fork Community - Free SaaS Threat Modeling Platform (PASTA Methodology)

Fork Community is a free SaaS tool that implements the PASTA (Process for Attack Simulation and Threat Analysis) threat modeling methodology. It provides hierarchical threat libraries mapping standards like MITRE ATT&CK, CWE, CAPEC, OWASP ASVS, and real-world attack data. The platform integrates theoretical and evidence-based insights to help build risk-centric threat models. The community edition is open-source and extensible, allowing contributors to update JSON threat libraries via pull requests. The enterprise version offers advanced features. The repository serves as a collaborative space for reporting issues, suggesting enhancements, and contributing to threat data.  https://github.com/VerSprite/fork-community

Modern Malware - Spyware Skills, Hijacked Base URLs, and 1,230+ Leaking API Keys in AI Instruction Files

Mitiga Labs investigated AI instruction files (skills, hooks, AGENTS.md, MCP configs, rules) and found widespread supply-chain risks, including prompt-exfiltration tradecraft, attacker-controlled ANTHROPIC_BASE_URL overrides routing Claude traffic through MITM proxies, permission-bypass defaults, and over 1,230 hardcoded API keys across tens of services. They released Skillgate, a free community scanner with 80+ detection rules across families like direct execution, prompt manipulation, tool poisoning, credential exposure, and obfuscation. The scanner has analyzed 50,000+ files from 7,000+ public repos, using both rule-based detection and an LLM-based reviewer (Gator Agent). The post warns that AI agents treat these files as trusted instructions with zero validation, making them a new malware vector, and recommends scanning all instruction files before agents load them.  https://www.mitiga.io/blog/malware-in-ai-instruction-files-skillgate

Ponytail - Makes your AI agent think like the laziest senior dev

Ponytail is a plugin and ruleset that guides AI agents to write minimal, necessary code by following a "ladder" of checks before writing anything: does this need to exist, can stdlib do it, is there a native platform feature, can an installed dependency do it, can it be one line, and only then write the minimum that works. It is designed to be lazy, not negligent: validation, security, and accessibility stay intact. Benchmarks on real Claude Code sessions (FastAPI + React) showed ~54% less code, ~20% cheaper, and ~27% faster compared with a no-skill baseline, while maintaining 100% safety. It installs as a plugin for Claude Code, Codex, GitHub Copilot CLI, Gemini, OpenCode, and others, with always-on rules and commands like /ponytail (to set intensity levels) and /ponytail-review (to review diffs for over-engineering).  https://github.com/DietrichGebert/ponytail

Bagel - Inventory security-relevant metadata on developer workstations

Bagel is a cross-platform CLI tool that inventories security-relevant metadata on developer workstations (macOS, Linux, Windows) to improve supply-chain security. It scans for risky configurations and secret locations across 9 probes (Git, SSH, npm, environment variables, shell history, cloud credentials, JetBrains IDEs, GitHub CLI, and AI CLI tools) and uses 8 secret detectors, but crucially, it records only metadata (paths, permissions, key types, config flags) and never the secret values themselves. It outputs structured JSON or table reports and can be run in CI with --strict to fail builds on findings. A scrub command (a fork addition) removes credentials from AI CLI session logs and shell histories, replacing them with redacted markers. Bagel is privacy-focused, read-only, and open-source under the MIT license.  https://github.com/boostsecurityio/bagel
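
The metadata-only idea can be illustrated with a short sketch. It is not Bagel's implementation: `describe` is a hypothetical helper, and it assumes POSIX permission bits. It records where a file lives and how it is protected, and never opens the file.

```python
import os
import stat


def describe(path: str) -> dict:
    """Record metadata about a file without ever reading its contents."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only
    return {
        "path": path,
        "mode": oct(mode),
        # 0o044 covers the group-read and other-read bits.
        "group_or_world_readable": bool(mode & 0o044),
        "size_bytes": st.st_size,
    }
```

A report built from such records can flag a loosely protected key file while containing nothing an attacker could reuse.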

Governing AI Assets at Scale with MCP Gateway and Registry

AWS has open-sourced the MCP Gateway and Registry (Apache 2.0), a solution for governing, discovering, and securing AI assets (MCP servers, agents, skills, and custom entities) at enterprise scale. It provides a central catalog with a React-based UI and a built-in MCP server for AI agents to search and discover assets programmatically. Key features include: fine-grained access control with identity provider integration (Entra ID, Okta, Cognito); optional MCP gateway for routing, auditing, and policy enforcement; security scanning at registration using Cisco AI Defense scanners; hybrid search (vector + lexical) for discovery; federation with other registries, Amazon Bedrock AgentCore, Workday ASOR, and public catalogs; and OpenTelemetry-based observability. It supports deployment on EKS (Helm), ECS Fargate (Terraform), EC2, or local development. The post includes a case study from Expedia Group and highlights integration with AI coding assistants like Claude Code and Codex.  https:/...

Perplexity Is Open-Sourcing Bumblebee

Perplexity has open-sourced Bumblebee, a read-only scanner that checks developer machines for risky packages, extensions, and AI tool configs during supply-chain incidents. It covers language package managers (npm, PyPI, Go, etc.), AI agent configs (MCP), VS Code-family extensions, and browser extensions. Bumblebee is designed to be safe: it reads metadata files directly without executing code, invoking package managers, or reading application source files, preventing the scanner itself from triggering attacks like postinstall scripts. It supports baseline, project, and deep scan profiles, and integrates with Perplexity's workflow where threat signals are cataloged, reviewed, and then scanned across endpoints. The tool is available as an open-source Go project for macOS and Linux.  https://www.perplexity.ai/hub/blog/perplexity-is-open-sourcing-bumblebee

Foundry Security Spec - An open specification for agentic AI security evaluation

Cisco has released Foundry, an open specification for building agentic AI security evaluation systems, distilling lessons from their internal operations. The spec defines eight core agent roles (Orchestrator, Indexer, Cartographer, Detector, Triager, Validator, Reporter, Coverage Guide) plus five optional extensions, a finding lifecycle, and a coordination substrate. It includes a constitution with eleven inviolable principles and ~130 functional requirements. Crucially, Foundry is designed to consume CodeGuard detection rules and operationalize a "detection-to-prevention flywheel" where missed findings generate new rules that improve both future evaluations and developer prevention via LLM coding assistants. The spec is infrastructure-neutral, deliberately lacks code, and is meant to be clarified and adapted to each organization's stack via a spec-kit workflow. It is not a turnkey scanner but a proven blueprint for building a self-improving security evaluation system.  h...

EPSS V5 Is Here

Empirical Security has released V5 of the Exploit Prediction Scoring System (EPSS), achieving a 23% improvement over the prior model in ranking vulnerabilities most likely to be exploited. The update includes model optimization, refined probability calibration, and improved exploit-code intelligence for better detection of risky repositories. EPSS predicts real-world exploitation likelihood (unlike CVSS severity scores), helping teams prioritize remediation across all 318,000+ published CVEs while reducing workload. The model is freely available and widely integrated into security products. The post notes that Anthropic recently recommended EPSS to help defenders prepare for an AI-accelerated increase in vulnerabilities.  https://research.empiricalsecurity.com/research/epss-v5-is-here
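
To make the difference from severity-based ranking concrete, here is a small sketch of prioritizing by exploitation probability. The records, the identifiers, and the 0.1 threshold are made-up illustrative values, not EPSS output, and `prioritize` is a hypothetical helper.

```python
# Illustrative records only: IDs and scores are placeholders, not real data.
findings = [
    {"id": "VULN-A", "cvss": 9.8, "epss": 0.02},
    {"id": "VULN-B", "cvss": 6.5, "epss": 0.61},
    {"id": "VULN-C", "cvss": 7.5, "epss": 0.15},
]


def prioritize(items, epss_threshold=0.1):
    """Keep findings at or above the threshold, most likely exploited first."""
    urgent = [f for f in items if f["epss"] >= epss_threshold]
    return sorted(urgent, key=lambda f: f["epss"], reverse=True)


queue = prioritize(findings)
```

In this toy data the highest-severity item drops out of the urgent queue because its predicted exploitation likelihood is low, which is the workload reduction the summary describes.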

Intent as a Security Boundary

This essay argues that current access control models (like ABAC and Zero Trust) are insufficient for AI agents because they evaluate single requests rather than the full trajectory of a task. It introduces "intent governance" as a new security layer that compares an agent's registered purpose (declared at design time) against its runtime actions and executed scope. The author identifies three layers of intent—registered, declared, and executed—and three failure patterns: prompt injection, delegated intent poisoning, and intent drift. The proposed solution includes scope binding, drift detection, and purpose expiry (task-completion revocation). The piece concludes that intent governance is an architectural addition, not a policy tweak, and treats purpose as a measurable security primitive to detect misalignment even when authentication and permissions are valid.  https://puneetbhatnagar.substack.com/p/intent-as-a-security-boundary
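
The three proposed controls can be sketched together in a few lines. This is an assumption-laden illustration, not the author's design: `IntentGrant` and its fields are hypothetical, and drift detection is reduced to a simple membership check against the registered scope.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IntentGrant:
    purpose: str                # registered intent, declared at design time
    allowed_actions: frozenset  # scope binding
    expires_at: float           # purpose expiry as a UNIX timestamp
    completed: bool = False
    executed: List[str] = field(default_factory=list)

    def check(self, action: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Purpose expiry: nothing runs after completion or the deadline.
        if self.completed or now >= self.expires_at:
            return "deny: purpose expired"
        # Drift detection: the action falls outside the registered scope.
        if action not in self.allowed_actions:
            return "deny: drift outside registered scope"
        self.executed.append(action)  # executed intent, kept for audit
        return "allow"

    def complete(self) -> None:
        """Task-completion revocation."""
        self.completed = True
```

Note that every denied call here could still carry valid credentials; the check is on purpose, not identity.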

Omnigent: A Meta-Harness to Combine, Control, and Share AI Agents

Databricks introduces Omnigent, an open-source meta-harness that sits above existing AI agent frameworks to unify, govern, and orchestrate multiple agents such as Claude Code, Codex, and custom tools. It enables composition across agents without rewriting code, enforces policy-based control at a higher layer, and supports real-time collaboration and session sharing. Released under Apache 2.0, it targets the fragmentation of agent workflows by providing a single control plane for managing diverse AI agents and workflows.  https://www.databricks.com/blog/introducing-omnigent-meta-harness-combine-control-and-share-your-agents

Measuring LLMs' impact on N-day exploits

This research from Anthropic's Frontier Red Team evaluates how LLMs accelerate the development of N-day exploits (vulnerabilities patched in some systems but not all). Across 18 recent Firefox security patches, Claude Mythos Preview autonomously built 8 working code-execution exploits, with the first exploit arriving within an hour. On 21 Windows kernel patches (where source code is unavailable), it produced 8 full privilege escalation chains from low-privilege user to SYSTEM, at an average cost of about $2,000 per exploit. The findings show that models can weaponize patches in hours rather than the weeks historically required, compressing the patch gap dramatically. The post concludes that this shifts the threat landscape, particularly for slow-to-patch systems, and recommends faster patching, memory-safe languages, and stronger mitigations as defensive measures. https://www.anthropic.com/research/n-days

What Israeli dominance in cyber means for non-Israeli cybersecurity founders

This analysis explores Israel's dominance in cybersecurity startups, attributing it to cultural factors like strong founder networks from military service, active early-stage VCs who provide hands-on support, and a flywheel effect of successful exits and reinvestment. However, it also notes challenges facing Israeli startups, including AI reducing technical moats, crowded markets making differentiation harder, and the over-reliance on established playbooks becoming a liability. The piece advises non-Israeli founders that success is still achievable through authentic branding, customer proximity in the US, and embracing different approaches. It concludes that while Israel excels at building companies for acquisition, the largest independent security firms remain American, and the market can support many winners.  https://ventureinsecurity.net/p/what-israeli-dominance-in-cyber-means

GitHub Actions Security Checklist for Supply Chain Attacks

This practical checklist from Corgea provides actionable steps to secure GitHub Actions workflows against supply chain attacks. Key priorities include: setting default GITHUB_TOKEN permissions to read-only, pinning third-party actions to full commit SHAs, avoiding pull_request_target for public repositories, treating all untrusted input (PR titles, issue bodies, branch names) as hostile, and using OIDC instead of long-lived cloud secrets. The full checklist covers locking down organization defaults, making workflow permissions explicit, preventing script injection, reducing secret exposure, hardening runners, securing artifacts and caches, and adding continuous detection with tools like zizmor and OpenSSF Scorecard. The guide emphasizes that workflow YAML is part of the trusted computing base and provides a practical rollout plan for hardening repositories.  https://corgea.com/learn/github-actions-security-checklist
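
One of these checks, pinning third-party actions to full commit SHAs, is easy to automate. The sketch below is not from the checklist and is far simpler than dedicated tools such as zizmor: `unpinned_actions` is a hypothetical helper that scans workflow text with a regular expression instead of parsing YAML, and it skips local and `docker://` references.

```python
import re
from typing import List

# Matches the reference in lines such as "- uses: owner/repo@ref  # comment".
USES_RE = re.compile(r"^\s*-?\s*uses:\s*([^\s#]+)", re.MULTILINE)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> List[str]:
    """Return action references that are not pinned to a 40-character SHA."""
    findings = []
    for ref in USES_RE.findall(workflow_text):
        ref = ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container images are out of scope
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            findings.append(ref)
    return findings
```

Tags and branch names such as `@v4` or `@main` can be moved by whoever controls the action's repository, which is why only a full commit SHA counts as pinned here.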

Securing CI/CD in an agentic world: Claude Code GitHub action case

This Microsoft Threat Intelligence blog post details a vulnerability discovered in Anthropic's Claude Code GitHub Action, where the Read tool could access sensitive /proc files (like /proc/self/environ) and expose workflow secrets, including the ANTHROPIC_API_KEY. The issue arose because the Read tool operated outside the Bubblewrap sandbox used for Bash, and a prompt injection could bypass safety filters and GitHub's secret scanner by laundering the key. Anthropic mitigated this in Claude Code 2.1.128 by blocking access to sensitive /proc files. The post provides actionable hardening guidance, including applying the "Agents Rule of Two" (never combine untrusted input, secret access, and external communication in one workflow), enforcing least privilege, hardening system prompts, and monitoring for suspicious activity. It also maps the attack to MITRE ATLAS techniques and emphasizes that AI workflows processing untrusted content must be treated as high-risk.  https://...
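
The "Agents Rule of Two" as summarized above reduces to a simple predicate over three capabilities. The sketch below is only an illustration: `WorkflowProfile` and its field names are hypothetical, and in practice someone must still classify each workflow correctly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowProfile:
    processes_untrusted_input: bool   # e.g. PR titles, issue bodies
    can_access_secrets: bool          # e.g. API keys in the environment
    can_communicate_externally: bool  # e.g. network egress, posting comments


def violates_rule_of_two(p: WorkflowProfile) -> bool:
    """True when a single workflow combines all three capabilities."""
    return (
        p.processes_untrusted_input
        and p.can_access_secrets
        and p.can_communicate_externally
    )
```

A workflow that fails the check can be split so that the job reading untrusted content holds no secrets, or has no outbound channel.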

How Semgrep Cut Taint Analysis Time by 75%

This blog post details how Semgrep redesigned its taint analysis engine to run once instead of twice, achieving up to 75% faster full scans. The original interfile analysis computed taint configurations twice, costing significant CPU time. By refactoring the code, leveraging OCaml 5.0's multicore support for parallelization, and thoroughly testing against thousands of tests and production benchmarks, the team reduced P95 scan times from 10 to 7.5 minutes, made P99 times more consistent, and dramatically lowered max scan times. Some large repositories saw over 3x speedups. The post highlights the importance of performance profiling, validation through A/B experiments, and the benefits of parallelizing previously sequential work.  https://semgrep.dev/blog/2026/how-we-cut-semgreps-taint-analysis-time-by-75-percent/

Well-architected best practices for software supply chain security

This AWS blog post outlines best practices, aligned with the AWS Well-Architected Framework, to protect against software supply chain attacks like Shai-Hulud. Key recommendations for package consumers include: using temporary credentials and least privilege to limit exposure; implementing defense in depth with multi-factor authentication, multi-party approval workflows, and artifact signing (using AWS Signer) to prevent sprawl; centralizing dependency management with AWS CodeArtifact and Amazon ECR; scanning dependencies throughout the lifecycle with Amazon Inspector and community threat intelligence (including MAL-IDs); and configuring robust logging and monitoring with CloudTrail, GuardDuty, and Security Hub to detect anomalous activity. The post emphasizes layered controls to reduce risk from compromised credentials and malicious packages.  https://aws.amazon.com/pt/blogs/security/well-architected-best-practices-for-software-supply-chain-security

Detecting and removing dangerous secrets on dev workstations

This post presents an open-source approach using Bagel (a workstation secret scanner) and Fleet (an osquery-based platform) to detect and manage plain-text secrets on developer machines. The author's proof-of-concept, "Fleebag," automates scanning via a macOS LaunchAgent, parses results with Fleet queries, and enforces compliance through policies. The goal is to prevent credential theft by infostealers, especially for developers with access to critical projects, and to complement existing endpoint security with proactive, automated detection and remediation. https://recyclebin.zip/posts/2026-05-25-secret-scanning-fleet-bagel

Introducing the Agent Governance Toolkit: Open-source runtime security for AI agents

Microsoft has released the open-source Agent Governance Toolkit (MIT license) to provide runtime security and governance for autonomous AI agents. The toolkit addresses all 10 OWASP Agentic AI Top 10 risks with sub-millisecond policy enforcement. It consists of seven packages that apply proven patterns from operating systems, service meshes, and SRE practices to AI agents, including a stateless policy engine (Agent OS), cryptographic identity and trust scoring (Agent Mesh), dynamic execution rings (Agent Runtime), SLOs and circuit breakers (Agent SRE), compliance verification (Agent Compliance), plugin supply-chain security (Agent Marketplace), and governance for RL training (Agent Lightning). The framework-agnostic toolkit works with LangChain, CrewAI, Microsoft Agent Framework, and others across Python, TypeScript, Rust, Go, and .NET. It is designed for incremental adoption and community stewardship, with over 9,500 tests and SLSA-compliant builds.  https://opensource.microsoft.c...

Mapping AI-Enabled Cyber Threats: Insights from the LLM ATT&CK Navigator

This research report from Anthropic's Frontier Red Team analyzes 832 banned accounts over one year to map how threat actors misuse AI for cyber operations. Key findings include: the percentage of medium- to high-risk actors jumped from 33% to 56% in under a year, with growth concentrated in harmful activities like lateral movement and credential dumping; agentic scaffolding enables more autonomous, dangerous attacks, as seen in a cyber espionage campaign that achieved a maximum risk score despite using a comparable number of techniques to lower-risk actors; and the MITRE ATT&CK framework lacks categories for autonomous orchestration behaviors. The report introduces the AI Risk Enablement Score (ARiES) and the LLM ATT&CK Navigator to score actors. It concludes that defenders must evolve threat vocabularies to capture agentic behaviors and use AI with the same urgency as attackers.  https://www.anthropic.com/research/attack-navigator

CrowdStrike Shadow AI Visibility Service: Gaining Full Control Over Enterprise AI Usage

The article introduces CrowdStrike’s Shadow AI Visibility Service, designed to help organizations detect, inventory, and govern unapproved AI usage across endpoints, cloud, and SaaS environments. It highlights how rapidly AI adoption is outpacing traditional security visibility, creating blind spots where employees and systems use AI tools without oversight. The service leverages Falcon telemetry to uncover hidden AI apps, agents, and extensions, provides evidence of real AI interactions (including prompts and outputs), and prioritizes risks to reduce exposure and improve governance of enterprise AI adoption.  https://www.crowdstrike.com/en-us/blog/crowdstrike-shadow-AI-visibility-service/

Phoenix Blue: Agentic Vulnerability Intelligence for Real-Time Cyber Risk Analysis

Phoenix Blue is an AI-driven vulnerability intelligence platform that enhances traditional CVE management with automated analysis, risk scoring, and threat intelligence. The platform aggregates vulnerability data from multiple sources, uses AI models to classify risks, detects emerging threats before formal disclosure, and monitors malicious software packages across major ecosystems. It aims to help security teams prioritize remediation based on real-world exploitability rather than vulnerability severity alone.  https://phxintel.security/

Identity and Access Management Whitepaper (CNCF)

This whitepaper from the CNCF TAG Security and Compliance provides practical guidance on implementing Identity and Access Management (IAM) in cloud native environments. As distributed, dynamic architectures make identity the new security perimeter, the paper covers modern authentication for users and workloads, zero-trust architectures, authorization best practices using PEP/PDP patterns, and the role of SPIFFE for secure workload identity. It offers reference patterns and implementation advice for architects, platform engineers, and security practitioners to build secure and scalable cloud native systems.  https://www.cncf.io/blog/2026/06/04/identity-and-access-management-whitepaper
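
The PEP/PDP split can be shown in miniature. This is a generic sketch of the pattern, not taken from the whitepaper: `pdp_decide`, `pep_enforce`, the in-memory policy set, and the example SPIFFE ID under `example.org` are all illustrative assumptions.

```python
from typing import Callable

# Hypothetical policy: (workload identity, action, resource) tuples.
POLICY = {("spiffe://example.org/billing", "read", "invoices")}


def pdp_decide(subject: str, action: str, resource: str) -> bool:
    """Policy Decision Point: answers allow or deny, and does nothing else."""
    return (subject, action, resource) in POLICY


def pep_enforce(subject: str, action: str, resource: str,
                handler: Callable[[], str],
                decide: Callable[[str, str, str], bool] = pdp_decide) -> str:
    """Policy Enforcement Point: asks the PDP, then runs or blocks the call."""
    if not decide(subject, action, resource):
        raise PermissionError(f"{subject} may not {action} {resource}")
    return handler()
```

Keeping the decision function separate means policy can change, or move to an external service, without touching the enforcement code in each workload.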

The Anatomy of an Agent Harness

This blog post defines an "agent harness" as every piece of code, configuration, and execution logic that surrounds a model to turn it into a useful agent (Agent = Model + Harness). It argues that harness engineering is key to enabling desired agent behaviors by providing durable storage (filesystem), general-purpose tools (bash/code execution), safe sandboxes, memory and search capabilities, and strategies to combat "context rot" (like compaction and skill loading). The post details how these components compound to support long-horizon, autonomous execution through planning, self-verification, and tools like git. It concludes by discussing the future coupling of model training and harness design, noting that while models will absorb some harness functions, optimising the harness remains a powerful way to improve agent performance.  https://www.langchain.com/blog/the-anatomy-of-an-agent-harness

Using LLMs to Secure Source Code - Best Practices from Anthropic

This guide from Anthropic shares best practices for using LLMs like Claude Opus to build threat models, discover vulnerabilities, and then verify, triage, and patch them. It outlines a six-step find-and-fix loop: 1) Define a threat model to establish trust boundaries and scope; 2) Build a sandbox environment for safe agent execution and proof-of-concept verification; 3) Run parallel discovery agents with rich context and simple prompts; 4) Use independent verifier agents to filter out non-exploitable findings; 5) Triage by deduplicating findings and ranking by severity based on reachability and impact; and 6) Patch by writing tests, fixing root causes, and validating fixes. The key takeaway is that discovery is now easily parallelizable, shifting the bottleneck to verification, triage, and patching, which can be streamlined with structured workflows, independent verification, and automated patch validation.  https://claude.com/blog/using-llms-to-secure-source-code
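
Step 5, triage, is the most mechanical of the six and can be sketched directly. This is an assumption-based illustration and not Anthropic's tooling: `Finding`, the `(file, sink)` deduplication key, and the 1 to 3 impact scale are hypothetical choices.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Finding:
    file: str
    sink: str        # e.g. the vulnerable function or call site
    reachable: bool  # whether a verifier confirmed attacker reachability
    impact: int      # 1 (low) to 3 (high), a hypothetical scale


def triage(findings: Iterable[Finding]) -> List[Finding]:
    """Deduplicate by (file, sink), then rank by reachability and impact."""
    unique = {}
    for f in findings:
        key = (f.file, f.sink)
        best = unique.get(key)
        # Keep the strongest report when agents flag the same spot twice.
        if best is None or (f.reachable, f.impact) > (best.reachable, best.impact):
            unique[key] = f
    return sorted(unique.values(),
                  key=lambda f: (f.reachable, f.impact), reverse=True)
```

Because parallel discovery agents often report the same issue several times, deduplicating before ranking keeps the patch queue proportional to distinct root causes.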

MIT AI Risk Repository - Authoritative Data and Frameworks for AI Risk Management

The MIT AI Risk Repository provides authoritative data, frameworks, and tools to help organizations identify, prioritize, and manage AI risks. It features a comprehensive living repository of AI risks, a database of real-world AI incidents mapped by severity and impact, and a global mapping of AI laws and policies against specific risk domains. The AI Risk Navigator connects these datasets for integrated exploration. The platform is used by governments, industry leaders, and academics worldwide, offering evidence-based resources to support risk assessment, governance, and policy development across the AI value chain.  https://airisk.mit.edu/

HTTP/2 Bomb Attacks Put Telcos, Healthcare Orgs at Risk

This article details the "HTTP/2 Bomb" vulnerability (CVE-2026-49975), a high-severity denial-of-service (DoS) exploit discovered via AI that chains together two HTTP/2 features—HPACK header compression and flow control—to create massive amplification attacks. An attacker with minimal resources can overwhelm vulnerable servers (including nginx, Apache, Envoy, and Microsoft IIS) by sending small requests that force the server to expand memory usage while blocking responses. While patches are available from most vendors, over 880,000 websites remain potentially vulnerable. The exploit disproportionately impacts industries with large web footprints, particularly telecommunications (25% of vulnerable servers), IT (18%), and healthcare (17%). Organizations are urged to patch immediately to mitigate the risk.  https://www.darkreading.com/vulnerabilities-threats/http-2-bomb-attacks-telcos-healthcare

The Beginning of the End of Social Engineering

This opinion piece argues that AI-native operating systems, like those being integrated by Google and Apple, could fundamentally change the fight against social engineering. It explains that social engineering has historically succeeded due to three weaknesses: the burden of authentication on users, the lack of cross-context understanding in systems, and the speed that forces quick user decisions. By operating across all apps and data, these new OS-level AIs can continuously authenticate users, detect coordinated manipulation attempts in real time, and intervene during or after an attack. This shifts the responsibility from user vigilance to system vigilance, potentially making social engineering attacks more costly and complex, similar to how widespread antivirus changed the economics of computer viruses.  https://www.darkreading.com/cyberattacks-data-breaches/beginning-end-social-engineering

SkillsGuard - Static Security Scanner for AI Agent Skill Packages

SkillsGuard is a static security scanner that detects malicious AI agent skill packages (SKILL.md files and bundled scripts) before they execute. With 151 regex-based detection rules across 15 categories—including prompt injection, command injection, exfiltration, and obfuscation—it decodes base64, hex, and URL-encoded payloads recursively to uncover hidden threats. It offers a CLI, MCP server integration for Claude, pre-commit hooks, a free cloud API, and outputs JSON or SARIF for CI/CD pipelines, all with zero runtime dependencies beyond Node.js.  https://github.com/Teycir/SkillsGuard
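
The recursive decoding step can be sketched as follows. This is a minimal Python illustration of the idea, not SkillsGuard's own code (the tool itself runs on Node.js). The two rules, the token patterns, and the function names `decoded_layers` and `scan` are assumptions made for the example.

```python
import base64
import binascii
import re
from urllib.parse import unquote

# Hypothetical rules for illustration; not SkillsGuard's actual rule set.
RULES = {
    "command_injection": re.compile(r"curl\s+[^\n|]*\|\s*(ba)?sh"),
    "prompt_injection": re.compile(r"ignore (all )?previous instructions", re.I),
}

# Candidate encoded tokens: long base64-looking or hex-looking runs.
B64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_TOKEN = re.compile(r"(?:[0-9a-fA-F]{2}){8,}")


def _try_base64(token):
    """Decode a base64 token to text, or return None if it is not valid."""
    try:
        return base64.b64decode(token, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        return None


def _try_hex(token):
    """Decode a hex token to text, or return None if it is not valid."""
    try:
        return bytes.fromhex(token).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return None


def decoded_layers(text, max_depth=5):
    """Return the text plus every payload recovered by peeling encodings."""
    seen, queue = {text}, [(text, 0)]
    while queue:
        current, depth = queue.pop()
        if depth >= max_depth:
            continue  # bound the recursion so nested payloads cannot loop forever
        candidates = [unquote(current)]
        candidates += [_try_base64(t) for t in B64_TOKEN.findall(current)]
        candidates += [_try_hex(t) for t in HEX_TOKEN.findall(current)]
        for candidate in candidates:
            if candidate and candidate not in seen:
                seen.add(candidate)
                queue.append((candidate, depth + 1))
    return seen


def scan(text):
    """Run every rule against every decoded layer and return matching rule names."""
    return sorted(
        {name for layer in decoded_layers(text)
         for name, rule in RULES.items() if rule.search(layer)}
    )
```

A command hidden as hex inside base64 is only visible to the rules after both layers are peeled, which is why the scan runs over every decoded layer rather than the raw file alone.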

OWASP Secure Pipeline Verification Standard (SPVS)

The OWASP SPVS is a framework that integrates security across the entire software delivery lifecycle—Plan, Develop, Integrate, Release, and Operate. It provides a tiered maturity model with actionable controls to secure code, artifacts, and build environments. Adaptable to cloud, hybrid, and on-premises setups, it helps organizations progressively improve pipeline security, ensure compliance, and embed a security-first culture within DevSecOps practices. https://github.com/OWASP/www-project-spvs

Mapping Application Vulnerabilities to MITRE ATT&CK for Threat-Based Risk Management

The article explains how linking application vulnerabilities to MITRE ATT&CK techniques helps organizations move beyond CVSS-based prioritization and understand real attacker behavior. By mapping vulnerabilities to exploitation methods, security teams can connect AppSec findings with threat intelligence, detection rules, and defensive controls. This approach improves vulnerability prioritization, strengthens collaboration between developers and SOC teams, and enables a more threat-informed cybersecurity strategy.  https://securityboulevard.com/2026/06/mapping-application-vulnerabilities-to-mitre-attck/

Visa Vulnerability Agentic Harness (VVAH) - Agentic SAST Pipeline

Visa Vulnerability Agentic Harness (VVAH) is an open-source tool from Visa that uses frontier AI models for autonomous vulnerability discovery in code. Built on lessons from Anthropic's Project Glasswing, it employs a three-phase, nine-stage pipeline that combines threat modeling, multi-agent deterministic voting to reduce false positives, and structured triage to accelerate the path from discovery to fix. The tool supports multiple AI backends (Anthropic Claude, OpenAI) and is designed to be configurable via reusable "skills" for each pipeline stage. It outputs findings in both markdown reports and SARIF format. While findings are AI-generated and require human review, the tool aims to improve the Mean Time to Adapt (MTTA) for security fixes. The project is not accepting external contributions and is intended for authorized use only on owned or permitted code. https://github.com/visa/visa-vulnerability-agentic-harness

AI Deep SAST - LLM-powered deep static analysis for CI/CD

AI Deep SAST is an open-source tool from Cisco that combines traditional static analysis (Semgrep) with LLM-based vulnerability detection for CI/CD pipelines. It offers two scan modes: **fast scan** (Semgrep + local Foundation-Sec-8B model, ~5 min) and **deep scan** (tree-sitter indexing + frontier LLMs like GPT-4o or Claude, ~30 min–14 hr). Features include OWASP Top 10 mapping, CWE mapping, CVSS scoring, attack vectors, remediation code, and defence-in-depth recommendations. The tool uses smart LLM skipping for deterministic rules, severity-based filtering, and multiple report formats (Markdown, JSON, JUnit XML). It includes custom secret detection for config files, supports 15 programming languages via tree-sitter, and provides a Jenkins CI/CD pipeline with quality gates. The local fast scan keeps code on-premises (no external API calls), while deep scan sends redacted code to configured LLM providers. Optimised for Apple Silicon with Metal GPU acceleration, it requires ~16 GB RAM a...

Jinn Guard — Enterprise Semantic Firewall

Jinn Guard is an **asynchronous, kernel-aware semantic firewall** that enforces mathematical safety constraints on autonomous AI agents before any tool executes. It intercepts agent intents and validates them through a **Z3 SMT solver pipeline**, checking state transitions and risk ceilings against formalized compliance models. Built for AlphaOS, it operates over UNIX domain sockets and integrates **eBPF kernel telemetry** for zero-trust isolation and anti-replay protection. Key features include HMAC-SHA256 authentication, SO_PEERCRED process identity, per-agent intent allowlists, sequence quotas, replay attack protection, behavioral drift detection, and a hash-chained audit log. Performance benchmarks show ~6,500 decisions/second with median latency of 257 µs. The system blocks 12 attack types (replay, signature forgery, injection, etc.) with zero fail-open. It includes a Python SDK, systemd service, and installer. The repository is a **validated research prototype** (not enterprise-G...
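
A hash-chained audit log of the kind mentioned above can be sketched in a few lines of Python. This is a generic illustration of the technique, not Jinn Guard's implementation; the record fields, the all-zero genesis value, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def _digest(prev_hash, record):
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(log, record):
    """Append a record, linking it to the hash of the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev, "hash": _digest(prev, record)})


def verify(log):
    """Recompute the chain; any edited, removed, or reordered entry breaks it."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True
```

Because each hash covers the previous one, changing an old decision after the fact invalidates every later entry unless the whole tail of the log is rewritten.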

AI Agent Security Hits Its Reckoning: Prompt Injection May Be a Permanent Flaw, Not a Patchable Bug

This article argues that prompt injection in LLM-based agents is a **structural, unpatchable flaw** rather than a temporary bug. Citing OWASP’s June 2026 State of Agentic AI Security report, it explains that language models cannot distinguish trusted commands from untrusted data because all inputs are processed as a single token stream—no architectural privilege boundary exists. The piece highlights real incidents: an autonomous bot (“hackerbot-claw”) poisoning PyPI with backdoored LiteLLM (47,000 downloads) and CVEs like CVE-2026-2256 (MS-Agent RCE), CVE-2026-22708 (Cursor), and malicious MCP servers. It introduces **Simon Willison’s “lethal trifecta”** (private data access + untrusted content exposure + external communication) as the condition enabling data exfiltration, and **Meta’s “Agents Rule of Two”** (an unsupervised agent may hold at most two of three). Defenses are containment-based (least privilege, human-in-the-loop, strict scoping), not cures. Regulatory pressure (DORA, NI...
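
The "Rule of Two" as summarized above reduces to a simple count over three capabilities. The sketch below is an assumption-laden illustration: the class and function names are hypothetical, and a real deployment would derive these flags from the agent's actual tool and data grants.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """The three legs of the lethal trifecta, as described in the summary."""
    private_data_access: bool
    untrusted_content_exposure: bool
    external_communication: bool


def trifecta_count(caps):
    """Count how many of the three risky capabilities the agent holds."""
    return sum(
        (caps.private_data_access,
         caps.untrusted_content_exposure,
         caps.external_communication)
    )


def may_run_unsupervised(caps):
    """An unsupervised agent may hold at most two of the three capabilities."""
    return trifecta_count(caps) <= 2
```

An agent that reads private mail, browses arbitrary web pages, and can send outbound requests holds all three, so under this rule it would need a human in the loop or one capability removed.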

elastic/cicd-abuse-detector: CI/CD Abuse Detection

This GitHub repository hosts a **prototype CI/CD abuse detector** from Elastic Security Labs. It provides drop-in CI templates that use an LLM (Claude) to detect suspicious changes to pipelines, workflows, and automation configurations – specifically targeting attacks where stolen credentials are used to modify workflows and harvest CI secrets. The detector works by filtering changed CI/CD files, generating per-file diffs, enriching them with regex-based prescreen labels, having an LLM analyze the diff for credential-harvesting threats, then alerting (Slack, issues, Elasticsearch) and optionally failing the PR based on severity thresholds. It includes reference templates for GitHub Actions, GitLab CI, and Azure DevOps. The repository is **not an officially supported Elastic product** – users are expected to fork and customize the templates, prompts, and schemas for their own environment. Documentation covers architecture, threat model, setup per platform, alerting, and testing.  ht...
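
The first two stages described above, filtering changed CI/CD files and attaching regex-based prescreen labels to a diff, might look roughly like this in Python. The path patterns, label names, and regexes are hypothetical and are not taken from the Elastic templates; the LLM analysis and alerting stages are omitted.

```python
import fnmatch
import re

# Assumed path patterns for pipeline definitions on the three platforms.
CI_PATTERNS = [".github/workflows/*", ".gitlab-ci.yml", "azure-pipelines.yml"]

# Hypothetical prescreen labels; a real detector would use a richer set.
PRESCREEN = {
    "secret_reference": re.compile(r"\$\{\{\s*secrets\."),
    "network_egress": re.compile(r"\b(curl|wget)\b"),
    "encoding": re.compile(r"\bbase64\b"),
}


def filter_ci_files(changed_paths):
    """Keep only the changed files that define pipelines or workflows."""
    return [
        path for path in changed_paths
        if any(fnmatch.fnmatch(path, pattern) for pattern in CI_PATTERNS)
    ]


def prescreen_labels(diff_text):
    """Label a unified diff by matching rules against added lines only."""
    added = [
        line[1:] for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    return sorted(
        {label for line in added
         for label, rule in PRESCREEN.items() if rule.search(line)}
    )
```

The labels are cheap hints that travel with the diff to the model; they narrow its attention but do not replace its judgment about whether a change actually harvests credentials.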

Policy as Code: From Documents to Machine Intelligence

This blog post argues that traditional static policy documents cannot keep pace with modern multi-cloud, ephemeral, and continuous deployment environments. It presents **Policy as Code (PaC)** as a discipline that transforms policies into machine-readable, version-controlled, continuously enforced and auditable rules. PaC operates across three areas: modernizing policies, embedding validation into development/operations, and enabling continuous assurance. Key enablers include **OSCAL** (for machine-readable control definitions, profiles, and system plans) and **Open Policy Agent (OPA)** (for enforcement using Rego rules). The Compliance-to-Policy (C2P) bridge helps convert existing OSCAL artifacts into enforcement formats. A worked example (MFA for privileged accounts) traces a control from OSCAL catalog through OPA enforcement to evidence generation. The post concludes that **agentic AI** can accelerate PaC adoption by automating policy translation, rule testing, and remediation triag...
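
The MFA-for-privileged-accounts example can be approximated in Python to show the shape of a machine-checkable control. The post itself uses OPA with Rego for enforcement; this stand-in only illustrates the rule-plus-evidence pattern, and the control identifier and account fields are assumptions.

```python
def evaluate_mfa_control(accounts):
    """Check that every privileged account has MFA enabled and emit evidence."""
    violations = [
        account["name"] for account in accounts
        if account.get("privileged") and not account.get("mfa_enabled")
    ]
    return {
        "control": "mfa-privileged-accounts",  # hypothetical control identifier
        "compliant": not violations,
        "violations": violations,
    }
```

Running the same function on every change and storing its output gives both the enforcement decision and the audit evidence from a single source.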

Infosys completes CMMI AI maturity pilot assessment

Infosys has completed the CMMI AI Maturity pilot assessment, becoming one of the first organizations globally to do so. Conducted by the CMMI Institute with support from KPMG, the assessment evaluated how large enterprises govern and apply AI across business and engineering environments. The pilot focused on AI-augmented software development, maintenance, testing, and support, assessing productivity, quality, governance, and responsible AI practices. Infosys contributed real-world insights from its large-scale delivery operations (including Infosys Topaz tools) to help refine the framework for enterprise use. The model addresses alignment with business outcomes, consistency, risk management, and accountability in AI-driven decisions. Executives highlighted the milestone as defining responsible, enterprise-grade AI adoption at scale.  https://securitybrief.co.nz/story/infosys-completes-cmmi-ai-maturity-pilot-assessment

LiteLLM Flaw CVE-2026-42271 Exploited in the Wild, Chains to Unauthenticated RCE

CISA added a high-severity command injection flaw (CVE-2026-42271, CVSS 8.7) in BerriAI LiteLLM to its KEV catalog due to active exploitation. The vulnerability allows any authenticated user to execute arbitrary commands via the `/mcp-rest/test/connection` and `/mcp-rest/test/tools/list` endpoints. Security researchers chained it with a Starlette host header validation bypass (CVE-2026-48710, CVSS 6.5) to achieve unauthenticated remote code execution (combined CVSS 10.0). This chain enables attackers to run commands, steal API keys and secrets, move laterally, and compromise downstream systems. Users should update LiteLLM to version 1.83.7+ and Starlette to 1.0.1+, block the affected endpoints, restrict network access, rotate credentials, and review logs.  https://thehackernews.com/2026/06/litellm-flaw-cve-2026-42271-exploited.html

LangGraph Flaw Chain Exposes Self-Hosted AI Agents to Remote Code Execution

Security researchers disclosed three patched flaws in LangGraph, including a critical chain that enables remote code execution (RCE). The flaws are SQL injection (CVE-2025-67644), unsafe msgpack deserialization (CVE-2026-28277), and RediSearch query injection (CVE-2026-27022). Exploitation requires attacker-controlled filter input in self-hosted deployments that use SQLite or Redis checkpoints, and leads to RCE via the `get_state_history()` call. The managed LangSmith platform is unaffected. Users are advised to apply fixes, enable authentication, enforce network segmentation, and follow least-privilege principles. https://thehackernews.com/2026/06/langgraph-flaw-chain-exposes-self.html

npm-scan — npm supply chain security scanner

npm-scan detects obfuscated payloads, credential stealers, conditional triggers, sandbox evasion, and worm propagation that npm audit, Snyk, and Socket miss. It includes detection for major 2026 campaigns (Megalodon, Mini Shai-Hulud, TrapDoor, node-ipc, typosquatting, axios poisoning), plus HuggingFace impersonation, VSIX extensions, and Python CVE-2026-48710. Features: SBOM, SARIF, policy-as-code, HTML/PDF reports, Docker, GitHub Action, zero telemetry. Free tier includes all detectors; premium adds PDF and SIEM export.  https://github.com/lateos-ai/npm-scan

LLMjacking: what these attacks are, and how to protect AI servers

This article describes LLMjacking, a rapidly growing threat where attackers hijack private AI server resources to run their own prompts and tasks, avoiding compute costs. Based on a honeypot experiment with a Raspberry Pi masquerading as a high-performance AI server running Ollama, LM Studio, and MCP tools, the researcher observed that Shodan discovered the server within three hours, and over one month it received 113,000 requests from thousands of unique IPs. 23% of traffic targeted AI capability discovery and exploitation. Attackers did not attempt root access or code execution; instead, they focused on resource siphoning: parsing technical documentation, writing erotic novels, processing social media data, and using the compromised server as an API proxy to call Anthropic models. The article notes standardized reconnaissance tools (LLM-Scanner) that evolved during the experiment, plus systematic hunting for exposed .env files. Defensive measures include: binding LLM servers only to ...

PromptZero — Transparent Claude API proxy that anonymizes PII before it leaves your environment

PromptZero is a local proxy that detects and replaces sensitive data (IPs, hostnames, emails, credentials, names, national IDs, etc.) in prompts sent to the Claude API, then restores the real values in responses. It uses NLP (spaCy/Presidio) and regex patterns, substitutes values from IANA-reserved ranges (RFC 5737/3849/2606), maintains session mapping tables, and supports a pentest mode that disables name/organization detection. It runs via Docker or a native install, works as a drop-in replacement for api.anthropic.com, and can route the Claude Code CLI through it. It includes demo datasets plus document summarization and pentest report generator examples. From pentesters to pentesters. MIT license.  https://github.com/openbashok/promptzero
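
The substitute-then-restore round trip can be sketched for IPv4 addresses alone. This is not PromptZero's code: the `Session` class is hypothetical, only the RFC 5737 TEST-NET-1 block (192.0.2.0/24) is used, and the sketch ignores the case where a real address already falls inside that block.

```python
import re

IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class Session:
    """Per-session mapping between real IPs and RFC 5737 documentation IPs."""

    def __init__(self):
        self.forward = {}  # real address -> placeholder
        self.reverse = {}  # placeholder -> real address

    def _placeholder(self, real):
        if real not in self.forward:
            index = len(self.forward) + 1
            if index > 254:
                raise ValueError("TEST-NET-1 placeholder range exhausted")
            fake = f"192.0.2.{index}"
            self.forward[real] = fake
            self.reverse[fake] = real
        return self.forward[real]

    def anonymize(self, prompt):
        """Replace every IPv4 address before the prompt leaves the machine."""
        return IPV4.sub(lambda m: self._placeholder(m.group(0)), prompt)

    def restore(self, response):
        """Map placeholders in the model's response back to the real values."""
        return IPV4.sub(
            lambda m: self.reverse.get(m.group(0), m.group(0)), response
        )
```

Keeping the mapping stable within a session means the same host always gets the same placeholder, so the model can still reason about relationships between hosts without seeing the real addresses.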

NomShub: Weaponizing Cursor's Remote Tunnel Through Indirect Prompt Injection and Sandbox Breakout

This article discloses NomShub, a critical vulnerability chain in the Cursor AI code editor that allows a malicious repository to silently hijack a developer's machine with no user interaction beyond opening the repository. The attack combines three elements: indirect prompt injection (malicious instructions hidden in a README file), a sandbox escape via shell builtins (Cursor's command parser is blind to commands like export and cd, allowing escape from workspace restrictions), and Cursor's built-in remote tunnel feature (cursor-tunnel) which provides authenticated shell access through Microsoft's Dev Tunnels infrastructure. The AI agent autonomously executes a multi-step chain: escaping the sandbox using a one-line command, establishing persistence by writing to ~/.zshenv, terminating existing tunnel processes, clearing cached GitHub credentials, starting a new tunnel, capturing the GitHub device authorization code, and exfiltrating it to an attacker-controlled server...

From Exploit Code to Production Detection: Building a CVE-2026-31431 (Copy Fail) detection with Agents

This article details CVE-2026-31431 (Copy Fail), a high-severity Linux kernel vulnerability (CVSS 7.8) that allows any unprivileged local user to corrupt page cache memory and escalate privileges to root. The exploit chains three kernel mechanisms: AF_ALG sockets (exposing kernel crypto to unprivileged users), the authencesn AEAD template, and splice() for zero-copy data movement. By splicing a readable target file (e.g., a setuid binary like /usr/bin/su or PAM configuration files) into a crafted AF_ALG decrypt operation, the attacker can write controlled bytes directly into the file's page cache without touching the on-disk file, avoiding normal file-write detection. The corruption persists only in memory, and when the corrupted setuid binary executes, the attacker gains root privileges. The vulnerability affects kernel versions 4.14 through 6.19 and 7.0 RCs, and active exploitation has been confirmed in the wild. Datadog's detection uses chained Workload Protection rules that...

Skill Issues: Compromising Claude Code with malicious skills & agents -- Part 1

This technical blog post demonstrates how attackers can compromise Claude Code, Anthropic's AI coding assistant, through malicious skill files and sub-agents. Skills are markdown files that instruct LLMs on how to perform specific tasks, and thousands of users share them on GitHub and skills.sh without proper vetting. The author shows that with default settings, a skill containing frontmatter with "allowed-tools: Bash(*)" and a dynamic context command (using !`command`) can execute arbitrary bash commands, including a reverse shell, without any user prompt or LLM reasoning. Sub-agents, which can run with "bypassPermissions" mode, can also execute malicious commands, such as installing a backdoored npm package. The article notes that Claude Code has complex permission and command-parsing logic and that the LLM itself may reject obviously malicious commands, but dynamic context inputs bypass this reasoning entirely. Defensive measures include denying Bash commands i...

Claude Code has an MCP security problem — and your developers are already using it

This opinion piece warns that Anthropic's AI coding assistant, Claude Code, has a critical security vulnerability involving the Model Context Protocol (MCP). Researchers at Mitiga Labs demonstrated an attack chain where a malicious npm package with a post-install hook rewrites a single configuration file (~/.claude.json), which controls how Claude Code routes MCP traffic. This redirects authenticated requests and OAuth tokens (stored in plaintext) to attacker-controlled infrastructure instead of legitimate services like Jira, Confluence, or GitHub. The attacker then holds valid long-lived bearer tokens. The attack is difficult to detect because provider audit logs show Anthropic’s IP range and a valid user session: nothing appears wrong, yet the user did not initiate the actions. Anthropic responded that the issue was out of scope, reasoning that the attack depends on prior code execution, which requires the user to consent to installing the package; as of this writing, no patch exists. The article notes previous vu...

The Intersection of Encryption and AI - Schneier on Security

In this reflective piece from June 2026, Bruce Schneier revisits his 2010 argument that cryptography is ill-suited to solve major network security problems. He explains that while cryptography has inherent mathematical properties favoring defenders—such as key length increases benefiting defenders more than attackers—computer security as a whole is a fragile, fast-moving arms race where advantages can shift overnight. Schneier notes that cryptography is necessary but not sufficient for cybersecurity, as it must be implemented in software, hardware, networks, and operated by users, each step introducing vulnerabilities. Turning to AI, he observes that artificial intelligence is not advancing cryptography but is changing cybersecurity dramatically. AI has demonstrated superhuman ability to find software vulnerabilities and write exploits, with similar patch-writing capabilities likely emerging. This development has profound implications for both attackers and defenders, and Schneier conc...

Corporate Insiders and How They Operate

This article explains that some of the most damaging threats to a company come not from external hackers but from insiders—people already inside the organization with legitimate access. An insider can be an employee, contractor, vendor, or former worker whose access was never removed. Threats fall into three categories: malicious insiders who intentionally steal or sabotage for money, revenge, or ideology; negligent insiders who cause harm through carelessness like clicking phishing links or sharing passwords; and compromised insiders whose accounts are taken over by external attackers. Modern insiders often exploit cloud services, collaboration tools, and remote work environments, gradually moving small amounts of data to avoid detection. Motivations include financial gain, recruitment by organized crime, or workplace disputes. Detection requires analyzing system logs, access patterns, and behavioral changes, while prevention relies on least-privilege access, continuous monitoring, au...

AI Risk Quadrant for Agent Security – AIRQ Report 2026

This report by Adversa AI introduces the AI Risk Quadrant (AIRQ) Framework, a quantitative security framework for evaluating AI agents across 10 enterprise archetypes (e.g., coding, browser, workflow, business process agents). Based on scoring 100 agents on Attack Surface, Blast Radius, and Defense Controls, the findings reveal that: only 11 percent of agents are both capable and well-defended (Fortified Leaders); 40 percent of agents fall into Exposed Giants (high capability, weak defenses); the lethal trifecta (private data access plus untrusted input plus outbound action) is nearly universal, meaning one hostile document can compromise most agents; 83 percent of claimed defenses lack public verification; and tool execution without sandboxing explains 76 percent of blast radius variance. The report provides quadrant visualizations, class-by-class security deep-dives, and strategic advice including requiring execution isolation as a procurement gate, tightening identity and egress con...