auto-audit

Autonomous security auditor for Claude Code

Point auto-audit at a GitHub repo. It scans for vulnerabilities, triages false positives, writes a proof of concept, fixes each confirmed bug in its own PR, independently reviews the fix, and merges when the review is clean. It keeps doing that until the queue is drained, then rescans, until you stop it.

Security v0.10.0 MIT

audited by auto-audit

claude plugin install auto-audit@wrxck-claude-plugins

Part of the Claude Code Plugins collection.

How a tick flows

Each /auto-audit:tick advances one finding by one lifecycle stage, then returns. The /loop wrapper reinvokes the tick until the queue drains. One stage per tick means the independent-review checkpoint is a real checkpoint, and the loop is cheap to interrupt.

discovered      → triaging      (security-triage subagent)
triaging        → confirmed  |  false_positive
confirmed       → poc_written  (poc-builder subagent)
poc_written     → fix_committed (security-fixer subagent; new branch, commit)
fix_committed   → pr_opened    (branch push + gh pr create)
pr_opened       → reviewing    (security-reviewer subagent, independent context)
reviewing       → pr_approved  |  pr_rejected
pr_approved     → merged       (squash; if merge_policy=auto)
pr_approved     → skipped      (if merge_policy=manual — default)
pr_rejected     → confirmed    (fixer retries, bounded by max_fix_iterations)
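The edges above can be sketched as an allow-list check. This is a minimal illustration, not the plugin's guard_status_transition: the function name transition_allowed is hypothetical, it ignores merge_policy, and it encodes only the edges listed in the table (the failed status mentioned elsewhere is left out).

```shell
# Hypothetical helper: returns 0 if moving a finding from status $1 to
# status $2 is an edge in the lifecycle table, 1 otherwise.
transition_allowed() {
  case "$1:$2" in
    discovered:triaging) return 0 ;;
    triaging:confirmed|triaging:false_positive) return 0 ;;
    confirmed:poc_written) return 0 ;;
    poc_written:fix_committed) return 0 ;;
    fix_committed:pr_opened) return 0 ;;
    pr_opened:reviewing) return 0 ;;
    reviewing:pr_approved|reviewing:pr_rejected) return 0 ;;
    pr_approved:merged|pr_approved:skipped) return 0 ;;
    pr_rejected:confirmed) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: an edge that skips the review stage is refused.
if transition_allowed pr_opened merged; then
  echo "allowed"
else
  echo "refused: pr_opened -> merged"
fi
```

Because every unlisted pair falls through to the default case, a new status has to be added to the list explicitly before any transition into it is accepted.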

Safety model — two layers

Every safety claim is enforced in two places: an instruction in the relevant agent's role card and a programmatic guard that runs before the action commits and refuses if violated. The LLM layer covers judgement calls; the programmatic layer covers everything mechanically checkable.

We do not claim 100% safety. An LLM is not a hardened security boundary, and some properties genuinely cannot be mechanically verified. What the programmatic layer does guarantee is that anything expressible as "refuse if input is not in the allowed set" stays refused. The guard test suite (scripts/test-guards.sh) exercises every guard — currently 79/79 passing.
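As an illustration of "refuse if input is not in the allowed set", a branch guard might look like the sketch below. The function name guard_branch_sketch and the branch name in the usage lines are hypothetical; the real guards live in the plugin's scripts and may differ.

```shell
# Hypothetical sketch of an allow-list guard: succeed only for branch
# names under autoaudit/, refuse everything else (including empty input).
guard_branch_sketch() {
  case "$1" in
    autoaudit/?*) return 0 ;;
    *) echo "refusing: '$1' is not an autoaudit/* branch" >&2; return 1 ;;
  esac
}

# Usage: the caller acts only when the guard succeeds.
guard_branch_sketch "autoaudit/fix-123" && echo "push allowed"
guard_branch_sketch "main" || echo "push refused"
```

The guard never tries to enumerate dangerous branches; it names the one acceptable shape and rejects the rest.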

Claim | LLM-layer instruction | Programmatic guard
Never push to the default branch | fixer: "never touch the default branch locally" | guard_autoaudit_branch in push_branch; commit_all refuses non-autoaudit/* HEAD; guard_not_default_branch
Never force-push outside autoaudit/* |  | guard_autoaudit_branch + --force-with-lease only
Reviewer is independent of the fixer's reasoning | reviewer: "do not fetch triager/fixer reasoning" | PR body strips .triage and .fix.diff_summary; guard_pr_body_clean dies on leakage; guard_commit_msg_clean on the commit body
Fix diff is minimal | fixer + reviewer: "minimal diff, no refactor" | guard_max_files_changed (default 5), guard_max_lines_changed (default 400); env-tunable
PoCs never land in commits | poc-builder: "PoCs live outside the workspace" | guard_poc_outside_workspace + guard_no_poc_in_diff
PoCs don't perform network I/O | poc-builder: "never write a PoC that performs a live network request" | guard_poc_no_network static scan for curl/wget/requests/fetch against non-loopback hosts
Scraped-repo tests are sandboxed | fixer: "route every invocation through run_sandboxed" | sandbox.sh via podman / docker / bwrap; sandbox_mode=strict (default) refuses unsandboxed
Secrets aren't committed | fixer: "never bypass pre-commit hooks" | guard_no_secrets_in_diff pattern-scans added lines (AKIA, ghp_, sk-ant-, PEM headers, …)
Credential comparisons must SHA3-256 hash-then-compare | triager / fixer / reviewer: hash both sides of a credential/MAC/signature comparison first; constant-time primitives on raw secrets are themselves a known-vulnerable posture — only hashing destroys prefix structure and eliminates the hangman oracle | guard_no_unhashed_credential_compare dies if staged diff compares a credential-shaped identifier with ==/===/.equals(/strcmp/timingSafeEqual(/compare_digest(/ConstantTimeCompare(/MessageDigest.isEqual(/secure_compare(/hash_equals(/CRYPTO_memcmp( without a SHA3-256 hash call in the same file's added lines
Submodules can't be added mid-audit |  | guard_no_submodule_change dies on .gitmodules edits or submodule-pointer changes
State transitions follow the lifecycle | tick skill: explicit per-status dispatch | guard_status_transition rejects edges outside the allowed set on every finding_update_status
Finding fields are untrusted input | every agent wraps them in === BEGIN UNTRUSTED REPOSITORY CONTENT === delimiters | — (judgement call — no mechanical way to classify prose as instruction vs data)
Fixer gives up after N attempts | fixer: notes the cap | scripts/finding-attempts.sh increments before each attempt; tick reads counter and marks failed at the cap
Only one tick per repo |  | with_lock uses flock(1) — atomic, kernel-released on process death
Concurrent scans can't collide on IDs |  | finding_create allocates IDs under a directory-level flock

Cells marked "—" on the programmatic side are genuine judgement calls. Those live entirely at the LLM layer, which is why merge_policy=manual is the default: the plugin won't merge anything without a human look when the last line of defence is an LLM.
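To make the "pattern-scans added lines" idea concrete, here is a rough sketch of a diff scanner. It is not guard_no_secrets_in_diff itself: the function name is hypothetical and the patterns are illustrative guesses at the token shapes named in the table.

```shell
# Hypothetical sketch: read a unified diff on stdin and fail if any added
# line contains a token-shaped string. Patterns are illustrative only.
scan_added_lines_for_secrets() {
  # Keep added lines ("+...") and drop the "+++ b/file" headers; removed
  # and context lines are never inspected.
  if grep -E '^\+' | grep -Ev '^\+\+\+ ' |
     grep -Eq 'AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|sk-ant-|-----BEGIN [A-Z ]*PRIVATE KEY-----'; then
    echo "refusing: secret-shaped string in added lines" >&2
    return 1
  fi
  return 0
}

# Usage with an obviously fake key.
printf '%s\n' '+++ b/config.env' '+AWS_KEY=AKIAABCDEFGHIJKLMNOP' |
  scan_added_lines_for_secrets || echo "commit blocked"
```

Scanning only added lines keeps the check from firing on secrets that a fix is in the middle of removing.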

Sandbox for scraped-repo test execution

A malicious repo can ship a test file that nukes the host. To contain that, scripts/lib/sandbox.sh routes every scraped-repo command through an isolated container.
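The selection logic can be pictured roughly as follows. This is a dry-run sketch under stated assumptions, not the contents of sandbox.sh: both function names, the SANDBOX_MODE environment variable, and the existence of a non-strict value are hypothetical, and the sketch only prints what it would do instead of starting a container.

```shell
# Hypothetical sketch: print the first available runtime, or fail if none
# of podman, docker, or bwrap is on PATH.
pick_sandbox_runtime() {
  for runtime in podman docker bwrap; do
    if command -v "$runtime" >/dev/null 2>&1; then
      echo "$runtime"
      return 0
    fi
  done
  return 1
}

# Dry run: report how a command would be executed. Under strict mode a
# missing runtime is a refusal, never a silent fallback to the host.
run_sandboxed_sketch() {
  sandbox_mode="${SANDBOX_MODE:-strict}"
  if runtime="$(pick_sandbox_runtime)"; then
    echo "would run under $runtime: $*"
  elif [ "$sandbox_mode" = "strict" ]; then
    echo "refusing to run unsandboxed: $*" >&2
    return 1
  else
    echo "would run unsandboxed: $*"
  fi
}
```

Calling run_sandboxed_sketch npm test on a machine with no container runtime returns non-zero under the default mode, which is the behaviour the strict setting is meant to guarantee.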

Install

From the marketplace

/plugin marketplace update wrxck-claude-plugins
/plugin install auto-audit@wrxck-claude-plugins

Requirements

You need bash 4.0+, gh, git, jq, and flock. The plugin prints these install hints itself when anything is missing:

macOS (Homebrew)

brew install bash gh git jq util-linux
# util-linux's flock isn't on PATH by default on macOS; add brew + flock to PATH:
cat >> ~/.zshrc <<'EOF'
export PATH="$(brew --prefix)/bin:$(brew --prefix util-linux)/sbin:$PATH"
EOF
source ~/.zshrc
bash --version # must be 4.x or 5.x

Debian / Ubuntu

sudo apt-get update
sudo apt-get install -y gh git jq util-linux

Fedora / RHEL

sudo dnf install -y gh git jq util-linux

Arch / Alpine

# Arch
sudo pacman -S --needed github-cli git jq util-linux
# Alpine
sudo apk add --no-cache github-cli git jq util-linux-misc

Windows

Use WSL2. The plugin is bash-only and assumes POSIX paths.
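A pre-flight check along these lines can confirm the requirements before the first run. It is a minimal sketch, not the plugin's own check: missing_tools is a hypothetical name, and it only tests for presence on PATH, not for the bash 4.0+ version requirement.

```shell
# Hypothetical pre-flight check: print the tools from the argument list
# that are not on PATH, separated by spaces.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  echo "${missing# }"   # drop the leading space
}

report="$(missing_tools bash gh git jq flock)"
if [ -n "$report" ]; then
  echo "missing: $report"
else
  echo "all required tools found"
fi
```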

Authenticate gh

gh auth login # Choose: GitHub.com → HTTPS → Login with a web browser
gh auth status

gh needs at least the repo scope, which the default login flow grants.

Sandbox runtime (recommended)

Under the default sandbox_mode=strict, fixer tests won't run without a container runtime. Install podman (rootless) if you can — otherwise docker or bubblewrap:

# Debian / Ubuntu
sudo apt-get install -y podman
# macOS
brew install podman && podman machine init && podman machine start
# alternatives
sudo apt-get install -y docker.io bubblewrap

Commands

Command                                                           Purpose
/auto-audit:start <repo> [modules=security] [policy=manual|auto]  Clone, scan, start the loop. Policy defaults to manual.
/auto-audit:tick                                                  Advance one finding by one stage. The /loop calls this for you.
/auto-audit:status                                                Show the findings breakdown and recent activity.
/auto-audit:resume [slug]                                         Resume after /auto-audit:stop or a session restart.
/auto-audit:stop                                                  Drop the active-repo pointer so ticks become no-ops.

README badges

Two badges you can drop in any repo. The first just says you use auto-audit; the second shows live audit status.

Static — "audited by auto-audit"

[![audited by auto-audit](https://img.shields.io/badge/audited_by-auto--audit-6366f1?logo=github&logoColor=white)](https://auto-audit.hesketh.pro)

Dynamic — current audit status

auto-audit publishes .auto-audit/status.json to the autoaudit/status branch of your repo on every scan. Shields.io's endpoint adapter renders it:

[![auto-audit status](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2FOWNER%2FREPO%2Fautoaudit%2Fstatus%2F.auto-audit%2Fstatus.json)](https://auto-audit.hesketh.pro)

Replace OWNER/REPO with your repository. The badge colour follows severity.
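For reference, the Shields.io endpoint adapter expects a small JSON document with schemaVersion, label, and message fields, plus an optional color. The sketch below emits one; the function name, the message text, and the colour are placeholders, and it does not reproduce the plugin's actual severity-to-colour mapping or the real contents of status.json.

```shell
# Hypothetical sketch: emit a Shields.io endpoint document on stdout.
# Arguments are inserted as-is, so they must not contain double quotes.
write_status_json() {
  message="$1"
  color="$2"
  printf '{"schemaVersion":1,"label":"auto-audit","message":"%s","color":"%s"}\n' \
    "$message" "$color"
}

# Placeholder values for illustration only.
write_status_json "0 open findings" "brightgreen"
```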

Let auto-audit install them for you

After the first scan on a repo, run /auto-audit:badge. It opens a PR that adds the static badge to your README and seeds the status file on the autoaudit/status branch so the dynamic badge works immediately. You can decline either half on the PR.

Evidence & benchmarks

Real runs, not marketing. All numbers here are from the public git history — reproducible.

First real run — wrxck/fleet

Dogfood — wrxck/auto-audit against itself

Coming as runs complete — target is three consecutive zero-finding runs before this section is considered signed-off.

External repos

Planned: wrxck/telegram-bot-lua and OWASP Juice Shop. Findings + PR links here once runs complete.

FAQ

What does this cost to run?

It depends on the size of the target repo and the model. Rough estimates on Sonnet 4.6 with a decent cache-hit rate: ~$0.30–$0.80 for a triage-only scan of ~8 findings; ~$1–$2 for a single end-to-end pipeline run on one finding; ~$5–$10 for a full pipeline on ~8 findings. On Opus 4.7, multiply by ~5×. The scanner itself is bounded at 60 files per pass.

Can I trust merge_policy=auto?

Only on repos you fully own and whose content you're willing to stake the default branch on. An LLM reviewer is not a hardened security boundary — a sufficiently clever prompt injection could flip a verdict. Even then, the sandbox still contains test execution, so test-file payloads can't escape the container. manual is the default for a reason.

Does it handle languages other than Node/Python/Go?

The explicit per-bucket prompts in audit-security/SKILL.md target Node, Python, and Go specifics. Other languages (Rust, Ruby, Lua, PHP, …) rely on the generic LLM review pass. We'll publish recall numbers once the external-repo runs complete.

How many files per scan?

60 per invocation, with per-file limits: files over 1500 lines or over 300 kB are skipped. A resumable scan cursor is on the roadmap for bigger repos.
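The limits can be sketched as a simple filter. Both function names are hypothetical, 300 kB is read here as 300000 bytes, and the sketch assumes skipped files do not count toward the 60-file cap; the real scanner may differ on any of these points.

```shell
# Hypothetical sketch: succeed if a file is within the per-file limits
# (at most 1500 lines and at most 300000 bytes).
within_scan_limits() {
  file="$1"
  # Arithmetic expansion strips the padding some wc versions emit.
  lines=$(( $(wc -l < "$file") ))
  bytes=$(( $(wc -c < "$file") ))
  [ "$lines" -le 1500 ] && [ "$bytes" -le 300000 ]
}

# Print up to $1 eligible files from the remaining arguments, in order.
select_files() {
  limit="$1"; shift
  count=0
  for candidate in "$@"; do
    [ "$count" -lt "$limit" ] || break
    if within_scan_limits "$candidate"; then
      echo "$candidate"
      count=$((count + 1))
    fi
  done
}
```

A call such as select_files 60 src/*.js would then yield at most 60 paths, each small enough to scan.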

What happens if the target repo has no test framework?

The fixer writes the fix without a new test and notes that in the diff summary. The reviewer accepts this if the rest of the fix is sound — style quibbles aren't grounds for rejection, but missing test coverage on a testable project is.

What if a finding cycles forever between "fixed" and "rejected"?

After max_fix_iterations (default 3) attempts, the finding is marked failed and the loop moves on.
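The cap can be pictured as a counter that is bumped before each attempt. This sketch is not scripts/finding-attempts.sh: the function name and the counter-file layout are hypothetical, and it assumes the cap means three attempts run and the fourth is refused.

```shell
# Hypothetical sketch: increment a per-finding attempt counter stored in
# a file, then report whether another fix attempt may proceed.
record_attempt() {
  counter_file="$1"
  max="${2:-3}"
  attempts=0
  if [ -f "$counter_file" ]; then
    attempts="$(cat "$counter_file")"
  fi
  attempts=$((attempts + 1))
  echo "$attempts" > "$counter_file"
  if [ "$attempts" -gt "$max" ]; then
    echo "failed"
  else
    echo "attempt $attempts of $max"
  fi
}
```

Incrementing before the attempt, as the safety table describes, means a crash in the middle of a fix still counts against the budget, so a finding that reliably kills the fixer cannot loop forever.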

What if I want to kill it mid-run?

Run /auto-audit:stop, which drops the active-repo pointer so future ticks become no-ops, then press Esc in the main chat to cancel the running /loop. State is preserved; /auto-audit:resume picks it back up.

Changelog

Latest first. Every release is also tagged on GitHub: github.com/wrxck/auto-audit/releases.

v0.10.0 2026-04-26

Operator feedback memory loop. New /auto-audit:feedback skill records a per-repo append-only log at ${repo_dir}/feedback.jsonl. The triager and fixer subagents read it on every future tick and weigh prior signal — fix_pattern_rejected, fix_pattern_approved, human_revert, triage_override, reviewer_disagreed, note. The reviewer is explicitly forbidden to read it, so the independent-review checkpoint stays independent. Plugin remains experimental at 0.x; 1.0.0 is reserved for an explicit stability declaration.

v0.9.0 2026-04-26

HTML audit reports. New /auto-audit:report skill generates a self-contained HTML report per repo: summary stats, per-finding cards (triage / PoC / fix / PR / review), full activity log. Print-friendly so PDF / DOCX / PPTX conversion via weasyprint / chromium / pandoc is a one-shot follow-up. Three modes: active repo, named slug, --all.

v0.8.0 2026-04-25

Multi-repo support. Drops the exclusivity gate on /auto-audit:start; audits coexist on disk. /auto-audit:status [--all | <slug>] and /auto-audit:stop [<slug>] accept optional slug arguments.

v0.7.0 2026-04-25

Three small dogfooding fixes. audit_library_surface config flag makes the triager's posture on uncalled-but-public API surface explicit and reproducible. /auto-audit:resume eagerly recovers findings stuck mid-tick. Sandbox-incompatible-natives diagnostic records fix.test_status as skipped instead of marking the finding failed when host-built native addons fail to dlopen in the container.

v0.6.0 2026-04-25

Security-knowledge library expansion. Five new rules (csprng, sql-injection, deserialization, path-canonicalization, xxe) plus three new paired guards (guard_no_insecure_random, guard_no_unsafe_deserialize, guard_no_unsafe_xml_parser). SQL-injection and path-canonicalization stay LLM-only — reliable detection needs taint analysis or has unbounded false positives. Test suite 58 → 79 assertions.

v0.5.0 2026-04-25

Dogfooding-driven UX fixes. Reviewer self-PR fallback to gh pr comment. scripts/refresh-installed.sh repoints installed_plugins.json at the highest semver dir in the cache. Stale-finding visibility in /auto-audit:status.

v0.4.2 2026-04-21

Permission-allowlist recipe (docs-only). Path-scoped allowlist in the README so the autonomous loop can run without a per-Bash-call approval prompt on mobile / Remote Control. Existing PreToolUse hooks still run on top.

v0.4.1 2026-04-20

Hash-then-compare rule library. Supersedes v0.4.0. SHA3-256 hash both sides before comparing — constant-time primitives on raw secrets are themselves a known-vulnerable posture (compiler optimisations strip the constant-time property; raw-secret prefix structure still leaks). New guard_no_unhashed_credential_compare.

v0.3.0 2026-04-19

README badges. Static "audited by auto-audit" and dynamic shields.io endpoint badge reading from the autoaudit/status branch. New /auto-audit:badge skill.

v0.2.0 2026-04-18

Programmatic guardrails + sandbox. scripts/lib/guards.sh with branch / diff-size / PoC-location / secret-detection / status-transition checks. scripts/lib/sandbox.sh routes scraped-repo test commands through podman / docker / bubblewrap.

v0.1.0 2026-04-18

Initial import. Five-skill plugin (start, tick, status, resume, stop), four agents (security-triage, poc-builder, security-fixer, security-reviewer), single-stage-per-tick lifecycle.

Source & status

github.com/wrxck/auto-audit · v0.10.0 release notes · Issues

CI status