Autonomous security auditor for Claude Code

Point auto-audit at a GitHub repo. It scans for vulnerabilities, triages false positives, writes a proof of concept for each confirmed bug, fixes each one in its own PR, independently reviews the fix, and (with merge_policy=auto) merges when the review is clean. It repeats until the queue is drained, then rescans, and keeps going until you stop it.

Security · v0.10.0 · MIT

claude plugin install auto-audit@wrxck-claude-plugins

Part of the Claude Code Plugins collection.

How a tick flows

Each /auto-audit:tick advances one finding by one lifecycle stage, then returns. The /loop wrapper reinvokes the tick until the queue drains. One stage per tick means the independent-review checkpoint is a real checkpoint, and the loop is cheap to interrupt.

discovered      → triaging      (security-triage subagent)
triaging        → confirmed  |  false_positive
confirmed       → poc_written  (poc-builder subagent)
poc_written     → fix_committed (security-fixer subagent; new branch, commit)
fix_committed   → pr_opened    (branch push + gh pr create)
pr_opened       → reviewing    (security-reviewer subagent, independent context)
reviewing       → pr_approved  |  pr_rejected
pr_approved     → merged       (squash; if merge_policy=auto)
pr_approved     → skipped      (if merge_policy=manual — default)
pr_rejected     → confirmed    (fixer retries, bounded by max_fix_iterations)
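The edges above can be read as an allow-list. The following is a minimal sketch of such a check, assuming nothing about the plugin's real implementation: `is_allowed_transition` and `check_transition` are hypothetical names, and only the edges listed above are encoded (the `failed` status mentioned elsewhere in this README is not modelled).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a lifecycle edge check; not the plugin's actual guard.

# Return 0 if moving a finding from status $1 to status $2 is a listed edge.
is_allowed_transition() {
  case "$1:$2" in
    discovered:triaging) return 0 ;;
    triaging:confirmed|triaging:false_positive) return 0 ;;
    confirmed:poc_written) return 0 ;;
    poc_written:fix_committed) return 0 ;;
    fix_committed:pr_opened) return 0 ;;
    pr_opened:reviewing) return 0 ;;
    reviewing:pr_approved|reviewing:pr_rejected) return 0 ;;
    pr_approved:merged|pr_approved:skipped) return 0 ;;
    pr_rejected:confirmed) return 0 ;;
    *) return 1 ;;
  esac
}

# Refuse (non-zero status, message on stderr) for any edge outside the list.
check_transition() {
  if ! is_allowed_transition "$1" "$2"; then
    echo "refusing transition: $1 -> $2" >&2
    return 1
  fi
}
```

Because every edge is spelled out, a skipped stage such as `confirmed` to `fix_committed` is rejected rather than silently accepted.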

Safety model — two layers

Wherever both apply, a safety claim is enforced in two places: an instruction in the relevant agent's role card, and a programmatic guard that runs before the action commits and refuses if the claim is violated. The LLM layer covers judgement calls; the programmatic layer covers everything mechanically checkable.

We do not claim 100% safety. An LLM is not a hardened security boundary, and some properties genuinely cannot be mechanically verified. What the programmatic layer does guarantee is that anything expressible as "refuse if input is not in the allowed set" stays refused. The guard test suite (scripts/test-guards.sh) exercises every guard — currently 79/79 passing.
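As an illustration of "refuse if input is not in the allowed set", here is a minimal sketch of a branch-name guard. It is not the plugin's `guard_autoaudit_branch`; the function names `branch_is_allowed` and `guard_branch_sketch` are hypothetical, and the only rule assumed is the `autoaudit/` prefix.

```shell
#!/usr/bin/env bash
# Hypothetical allow-list guard; the plugin's real guards may differ.

# Succeed only for branch names that start with "autoaudit/" and have a suffix.
branch_is_allowed() {
  case "$1" in
    autoaudit/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Refuse (non-zero status) unless the branch is in the allowed set.
guard_branch_sketch() {
  if ! branch_is_allowed "$1"; then
    echo "guard: refusing to push '$1' (not under autoaudit/)" >&2
    return 1
  fi
}
```

A caller would run something like `guard_branch_sketch "$(git rev-parse --abbrev-ref HEAD)" || exit 1` before any push, so the decision never depends on what the agent was told.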

Claim	LLM-layer instruction	Programmatic guard
Never push to the default branch	fixer: "never touch the default branch locally"	`guard_autoaudit_branch` in `push_branch`; `commit_all` refuses non-`autoaudit/*` HEAD; `guard_not_default_branch`
Never force-push outside `autoaudit/*`	—	`guard_autoaudit_branch` + `--force-with-lease` only
Reviewer is independent of the fixer's reasoning	reviewer: "do not fetch triager/fixer reasoning"	PR body strips `.triage` and `.fix.diff_summary`; `guard_pr_body_clean` dies on leakage; `guard_commit_msg_clean` on the commit body
Fix diff is minimal	fixer + reviewer: "minimal diff, no refactor"	`guard_max_files_changed` (default 5), `guard_max_lines_changed` (default 400); env-tunable
PoCs never land in commits	poc-builder: "PoCs live outside the workspace"	`guard_poc_outside_workspace` + `guard_no_poc_in_diff`
PoCs don't perform network I/O	poc-builder: "never write a PoC that performs a live network request"	`guard_poc_no_network` static scan for `curl`/`wget`/`requests`/`fetch` against non-loopback hosts
Scraped-repo tests are sandboxed	fixer: "route every invocation through `run_sandboxed`"	`sandbox.sh` via podman / docker / bwrap; `sandbox_mode=strict` (default) refuses unsandboxed
Secrets aren't committed	fixer: "never bypass pre-commit hooks"	`guard_no_secrets_in_diff` pattern-scans added lines (`AKIA`, `ghp_`, `sk-ant-`, PEM headers, …)
Credential comparisons must SHA3-256 hash-then-compare	triager / fixer / reviewer: hash both sides of a credential/MAC/signature comparison first; constant-time primitives on raw secrets are themselves a known-vulnerable posture — only hashing destroys prefix structure and eliminates the hangman oracle	`guard_no_unhashed_credential_compare` dies if staged diff compares a credential-shaped identifier with `==`/`===`/`.equals(`/`strcmp`/`timingSafeEqual(`/`compare_digest(`/`ConstantTimeCompare(`/`MessageDigest.isEqual(`/`secure_compare(`/`hash_equals(`/`CRYPTO_memcmp(` without a `SHA3-256` hash call in the same file's added lines
Submodules can't be added mid-audit	—	`guard_no_submodule_change` dies on `.gitmodules` edits or submodule-pointer changes
State transitions follow the lifecycle	tick skill: explicit per-status dispatch	`guard_status_transition` rejects edges outside the allowed set on every `finding_update_status`
Finding fields are untrusted input	every agent wraps them in `=== BEGIN UNTRUSTED REPOSITORY CONTENT ===` delimiters	— (judgement call — no mechanical way to classify prose as instruction vs data)
Fixer gives up after N attempts	fixer: notes the cap	`scripts/finding-attempts.sh` increments before each attempt; tick reads counter and marks `failed` at the cap
Only one tick per repo	—	`with_lock` uses `flock(1)` — atomic, kernel-released on process death
Concurrent scans can't collide on IDs	—	`finding_create` allocates IDs under a directory-level flock

Cells marked — on the programmatic side are genuine judgement calls. Those live entirely at the LLM layer, which is why merge_policy=manual is the default: the plugin won't merge anything without a human look when the last line of defence is an LLM.
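To make the "pattern-scans added lines" row of the table concrete, here is a rough sketch of a diff scan. It is an assumption-laden illustration, not the plugin's `guard_no_secrets_in_diff`: the function name `diff_has_no_secrets` is hypothetical, and the regular expressions are simplified guesses at the token shapes named in the table.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an added-line secret scan; patterns are illustrative only.

# Read a unified diff on stdin. Return 1 if any added line looks like a secret.
diff_has_no_secrets() {
  # Keep lines starting with "+", then drop the "+++ b/file" headers.
  # Limitation: an added line whose content starts with "++ " is also dropped.
  if grep -E '^\+' | grep -Ev '^\+\+\+ ' \
     | grep -Eq 'AKIA[0-9A-Z]{16}|ghp_[A-Za-z0-9]{36}|sk-ant-[A-Za-z0-9_-]+|-----BEGIN [A-Z ]*PRIVATE KEY-----'; then
    echo "guard: possible secret in added lines" >&2
    return 1
  fi
  return 0
}
```

It would be fed something like `git diff --cached | diff_has_no_secrets`. Removed lines are ignored on purpose: deleting a secret should not block the commit.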

Sandbox for scraped-repo test execution

A malicious repo can ship a test file that nukes the host. scripts/lib/sandbox.sh routes every scraped-repo command through an isolated container:

No network by default (--network=none). Egress allowed only for repos on the per-repo allow_network_for_repos list.
Host FS not mounted. Repo is bound read-only; writes go to a tmpfs workspace.
Runs as unprivileged nobody (65534). --cap-drop=ALL, --security-opt=no-new-privileges.
Resource caps: --cpus=2 --memory=2g --pids-limit=256.
Runtime detection: podman → docker → bwrap.
sandbox_mode=strict (default) refuses if no runtime is installed. best-effort warns and falls back; off runs unsandboxed on the host (opt-in only).
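The properties above map onto container flags roughly as follows. This is a sketch under stated assumptions, not the contents of scripts/lib/sandbox.sh: the function names, the `/repo` and `/work` mount points, and the image argument are all hypothetical, and only the podman / docker case is covered (bwrap takes different flags).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real scripts/lib/sandbox.sh may differ in every detail.

# Print the first available runtime, in the documented preference order.
detect_runtime() {
  local rt
  for rt in podman docker bwrap; do
    if command -v "$rt" >/dev/null 2>&1; then
      echo "$rt"
      return 0
    fi
  done
  return 1
}

# Print the container argv, one element per line, for podman or docker.
# $1 = runtime, $2 = absolute repo path, $3 = image (assumed), rest = command.
sandbox_argv() {
  local rt=$1 repo=$2 image=$3
  shift 3
  set -- "$rt" run --rm \
    --network=none \
    --user 65534:65534 \
    --cap-drop=ALL \
    --security-opt=no-new-privileges \
    --cpus=2 --memory=2g --pids-limit=256 \
    -v "$repo:/repo:ro" \
    --tmpfs /work --workdir /work \
    "$image" "$@"
  printf '%s\n' "$@"
}
```

Printing the argv instead of executing it keeps the sketch inspectable: `sandbox_argv podman "$PWD" some-image npm test` shows exactly what would run before anything does.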

Install

From the marketplace

/plugin marketplace update wrxck-claude-plugins
/plugin install auto-audit@wrxck-claude-plugins

Requirements

You need bash 4.0+, gh, git, jq, and flock. The plugin prints these install hints itself when anything is missing:

macOS (Homebrew)

brew install bash gh git jq util-linux
# util-linux's flock isn't on PATH by default on macOS; add brew + flock to PATH:
cat >> ~/.zshrc <<'EOF'
export PATH="$(brew --prefix)/bin:$(brew --prefix util-linux)/bin:$(brew --prefix util-linux)/sbin:$PATH"
EOF
source ~/.zshrc
bash --version   # must be 4.x or 5.x

Debian / Ubuntu

sudo apt-get update
sudo apt-get install -y gh git jq util-linux

Fedora / RHEL

sudo dnf install -y gh git jq util-linux

Arch / Alpine

# Arch
sudo pacman -S --needed github-cli git jq util-linux
# Alpine
sudo apk add --no-cache bash github-cli git jq util-linux-misc

Windows

Use WSL2. The plugin is bash-only and assumes POSIX paths.
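If you want to check the requirements yourself before installing, a preflight along these lines would do. It is a minimal sketch, not the plugin's own check: `missing_commands`, `bash_is_new_enough`, and `preflight` are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical preflight sketch; the plugin's own dependency check may differ.

# Print each command from the argument list that is not on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Succeed if the running shell is bash with major version 4 or newer.
bash_is_new_enough() {
  local v=${BASH_VERSION:-0}
  [ "${v%%.*}" -ge 4 ]
}

# Report every problem on stderr and return non-zero if any was found.
preflight() {
  local status=0 cmd
  for cmd in $(missing_commands gh git jq flock); do
    echo "missing dependency: $cmd" >&2
    status=1
  done
  bash_is_new_enough || { echo "bash 4.0+ is required" >&2; status=1; }
  return "$status"
}
```

Running `preflight || exit 1` reports everything that is missing in one pass rather than stopping at the first gap.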

Authenticate gh

gh auth login
# Choose: GitHub.com → HTTPS → Login with a web browser
gh auth status

gh needs at least the repo scope. The default login flow grants it.

Sandbox runtime (recommended)

Under the default sandbox_mode=strict, fixer tests won't run without a container runtime. Install podman (rootless) if you can — otherwise docker or bubblewrap:

# Debian / Ubuntu
sudo apt-get install -y podman
# macOS
brew install podman && podman machine init && podman machine start
# alternatives
sudo apt-get install -y docker.io bubblewrap

Commands

Command	Purpose
`/auto-audit:start <repo> [modules=security] [policy=manual|auto]`	Clone, scan, start the loop. Policy defaults to manual.
`/auto-audit:tick`	Advance one finding by one stage. The `/loop` calls this for you.
`/auto-audit:status`	Show the findings breakdown and recent activity.
`/auto-audit:resume [slug]`	Resume after `/auto-audit:stop` or a session restart.
`/auto-audit:stop`	Drop the active-repo pointer so ticks become no-ops.

README badges

Two badges you can drop in any repo. The first just says you use auto-audit; the second shows live audit status.

Static — "audited by auto-audit"

[![audited by auto-audit](https://img.shields.io/badge/audited_by-auto--audit-6366f1?logo=github&logoColor=white)](https://auto-audit.hesketh.pro)

Dynamic — current audit status

auto-audit publishes .auto-audit/status.json to the autoaudit/status branch of your repo on every scan. Shields.io's endpoint adapter renders it:

[![auto-audit status](https://img.shields.io/endpoint?url=https%3A%2F%2Fraw.githubusercontent.com%2FOWNER%2FREPO%2Fautoaudit%2Fstatus%2F.auto-audit%2Fstatus.json)](https://auto-audit.hesketh.pro)

Replace OWNER/REPO. Colour follows severity:

auto-audit: clean (green) — no open findings
auto-audit: N findings (amber) — findings pending triage or fix
auto-audit: critical (red) — at least one confirmed critical
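For reference, a Shields.io endpoint badge reads a small JSON document with `schemaVersion`, `label`, `message`, and an optional `color`. The sketch below shows how the three states above could map onto that schema. The function name `status_json` is hypothetical, the exact fields auto-audit writes are an assumption, and `orange` stands in for amber.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: the exact fields auto-audit writes are not documented here.

# Print a shields.io endpoint document for the given counts.
# $1 = open findings, $2 = confirmed critical findings.
status_json() {
  local open=$1 critical=$2 message color
  if [ "$critical" -gt 0 ]; then
    message="critical"; color="red"
  elif [ "$open" -gt 0 ]; then
    message="$open findings"; color="orange"   # stand-in for amber
  else
    message="clean"; color="green"
  fi
  printf '{"schemaVersion":1,"label":"auto-audit","message":"%s","color":"%s"}\n' \
    "$message" "$color"
}
```

The critical check comes first so that a single confirmed critical overrides any count of lesser findings.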

Let auto-audit install them for you

After the first scan on a repo, run /auto-audit:badge. It opens a PR that adds the static badge to your README and seeds the status file on the autoaudit/status branch so the dynamic badge works immediately. You can decline either half on the PR.

Evidence & benchmarks

Real runs, not marketing. All numbers here are from the public git history — reproducible.

First real run — `wrxck/fleet`

Scanner emitted 8 findings (2 critical, 2 high, 3 medium, 1 low)
Author's manual review of every finding: 8/8 plausible, 0 obvious false positives
One pair described the same vulnerability from two angles (webhook auth + router auth) — consolidation-at-triage is on the roadmap
SEC-0001 (unauthenticated webhook → /shell RCE) taken end-to-end: triage → PoC → fix → PR → independent review → approved. PR: wrxck/fleet#39

Dogfood — `wrxck/auto-audit` against itself

Coming as runs complete — target is three consecutive zero-finding runs before this section is considered signed-off.

External repos

Planned: wrxck/telegram-bot-lua and OWASP Juice Shop. Findings + PR links here once runs complete.

FAQ

What does this cost to run?

Depends on the size of the target repo and the model. Rough estimates on Sonnet 4.6 with a decent cache hit rate: ~$0.30–$0.80 for a triage-only scan of ~8 findings; ~$1–$2 for a single end-to-end pipeline run on one finding; ~$5–$10 for a full pipeline on ~8 findings. On Opus 4.7, multiply by ~5×. The scanner itself is bounded at 60 files per pass.

Can I trust `merge_policy=auto`?

Only on repos you fully own and whose content you're willing to stake the default branch on. An LLM reviewer is not a hardened security boundary: a sufficiently clever prompt injection could flip a verdict. Even if that happens, the sandbox still contains test execution, so test-file payloads can't escape the container. manual is the default for a reason.

Does it handle languages other than Node/Python/Go?

The explicit per-bucket prompts in audit-security/SKILL.md target Node, Python, and Go specifics. Other languages (Rust, Ruby, Lua, PHP, …) rely on the generic LLM review pass. We'll publish recall numbers once the external-repo runs complete.

How many files per scan?

60 per invocation, with per-file limits: skip files over 1500 lines, skip files over 300 kB. Resumable scan cursor is on the roadmap for bigger repos.
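The limits above amount to a simple filter. Here is a minimal sketch of one way to apply them, assuming 300 kB means 300,000 bytes and that candidate paths arrive one per line. `file_is_scannable` and `select_scan_batch` are hypothetical names, not the scanner's real functions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the per-file limits; the real scanner may count differently.

MAX_LINES=1500
MAX_BYTES=300000   # assumes 300 kB means 300,000 bytes

# Succeed if the file exists and is within both per-file limits.
file_is_scannable() {
  local lines bytes
  [ -f "$1" ] || return 1
  # Arithmetic expansion strips the padding some wc implementations add.
  lines=$(($(wc -l < "$1")))
  bytes=$(($(wc -c < "$1")))
  [ "$lines" -le "$MAX_LINES" ] && [ "$bytes" -le "$MAX_BYTES" ]
}

# Read candidate paths on stdin; print at most $1 (default 60) scannable ones.
select_scan_batch() {
  local limit=${1:-60} count=0 path
  while IFS= read -r path && [ "$count" -lt "$limit" ]; do
    if file_is_scannable "$path"; then
      echo "$path"
      count=$((count + 1))
    fi
  done
}
```

For example, `git ls-files | select_scan_batch 60` would yield one pass's worth of files, with oversized files skipped rather than counted against the cap.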

What happens if the target repo has no test framework?

The fixer writes the fix without a new test and notes that in the diff summary. The reviewer accepts this if the rest of the fix is sound — style quibbles aren't grounds for rejection, but missing test coverage on a testable project is.

What if a finding cycles forever between "fixed" and "rejected"?

After max_fix_iterations (default 3) attempts, the finding is marked failed and the loop moves on.
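A bounded retry of this kind can be sketched with a counter file. This is an illustration only: the file layout and the names `bump_attempts` and `next_action` are assumptions, and the sketch does no locking. It assumes the counter is incremented before each attempt, so the attempt that would exceed the cap is the one that marks the finding failed.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a bounded retry counter; file layout and names are assumptions.

MAX_FIX_ITERATIONS=${MAX_FIX_ITERATIONS:-3}

# Increment the attempt counter stored in file $1 and print the new value.
bump_attempts() {
  local file=$1 n=0
  [ -f "$file" ] && n=$(cat "$file")
  n=$((n + 1))
  echo "$n" > "$file"
  echo "$n"
}

# Print "retry" while attempts remain, "failed" once the cap has been used up.
next_action() {
  local attempts=$1
  if [ "$attempts" -gt "$MAX_FIX_ITERATIONS" ]; then
    echo failed
  else
    echo retry
  fi
}
```

With the default cap of 3, attempts 1 through 3 yield `retry` and the fourth bump yields `failed`, so the finding stops cycling after three fix attempts.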

What if I want to kill it mid-run?

Run /auto-audit:stop, which drops the active-repo pointer so future ticks become no-ops, then press Esc in the main chat to cancel the running /loop. State is preserved; /auto-audit:resume picks it back up.

Changelog

Latest first. Every release is also tagged on GitHub: github.com/wrxck/auto-audit/releases.

v0.10.0 2026-04-26

Operator feedback memory loop. New /auto-audit:feedback skill records a per-repo append-only log at ${repo_dir}/feedback.jsonl. The triager and fixer subagents read it on every future tick and weigh prior signal — fix_pattern_rejected, fix_pattern_approved, human_revert, triage_override, reviewer_disagreed, note. The reviewer is explicitly forbidden to read it, so the independent-review checkpoint stays independent. Plugin remains experimental at 0.x; 1.0.0 is reserved for an explicit stability declaration.

v0.9.0 2026-04-26

HTML audit reports. New /auto-audit:report skill generates a self-contained HTML report per repo: summary stats, per-finding cards (triage / PoC / fix / PR / review), full activity log. Print-friendly so PDF / DOCX / PPTX conversion via weasyprint / chromium / pandoc is a one-shot follow-up. Three modes: active repo, named slug, --all.

v0.8.0 2026-04-25

Multi-repo support. Drops the exclusivity gate on /auto-audit:start; audits coexist on disk. /auto-audit:status [--all | <slug>] and /auto-audit:stop [<slug>] accept optional slug arguments.

v0.7.0 2026-04-25

Three small dogfooding fixes. audit_library_surface config flag makes the triager's posture on uncalled-but-public API surface explicit and reproducible. /auto-audit:resume eagerly recovers findings stuck mid-tick. Sandbox-incompatible-natives diagnostic records fix.test_status as skipped instead of marking the finding failed when host-built native addons fail to dlopen in the container.

v0.6.0 2026-04-25

Security-knowledge library expansion. Five new rules (csprng, sql-injection, deserialization, path-canonicalization, xxe) plus three new paired guards (guard_no_insecure_random, guard_no_unsafe_deserialize, guard_no_unsafe_xml_parser). SQL-injection and path-canonicalization stay LLM-only: mechanical detection would either need taint analysis or produce unbounded false positives. Test suite 58 → 79 assertions.

v0.5.0 2026-04-25

Dogfooding-driven UX fixes. Reviewer self-PR fallback to gh pr comment. scripts/refresh-installed.sh repoints installed_plugins.json at the highest semver dir in the cache. Stale-finding visibility in /auto-audit:status.

v0.4.2 2026-04-21

Permission-allowlist recipe (docs-only). Path-scoped allowlist in the README so the autonomous loop can run without a per-Bash-call approval prompt on mobile / Remote Control. Existing PreToolUse hooks still run on top.

v0.4.1 2026-04-20

Hash-then-compare rule library. Supersedes v0.4.0. SHA3-256 hash both sides before comparing — constant-time primitives on raw secrets are themselves a known-vulnerable posture (compiler optimisations strip the constant-time property; raw-secret prefix structure still leaks). New guard_no_unhashed_credential_compare.

v0.3.0 2026-04-19

README badges. Static "audited by auto-audit" and dynamic shields.io endpoint badge reading from the autoaudit/status branch. New /auto-audit:badge skill.

v0.2.0 2026-04-18

Programmatic guardrails + sandbox. scripts/lib/guards.sh with branch / diff-size / PoC-location / secret-detection / status-transition checks. scripts/lib/sandbox.sh routes scraped-repo test commands through podman / docker / bubblewrap.

v0.1.0 2026-04-18

Initial import. Five-skill plugin (start, tick, status, resume, stop), four agents (security-triage, poc-builder, security-fixer, security-reviewer), single-stage-per-tick lifecycle.

Source & status

github.com/wrxck/auto-audit · v0.10.0 release notes · Issues