<p align="center">
<img src="docs/assets/logo.svg" alt="Pentest Agent Suite" width="440"/>
</p>
<h1 align="center">Pentest Agent Suite for Claude Code</h1>
<p align="center">
<em>Autonomous bug-bounty framework for Claude Code and 6 other AI coding tools — 50 agents, 26 commands, 19 CLI tools, 11 skills, 2 MCP servers.</em>
</p>
<p align="center">
<img src="https://img.shields.io/badge/python-3.10%2B-blue?logo=python&logoColor=white" alt="Python 3.10+"/>
<img src="https://img.shields.io/badge/Claude-Code-d97757" alt="Claude Code"/>
<img src="https://img.shields.io/badge/MCP-servers%20%C3%97%202-2ea043" alt="MCP servers"/>
<img src="https://img.shields.io/badge/agents-50-8957e5" alt="50 agents"/>
<img src="https://img.shields.io/badge/payloads-2500%20lines-f85149" alt="Payloads"/>
<img src="https://img.shields.io/badge/IDEs-7-1f6feb" alt="7 IDE targets"/>
</p>
---
**~760 files · ~118k lines · 50 agents · 26 commands · 19 CLI tools · 11 skills · 2 MCP servers (16 bug-bounty platforms + BYO writeup search) · 2,500 payload lines**
A complete bug bounty framework: a battle-tested hunting methodology with concrete payloads, 7-Question Gate validation, autonomous hunt loops, A→B exploit chain building, a persistent brain with endpoint tracking, optional semantic writeup search (bring your own index), automatic cost tracking via Claude Code (CC) hooks, live platform integration, and a cross-IDE installer that emits the native format for Claude Code, Codex, Gemini, Cursor, Windsurf, VS Code Copilot, and OpenClaw.
## Quick Start
```bash
# MCP servers are launched via `uv run --with mcp` — no global pip install required.
export HACKERONE_USERNAME=you HACKERONE_TOKEN=your_token
uv run python3 tools/scaffold.py hackerone tesla
cd ~/bounties/hackerone-tesla && claude
/model opus # Opus 4.7 [1M] — subagents inherit via model: "inherit"
/sync hackerone tesla
/brain init && /status
/hunt tesla.com
```
`scaffold.py` provisions the workspace for every supported project-scoped
client, not only Claude Code: `CLAUDE.md`, `AGENTS.md`, `.codex/`,
`.agents/skills/`, `.gemini/`, `.cursor/`, `.windsurf/`, `.github/`, and
`.vscode/mcp.json` are generated from the copied workspace assets so paths
resolve inside the bounty workspace.
## Install (Claude Code + 6 other AI coding tools)
The framework ships pre-rendered for every supported tool. There are two ways
to use it:
**1. Use the bundles directly (no install step)**
```bash
git clone https://github.com/H-mmer/pentest-agents-suite
cd pentest-agents-suite/pentest-agents/providers/codex
codex # or: cd ../gemini && gemini, etc.
```
The `providers/<id>/` tree contains a fully translated, ready-to-use bundle
for each non-Claude target. Path references inside use `..` to reach the
repo's `tools/`, `rules/`, and `mcp-*-server/`, so the bundle works only as
long as it stays inside the cloned repo.
**2. Run the installer (writes into your own project or `~/.codex/` etc.)**
```bash
python3 -m tools.installer install --targets all --scope project
python3 -m tools.installer install --targets codex --scope global
```
Install mode rewrites paths to absolute references back into the cloned
pentest-agents repo, so the install works no matter where the user's own
project lives.
| Target | Agents | Slash commands | Rules | MCP | Scopes |
|---|---|---|---|---|---|
| **Claude Code** | native `.claude/agents/*.md` | `.claude/skills/<name>/SKILL.md` | `CLAUDE.md` | `.mcp.json` / `~/.claude.json` | global + project |
| **OpenAI Codex** | native `.codex/agents/*.toml` | `.agents/skills/<name>/SKILL.md` | `AGENTS.md` (≤32 KiB) | `[mcp_servers.*]` in `config.toml` | global + project |
| **Google Gemini** | native `.gemini/agents/*.md` | TOML in `.gemini/commands/` | `GEMINI.md` | `mcpServers` in `settings.json` | global + project |
| **Cursor** | → skills `.cursor/skills/agent-*/SKILL.md` (no native subagents) | → skills `.cursor/skills/cmd-*/SKILL.md` | `.cursor/rules/*.mdc` + `AGENTS.md` | `.cursor/mcp.json` | global + project |
| **Windsurf** | → skills | Workflows | `.windsurf/rules/*.md` (≤12 KiB / file) | `~/.codeium/windsurf/mcp_config.json` | global + project |
| **VS Code Copilot** | `.github/agents/*.agent.md` (≤30 KiB / agent) | `.github/prompts/*.prompt.md` | `.github/copilot-instructions.md` + `.github/instructions/*` | `.vscode/mcp.json` | project + global-MCP |
| **OpenClaw** | → skills | → skills | `~/.openclaw/workspace/AGENTS.md` or `<proj>/AGENTS.md` | `mcp.servers` in `~/.openclaw/openclaw.json` | global + project *(MCP is user-level)* |
Cursor, Windsurf, and OpenClaw have no native subagent concept; Claude-format
agents render as skills/rules. Codex commands are emitted as AgentSkills under
`.agents/skills/`; the deprecated `.codex/prompts/` path is not used.
**`providers/` directory** (in the cloned repo):
```
providers/
├── codex/ AGENTS.md + .codex/{agents,config.toml} + .agents/skills
├── gemini/ GEMINI.md + .gemini/{agents,commands} + settings.json
├── cursor/ AGENTS.md + .cursor/{rules,skills,mcp.json}
├── windsurf/ AGENTS.md + .windsurf/{rules,workflows,skills} + mcp_config.json
├── copilot/ .github/{copilot-instructions.md,instructions,prompts,agents} + .vscode/mcp.json
└── openclaw/ AGENTS.md + .agents/skills/ + openclaw.json
```
`providers/` is **generated**, not edited by hand. Re-render after editing
`.claude/`, `rules/`, or `skills/` source:
```bash
python3 -m tools.installer render --targets all
python3 -m tools.installer render --check # exits 1 if drift
```
The `test_committed_providers_match_render` pytest case enforces drift
detection locally — there is no GitHub Actions CI by project policy.
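The drift gate can be pictured as a comparison between a fresh render and the committed `providers/` tree. The sketch below is only an illustration of that idea, not the installer's code: `find_drift` is a hypothetical helper, and it assumes the rendered output is available as a mapping from relative path to file text.

```python
from pathlib import Path


def find_drift(rendered: dict[str, str], providers_root: Path) -> list[str]:
    """Return relative paths whose committed content differs from a fresh render.

    Hypothetical helper: `rendered` maps a path relative to `providers_root`
    to the text the renderer would write there.
    """
    drifted = []
    for rel_path, expected in sorted(rendered.items()):
        target = providers_root / rel_path
        # A missing file and a file with stale content both count as drift.
        if not target.is_file() or target.read_text(encoding="utf-8") != expected:
            drifted.append(rel_path)
    # Files on disk that the render no longer produces are drift as well.
    if providers_root.is_dir():
        for path in sorted(providers_root.rglob("*")):
            if path.is_file():
                rel = path.relative_to(providers_root).as_posix()
                if rel not in rendered:
                    drifted.append(rel)
    return drifted
```

A check command built on this would exit with status 1 whenever the returned list is non-empty.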
### What gets translated
When `.claude/` content is rendered for non-Claude targets, the translator:
- **Drops the `model:` field** — each target uses its own default model.
- **Strips Claude-specific prose** — "Claude Code" → "the AI coding tool",
"the Agent tool" → "the subagent dispatch tool", `model: "inherit"` is
removed entirely.
- **Rewrites `$CLAUDE_PROJECT_DIR`** — to `..` in `providers/` (relative
to the cloned repo), or to absolute paths into the cloned source repo
when installing into a user's project.
- **Maps `effort:` frontmatter** to `model_reasoning_effort` in Codex TOML.
- **Caps body length** — Copilot agents are truncated at 30,000 chars
(Copilot's hard limit). Windsurf rules are chunked at 12,000 chars
(workspace) / 6,000 chars (global).
- **Adds Copilot subagent links** — orchestrator agents (chain-builder,
correlator, recon-ranker) get an `agents:` list of siblings so Copilot
wires the dispatch graph.
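A few of the translation steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the translator itself: the function names are hypothetical, the `model:` removal is a naive line-based regex (it would also remove a body line that happens to start with `model:`), and the caps simply cut on character counts.

```python
import re

# Prose substitutions quoted from the list above.
PROSE_REPLACEMENTS = {
    "Claude Code": "the AI coding tool",
    "the Agent tool": "the subagent dispatch tool",
}

COPILOT_AGENT_CAP = 30_000       # chars per Copilot agent
WINDSURF_WORKSPACE_CAP = 12_000  # chars per Windsurf rule file (workspace)


def translate_agent(text: str, project_dir: str = "..") -> str:
    """Translate one Claude-format agent file (hypothetical helper)."""
    # Drop any line that sets `model:` (this covers `model: "inherit"` too).
    text = re.sub(r"(?m)^[ \t]*model:.*\n?", "", text)
    for old, new in PROSE_REPLACEMENTS.items():
        text = text.replace(old, new)
    # `..` suits providers/; an install would pass an absolute repo path.
    return text.replace("$CLAUDE_PROJECT_DIR", project_dir)


def truncate(body: str, cap: int = COPILOT_AGENT_CAP) -> str:
    """Cut a body at the cap, as described for Copilot agents."""
    return body[:cap]


def chunk(body: str, cap: int = WINDSURF_WORKSPACE_CAP) -> list[str]:
    """Split a body into pieces of at most `cap` characters."""
    return [body[i:i + cap] for i in range(0, len(body), cap)]
```

Passing an absolute path as `project_dir` mirrors install mode, while the default `..` mirrors the committed `providers/` bundles.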
### Installer management
```bash
pentest-agents list # detect which targets are installed
pentest-agents install --targets claude_code,codex --scope global
pentest-agents install --dry-run # preview every file + JSON merge
pentest-agents verify # check manifest vs. disk (drift)
pentest-agents uninstall # reverse, restore .pa-backup files
pentest-agents render --targets all # regenerate providers/<id>/
pentest-agents render --check # drift gate (exit 1 if dirty)
```
Every install records a manifest (`.pentest-agents/manifest.json` for project
scope, `~/.config/pentest-agents/manifest.json` for global). Uninstall only
removes files we wrote and surgically strips only the MCP/JSON keys we merged —
your other settings are never touched. When a write conflicts with an existing
file, the original is backed up as `<path>.pa-backup` and restored on uninstall.
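The "strip only what we merged" behaviour can be illustrated with a small sketch. It assumes the merged keys live under an `mcpServers` object and that the manifest stores the list of key names that were added; both function names are hypothetical, and skipping keys that already exist is a simplification of whatever conflict handling the installer really applies.

```python
import copy


def merge_mcp_servers(settings: dict, ours: dict) -> tuple[dict, list[str]]:
    """Merge our MCP entries; return the new settings and the keys we added."""
    merged = copy.deepcopy(settings)
    servers = merged.setdefault("mcpServers", {})
    added = []
    for name, spec in ours.items():
        if name not in servers:  # simplification: leave existing entries alone
            servers[name] = spec
            added.append(name)
    return merged, added


def strip_mcp_servers(settings: dict, recorded: list[str]) -> dict:
    """Remove only the keys recorded at install time; keep everything else."""
    cleaned = copy.deepcopy(settings)
    servers = cleaned.get("mcpServers", {})
    for name in recorded:
        servers.pop(name, None)
    return cleaned
```

Because uninstall works from the recorded key list rather than from the current file, entries the user added after install survive untouched.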
## Workflow
```
New program: /new → /sync → /brain init → /analyze → /surface → /hunt
Returning: /resume <target> → /hunt or /autopilot
After finding: /validate → /chain → /report → /dupcheck → /submit → /learn
Batch triage: /triage (7-Question Gate on all findings)
```
## MCP Servers (2)
### bounty-platforms (16 platforms)
HackerOne (full API), Bugcrowd, Intigriti, Immunefi (public), YesWeHack + 11 stubs.
7 MCP tools: list_platforms, get_program_scope, get_program_policy, search_hacktivity, sync_program, draft_report, submit_report.
### writeup-search (BYO index)
A searchable knowledge base that agents query during hunting and validation.
4 MCP tools:
- `search_writeups` — semantic search (FAISS) or keyword search for prior art
- `get_writeup` — full writeup content by ID
- `search_techniques` — exploitation techniques by vuln class
- `search_payloads` — curated payloads from `rules/payloads.md`
> **The writeup index is not bundled.** Bulk-redistributing scraped hacktivity violates most platform ToS, so this repo ships the server only. The `search_payloads` + `search_techniques` fallback works out of the box; the semantic/keyword layers activate once you point the server at your own index.
**Three search modes** (auto-detected, graceful fallback):
| Mode | Requires | Searches |
|------|----------|----------|
| **FAISS** (semantic) | `faiss-cpu`, `sentence-transformers`, your `metadata.db` + `index.faiss` | Your writeup corpus via vector embeddings |
| **SQLite** (keyword) | Your `metadata.db` only | Your writeup corpus via `LIKE` over the text column |
| **Local** (default) | Nothing — zero deps | `rules/payloads.md` + `skills/` shipped in this repo |
Point the server at your index by dropping `metadata.db` (+ optionally `index.faiss`) into `~/.local/share/pentest-writeups/`, or set `WRITEUP_DB_DIR=/path/to/dir`.
**Expected schema** (`metadata.db`): a SQLite file with at least one table containing columns `id`, `title`, `url`, and one text column (`content` / `text` / `body` / `writeup`). Row order in the table must match vector order in `index.faiss` when using semantic mode.
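To make the schema concrete, here is a minimal sketch of a compatible database and a `LIKE`-based keyword query. The table name `writeups`, the choice of `content` as the text column, and the two placeholder rows are assumptions for illustration only; the server may discover the table and column differently.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database with the minimum expected columns."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE writeups "
        "(id INTEGER PRIMARY KEY, title TEXT, url TEXT, content TEXT)"
    )
    conn.executemany(
        "INSERT INTO writeups (id, title, url, content) VALUES (?, ?, ?, ?)",
        [
            # Placeholder rows, not real writeups.
            (1, "SSRF via PDF renderer", "https://example.com/1",
             "Blind SSRF reached the metadata endpoint."),
            (2, "Stored XSS in profile", "https://example.com/2",
             "Stored XSS fired in the admin panel."),
        ],
    )
    return conn


def keyword_search(conn: sqlite3.Connection, term: str, limit: int = 5) -> list[tuple]:
    """Keyword search over the text column using LIKE."""
    # Escape LIKE wildcards so a literal % or _ in the query is not a pattern.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT id, title, url FROM writeups "
        "WHERE content LIKE ? ESCAPE '\\' ORDER BY id LIMIT ?",
        (f"%{escaped}%", limit),
    )
    return rows.fetchall()
```

Ordering by `id` keeps results stable; in semantic mode the same row order is what ties each record to its vector in `index.faiss`.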
### Build your own index — `rag-builder/`
The repo ships a local RAG/FAISS builder under [`rag-builder/`](rag-builder/) that turns a list of GitHub / GitLab repositories into a `metadata.db` + `index.faiss` pair the writeup-search MCP server consumes. Destructive operations (clone, embed, write) are **always gated behind `--execute`**: running the CLI without it prints the plan and changes nothing, so a stray command cannot wipe an existing index.
```bash
cd rag-builder
# 1. Inspect the plan — no network, no writes.
python3 build.py status
python3 build.py ingest # dry-run (the default)
# 2. Opt-in pre-flight: probe every URL with `git ls-remote` (network).
python3 build.py ingest --check-remotes # ~5s for 141 repos at 16 workers
# 3. Actually clone + index every repo from repos.yaml into ./data/.
python3 build.py ingest --execute
python3 build.py ingest --execute --check-remotes # skip unreachable first
# 4. Point the MCP server at the output.
export WRITEUP_DB_DIR="$PWD/data"
python3 ../mcp-writeup-server/server.py --test
```
`rag-builder/repos.yaml` ships with a 146-entry seed covering CTF archives, bug-bounty reports, payload collections, and research aggregators — edit freely. `repos-skipped.yaml` is loaded automatically as an exclusion list (override with `--skip-list` or `--no-skip-list`). `config.yaml` controls the embedding model (`all-MiniLM-L6-v2` by default), host allowlist, clone size cap, and file-size ceiling. See [`rag-builder/README.md`](rag-builder/README.md) for the full reference.
## CC Hooks (automatic cost tracking)
The hooks are configured in `settings.json` and fire automatically:
- **SubagentStop** → `cost_hook.py` logs agent name + session to `cost-tracking.json`
- **Stop** → logs session end
- **SessionStart** → welcome message
The statusline shows the live cost computed from session token data, for example `$0.57`.
## Commands (26)
### Hunting & Analysis
| Command | Description |
|---------|-------------|
| `/hunt <target> [--vuln-class]` | Active hunting — searches writeup DB for techniques first, then tests with concrete payloads |
| `/autopilot <target>` | Autonomous loop with --paranoid/--normal/--yolo checkpoints |
| `/surface <target>` | P1/P2/Kill ranked attack surface |
| `/chain` | Build A→B→C exploit chains via chain-builder agent (9 capability rows + 4 documented deep chains in `rules/chain-table.md`) |
| `/analyze <target>` | AI analysis: crown jewels, attack paths, blind spots |
| `/mindmap <target>` | Attack surface tree with brain status |
| `/sast <repo>` | Source-code vulnerability hunting (entry → flow → gap → exploit pipeline) |
### Validation & Reporting
| Command | Description |
|---------|-------------|
| `/validate <finding>` | 7-Question Gate → PASS/KILL/DOWNGRADE/CHAIN REQUIRED |
| `/triage` | Batch-validate ALL findings, kill weak ones |
| `/quality <draft>` | Score report 1-10 (blocks below 7) |
| `/report [format]` | Reports (hard gate: requires /validate PASS) |
| `/dupcheck <desc>` | Hacktivity + writeup DB for duplicates |
| `/submit <finding>` | Submit (hard gate: /validate PASS + /quality ≥ 7) |
### Session & Memory
| Command | Description |
|---------|-------------|
| `/resume <target>` | Resume — untested endpoints + suggestions |
| `/remember` | Log finding/pattern for cross-target learning |
| `/learn <id> <status>` | Record response — auto-boosts paid techniques |
| `/brain` | init, brief, status, endpoint, endpoints, record, exhausted |
### Infrastructure
| Command | Description |
|---------|-------------|
| `/new`, `/sync`, `/status` | Setup + dashboard |
| `/pipeline`, `/quickscan`, `/fullscan` | Scanning pipelines |
| `/correlate` | Chain discovery across findings |
| `/cost`, `/monitor` | Cost tracking, target change detection |
## Agents (50)
### H1 Weakness Specialists (19)
xss-hunter (#60/#61/#62), sqli-hunter (#67), csrf-hunter (#57), ssrf-hunter (#75), ssti-hunter (#74), idor-hunter (#55), auth-tester (#27), info-disclosure (#18), open-redirect (#38), rce-hunter (#70), xxe-hunter (#63), file-upload (#39), cors-hunter (#58), subdomain-takeover (#145), business-logic (#28), race-condition (#29), privilege-escalation (#26), oauth-hunter (#1/#22/#106/#137), llm-ai-hunter (chains under #18/#55/#61/#70/#106)
### Hunting & Analysis (3)
- **validator** — 7-Question Gate + never-submit list (PASS/KILL/DOWNGRADE/CHAIN)
- **chain-builder** — A→B chain walk against the capability table, searches writeup DB for proven chains
- **recon-ranker** — P1/P2/Kill surface ranking
### Infrastructure / Recon (10)
recon, vuln-scanner, config-auditor, cloud-recon, js-analyzer, waf-profiler, graphql-audit, nuclei-writer, browser-agent (Burp MCP), browser-stealth-agent (Camoufox)
### Meta / Validation (9)
brain, correlator, quality-check, monitor, poc-builder, report-writer, scope-check, browser-verifier (client-side PoC proof), dast-devils-advocate (adversarial downgrade)
### SAST Pipeline (8)
sast-file-ranker, sast-entry-mapper, sast-danger-mapper, sast-flow-tracer, sast-gap-analyzer, sast-devils-advocate, sast-hunter, sast-exploit-builder
### Specialized (1)
web3-auditor — Solidity grep arsenal, Foundry PoC, DeFi patterns
## Hunting Skills (5 deep methodology skills + 6 reference skills = 11)
The hunt-* skills are vuln-class-specific methodology files distilled from
public bug-bounty reports. Each has a verified 2024-2026 CVE catalog and
sub-techniques. The matching specialist agent reads its skill via
`Read $CLAUDE_PROJECT_DIR/skills/hunt-<class>/SKILL.md` before testing.
| Skill | Lines | Pairs With | Highlights |
|-------|-------|------------|------------|
| `skills/hunt-rce/SKILL.md` | 1,135 | rce-hunter | 1,218-report distillation. RSC CVE-2025-55182, runc Leaky Vessels, BentoML pickle, LangChain REPL, Tekton/OpenProject git arg injection, ingress-nginx, container/runtime, ML serving, agentic LLM tool-use, OSS supply chain |
| `skills/hunt-idor/SKILL.md` | 969 | idor-hunter | 1,117-report distillation. Sam Curry automotive chain, OneUptime CVE-2026-30956, Zitadel V2Beta/Mgmt API, Inforcer tenant enum, Apache Answer UUIDv1 prediction, Indico BOLA, GraphQL field-level pivots, agentic AI cross-tenant |
| `skills/hunt-xss/SKILL.md` | 968 | xss-hunter | DOMPurify mXSS family, Auth0 nextjs-auth0 returnTo, RSC DoS family, markdown-to-jsx, listmonk admin-ATO, Trix rich-text editor (H1 #2819573 / #2521419), Jupyter notebook XSS (GHSA-rch3-82jr-f9w9), n8n MCP OAuth XSS (GHSA-537j-gqpc-p7fq), LinkedIn-class iframe-in-article (H1 #2212950), 10 sub-techniques (A-J), Semgrep / ast-grep / ripgrep / CodeQL patterns |
| `skills/hunt-oauth/SKILL.md` | 770 | oauth-hunter | 365-report distillation. ruby-saml parser differentials, Authentik regex `redirect_uri`, workers-oauth-provider PKCE downgrade, Entra ID actor token, Hono JWT alg confusion, nOAuth, Tekton token exfil, Argo CD project token, tinyauth |
| `skills/hunt-llm-ai/SKILL.md` | 930 | llm-ai-hunter | OWASP LLM Top 10 v2025 + Agentic AI Top 10. Microsoft 365 Copilot ASCII Smuggling, LangChain GmailToolkit indirect injection (CVE-2025-46059), LangChain PythonREPLTool semantic RCE (CVE-2025-68613), BentoML pickle, Ollama RCE family, Open WebUI SSE injection, MLflow path traversal |
Reference skills (read by methodology-aware agents): `hunting-methodology`,
`recon-methodology`, `report-writing`, `sast-methodology`,
`triage-validation`, `vuln-classes`.
## CLI Tools (19)
| Tool | Purpose |
|------|---------|
| brain.py | Brain with endpoint tracking + circuit breaker |
| intel_engine.py | Hacktivity patterns + tech→vuln mapping |
| journal.py | JSONL session journal for /resume |
| target_selector.py | Program ROI ranking |
| cost_hook.py | CC hook: auto-logs agent completions via SubagentStop |
| statusline.py | Dashboard (--compact/--watch/--json) |
| scope_check.py | Scope validation with --list |
| scope_hook.py | PreToolUse hook: blocks out-of-scope Bash commands (exact + wildcard) |
| cvss_version_guard.py | Enforces H1 = CVSS 3.1, other platforms = CVSS 4.0 |
| file_path_guard.py | Blocks hallucinated file paths in reports |
| file_safety.py | Shared safety checks for agent-written files |
| dedup_findings.py | Dedup + hacktivity cross-reference |
| global_brain.py | Cross-engagement knowledge (incremental hash-based sync) |
| response_tracker.py | Response learning + auto-boost paid techniques |
| scaffold.py | Workspace scaffolding with update mode |
| capture.py | Screenshots + video (WSL2) |
| cost.py | Token cost tracking + ROI |
| camofox_ctl.sh | Camoufox (stealth Firefox) lifecycle — Cloudflare/Akamai bypass |
| pentest-statusline.sh | CC statusline: findings, brain, context, cost |
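The exact-plus-wildcard matching that `scope_hook.py` is described as doing can be approximated with `fnmatch`. This sketch is not the shipped hook: it takes the scope as a plain list of patterns rather than parsing `scope.yaml`, the helper names are hypothetical, and the host regex is deliberately naive (it would also treat a filename such as `out.txt` as a host).

```python
import re
from fnmatch import fnmatchcase

# Naive hostname matcher: optional scheme, dotted labels, alphabetic TLD.
HOST_RE = re.compile(r"(?:https?://)?((?:[a-z0-9-]+\.)+[a-z]{2,})", re.IGNORECASE)


def extract_hosts(command: str) -> list[str]:
    """Pull hostname-like tokens out of a shell command string."""
    return [host.lower() for host in HOST_RE.findall(command)]


def in_scope(host: str, scope: list[str]) -> bool:
    """True when the host equals an entry or matches a wildcard entry."""
    return any(fnmatchcase(host.lower(), pattern.lower()) for pattern in scope)


def blocked_hosts(command: str, scope: list[str]) -> list[str]:
    """Return the hosts in the command that fall outside the scope."""
    return [host for host in extract_hosts(command) if not in_scope(host, scope)]
```

Note that `*.example.com` does not match the apex `example.com`, so a scope that covers both needs both entries. A hook would block the tool call whenever `blocked_hosts` returns a non-empty list.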
## Rules Library (`rules/`)
Single source of truth for every agent — all hunters, validators, and report-writers
read the relevant files at session start.
| File | Lines | Purpose |
|------|-------|---------|
| `hunting.md` | 360 | 31 hunting rules (Rule 0 harm check, Rule 8 sibling check, Rule 9 A→B signal, Rule 19 never-submit, Rule 24 mutation matrix, Rule 28 detection-token rotation, Rule 30 no cross-region inference, Rule 31 unauth state-change battery) |
| `payloads.md` | 2,605 | XSS (incl. Detection Mechanism Rotation Ladder) / SSRF / SQLi / IDOR / OAuth / upload / race / SSTI / deser / JWT / LFI / prototype pollution / NoSQLi / DeFi |
| `techniques.md` | 389 | Proven attack techniques extracted from real paid engagements |
| `waf-bypass-protocol.md` | 166 | WAF bypass iteration ladder for Akamai/Cloudflare/Imperva |
| `vendor-status.md` | 127 | Patched vendor vectors, framework fingerprints, cooldown tables |
| `chain-table.md` | 192 | Capability→next-bug chain table for `/chain` (9 capability rows + 4 documented deep chains) |
| `never-submit.md` | 42 | Never-submit list + conditionally-valid-with-chain table |
| `mistakes.md` | 665 | Top 10 most common mistakes — every agent reads this at session start |
## Key Features
- **Writeup search MCP**: Agents query prior art during hunting — bring your own FAISS/SQLite writeup index, or fall back to the shipped payload/technique library
- **CC hooks**: SubagentStop/Stop auto-log costs, statusline shows live `$X.XX` from token data
- **PreToolUse scope hook**: Bash commands are matched (exact + wildcard) against `scope.yaml`; out-of-scope targets are blocked before the tool call fires
- **7-Question Gate**: Every finding validated — first NO = KILL
- **Depth Engine**: `/autopilot` enforces an anti-shallow protocol — no claim of "exhausted" until the exhaustion matrix is complete
- **Stacked-encoding mandate**: `/hunt` and `/autopilot` require multi-layer encoding in every payload attempt before declaring a surface clean
- **CVSS policy guard**: HackerOne findings use CVSS 3.1; every other platform uses CVSS 4.0 — enforced by `cvss_version_guard.py`
- **Circuit breaker**: 5× consecutive 403/429 → auto-backoff 60s
- **Endpoint tracking**: Brain records every endpoint tested per target
- **Hard validation gates**: /report and /submit refuse without /validate PASS
- **Never-submit filter**: Pipeline auto-kills informational findings
- **Incremental sync**: Global brain hash-based, skips unchanged files
- **Feedback loop**: /learn auto-boosts paid techniques globally
- **Session journal**: JSONL log for /resume continuity
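The circuit breaker in the list above (five consecutive 403/429 responses trigger a 60-second backoff) can be sketched as a small state machine. The class below is a hypothetical illustration rather than the logic in `brain.py`; the injectable clock exists only to make the behaviour testable.

```python
import time
from collections.abc import Callable


class CircuitBreaker:
    """Open for `backoff_seconds` after `threshold` consecutive 403/429s."""

    def __init__(
        self,
        threshold: int = 5,
        backoff_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.backoff_seconds = backoff_seconds
        self.clock = clock
        self.consecutive = 0
        self.open_until = 0.0

    def record(self, status_code: int) -> None:
        """Track one response; any other status resets the streak."""
        if status_code in (403, 429):
            self.consecutive += 1
            if self.consecutive >= self.threshold:
                self.open_until = self.clock() + self.backoff_seconds
                self.consecutive = 0
        else:
            self.consecutive = 0

    def wait_time(self) -> float:
        """Seconds the caller should still wait before the next request."""
        return max(0.0, self.open_until - self.clock())
```

A request loop would call `wait_time()` before each request, sleep for that long if it is positive, and call `record()` with every response status.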
## Requirements
- Python 3.10+, `uv` (MCP servers launch via `uv run --with mcp`)
- Optional: `uv pip install faiss-cpu sentence-transformers` (for writeup semantic search)
- Security tools: nmap, httpx, subfinder, nuclei, ffuf, katana, sqlmap
- GraphQL hunter tools: `graphql-path-enum` — `cargo install --git https://gitlab.com/dee-see/graphql-path-enum` (auto-installed by `setup-mcp.sh` if `cargo` is present)
- Evidence: grim/scrot, wf-recorder/ffmpeg
- jq (for statusline)
## License
For authorized security testing only. Follow responsible disclosure.