Your AI agent reads untrusted repos. CloneGuard watches what it does next.
Hook-level defense for AI coding agents. Detects prompt injection, constrains suspicious tool calls, and emits structured audit logs -- before the agent executes. The agent cannot disable it because CloneGuard runs at the hook layer, outside the agent's control.
Built for Claude Code and Gemini CLI. Standalone scanning works with any agent. Active development (v0.6.0) -- feedback and contributions welcome.
A repo with a hidden .clinerules payload — CloneGuard catches it on scan.
More demos
Behavioral sequence detection — reads .env, then tries to curl the data out. First step allowed, second step blocked.
Package hallucination — agent tries to install a package that doesn't exist on PyPI.
pip install cloneguard
cloneguard init --globalThat's it. CloneGuard is now scanning every tool call in Claude Code. No config files to write, no agent restart required.
Want the semantic classifier (recommended):
pip install "cloneguard[mini]"- Prompt injection patterns -- 240 rules across 34 categories, from instruction override to reasoning hijack to MCP tool poisoning
- Behavioral sequences -- credential file read followed by network exfiltration attempt (SEQ-001), config writes for privilege escalation (SEQ-005), and more
- Package hallucination -- agent tries to install a package that doesn't exist on PyPI/npm. If an attacker had registered that name first, you'd be running their code
- Sensitive file access -- detects reads of credentials, SSH keys, and environment files in suspicious context
Four defense layers, each running before the agent can act:
Layer 0 Pre-execution Scans repo files before agent launches
Layer 1 InstructionsLoaded Scans CLAUDE.md / rules files when loaded
Layer 2 PostToolUse Scans all tool output for injected instructions
Layer 3 PreToolUse Gates writes, builds, and config changes
Detection signals:
| Signal | What | Speed |
|---|---|---|
| Pattern matching | 240 compiled regex rules, 34 categories | <50ms |
| Semantic classifier | Fine-tuned MiniLM-L6-v2 ONNX model (94.3% F1) | ~16ms/sample |
| Behavioral sequences | CaMeL-lite session-wide tool-call monitoring | <0.5ms/event |
When a detection fires, CloneGuard can report it (default), constrain the tool call via OS-level sandbox, or block it outright -- configurable per-rule and per-severity via YAML policy.
False positive rates validated against 208,127 real coding-agent sessions from published SWE-bench datasets (SEQ-001 FPR: 0.0024%).
| Platform | Hook Integration | Standalone Scan | Status |
|---|---|---|---|
| Claude Code | Tested | Yes | cloneguard init configures hooks |
| Gemini CLI | Tested | Yes | Manual hook config, auto-normalizes format |
| Cursor | Untested | Yes | Same hook protocol, manual config required |
| Windsurf | Untested | Yes | Same hook protocol, manual config required |
| GitHub Actions | -- | Yes | cloneguard scan --sarif for Security tab |
| Any agent | -- | Yes | cloneguard scan /path/to/repo |
Hook integration tested with Claude Code and Gemini CLI 0.37. Cursor and Windsurf use the same hook protocol and are expected to work with manual configuration but have not been tested. Feedback welcome.
CloneGuard defaults to detection-only mode (dry-run). When enforcement is enabled, tool calls receive one of three verdicts:
| Verdict | Meaning | Default action |
|---|---|---|
| SAFE | No signals fired | Allow |
| SUSPICIOUS | Low-confidence match | Constrain (sandbox) |
| MALICIOUS | High-confidence match | Block |
Constraint uses OS-level sandboxing -- Landlock on Linux, Seatbelt on macOS -- to restrict filesystem and network access for the tool call subprocess without affecting CloneGuard itself. Additional adapters available for Docker, gVisor, Firecracker, and WASM isolation.
Configure via ~/.cloneguard/policy.yaml. See the
policy engine docs
for details.
CloneGuard is in active development. The core detection engine is tested against 240 rules, 1,677 automated tests, and adversarial evaluations including multi-model red teaming. False positive rates were calibrated against 208,127 real coding-agent sessions from published SWE-bench datasets.
Enterprise features (OPA/Cedar policy backends, SIEM connectors, fleet deployment tooling) are early-stage and should be considered experimental.
Known limitations are documented in the evaluation section of the docs site.
git clone https://github.com/prodnull/cloneguard.git
cd cloneguard
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev,mini]"
pytestFull documentation at prodnull.github.io/cloneguard.
- Getting Started -- 5-minute setup for Claude Code
- Architecture -- defense layers, signal flow, enforcement pipeline
- Evaluation -- adaptive red team methodology and results
- Limitations -- what CloneGuard does not catch
- Making Prompt Injection Harder Against AI Coding Agents -- architecture and design decisions
- What Happens When Someone Tries to Break It -- adversarial hardening
- From Catching Payloads to Catching Behavior -- behavioral pivot
- What Claude Code's Leaked Permission Classifier Misses -- gap analysis