prodnull/cloneguard

CloneGuard

Your AI agent reads untrusted repos. CloneGuard watches what it does next.

Hook-level defense for AI coding agents. Detects prompt injection, constrains suspicious tool calls, and emits structured audit logs -- before the agent executes. The agent cannot disable it because CloneGuard runs at the hook layer, outside the agent's control.

Built for Claude Code and Gemini CLI. Standalone scanning works with any agent. Active development (v0.6.0) -- feedback and contributions welcome.

See It In Action

A repo with a hidden .clinerules payload — CloneGuard catches it on scan.

More demos

Behavioral sequence detection — reads .env, then tries to curl the data out. First step allowed, second step blocked.

Package hallucination — agent tries to install a package that doesn't exist on PyPI.

Quick Start

pip install cloneguard
cloneguard init --global

That's it. CloneGuard is now scanning every tool call in Claude Code. No config files to write, no agent restart required.

To add the semantic classifier (recommended), install the mini extra:

pip install "cloneguard[mini]"

What It Catches

  • Prompt injection patterns -- 240 rules across 34 categories, from instruction override to reasoning hijack to MCP tool poisoning
  • Behavioral sequences -- credential file read followed by network exfiltration attempt (SEQ-001), config writes for privilege escalation (SEQ-005), and more
  • Package hallucination -- agent tries to install a package that doesn't exist on PyPI/npm. If an attacker had registered that name first, you'd be running their code
  • Sensitive file access -- detects reads of credentials, SSH keys, and environment files in suspicious context
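
The package hallucination check in the list above can be sketched roughly as follows. This is an illustration, not CloneGuard's implementation: the function names (pypi_has_package, extract_pip_packages, find_missing_packages) are hypothetical, and the command parsing is deliberately simplified. The only external fact relied on is that PyPI's JSON endpoint (https://pypi.org/pypi/NAME/json) answers HTTP 404 for a project that does not exist.

```python
import re
import shlex
import urllib.error
import urllib.request
from typing import Callable, List


def pypi_has_package(name: str) -> bool:
    """Return True if PyPI's JSON API knows the project, False on HTTP 404."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors are not evidence either way


def extract_pip_packages(command: str) -> List[str]:
    """Pull bare package names out of a simple `pip install ...` command.

    Simplification: options are skipped, but option values (for example the
    file after `-r`) are not recognised and would be treated as names.
    """
    tokens = shlex.split(command)
    if "install" not in tokens or not any(t in ("pip", "pip3") for t in tokens):
        return []
    names = []
    for arg in tokens[tokens.index("install") + 1:]:
        if arg.startswith("-"):
            continue
        # Drop version specifiers, extras, and environment markers.
        name = re.split(r"[=<>!~\[;]", arg, maxsplit=1)[0]
        if name:
            names.append(name)
    return names


def find_missing_packages(
    command: str, exists: Callable[[str], bool] = pypi_has_package
) -> List[str]:
    """List the requested packages that the registry lookup cannot find."""
    return [name for name in extract_pip_packages(command) if not exists(name)]
```

Passing the lookup as a parameter keeps the sketch testable offline: a stub such as `lambda name: name == "requests"` stands in for the registry, and the default falls back to the live PyPI query.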

How It Works

Four defense layers, each running before the agent can act:

Layer 0  Pre-execution       Scans repo files before agent launches
Layer 1  InstructionsLoaded  Scans CLAUDE.md / rules files when loaded
Layer 2  PostToolUse         Scans all tool output for injected instructions
Layer 3  PreToolUse          Gates writes, builds, and config changes
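
As a rough illustration of what scanning text for injected instructions with compiled regex rules might look like, here is a minimal sketch. The two rules, their DEMO-prefixed IDs, and the scan_text function are invented for this example and are not taken from CloneGuard's rule set.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    rule_id: str         # hypothetical identifier, not a real CloneGuard rule ID
    category: str
    pattern: re.Pattern  # compiled once at import time, reused for every scan


RULES = [
    Rule(
        "DEMO-001",
        "instruction_override",
        re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    ),
    Rule(
        "DEMO-002",
        "exfiltration",
        re.compile(r"curl\s+[^\n]*\$\(\s*cat\s+[^)]*\.env\s*\)", re.IGNORECASE),
    ),
]


def scan_text(text: str) -> List[str]:
    """Return the IDs of every rule whose pattern appears in the text."""
    return [rule.rule_id for rule in RULES if rule.pattern.search(text)]
```

A hook would call something like scan_text on file contents or tool output and pass any matching IDs on to the policy decision. Ordinary source code with no matching phrases yields an empty list.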

Detection signals:

| Signal | What | Speed |
|---|---|---|
| Pattern matching | 240 compiled regex rules, 34 categories | <50ms |
| Semantic classifier | Fine-tuned MiniLM-L6-v2 ONNX model (94.3% F1) | ~16ms/sample |
| Behavioral sequences | CaMeL-lite session-wide tool-call monitoring | <0.5ms/event |
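
The behavioral-sequence signal is stateful: a single event may be harmless, while the same event after an earlier one is not. A minimal sketch of that idea, assuming tool calls arrive as a tool name plus one string argument, is shown below. The SessionMonitor class, the marker lists, and the "Read" and "Bash" tool names are assumptions made for illustration, and the logic is far simpler than a real SEQ-001 rule would need to be.

```python
from dataclasses import dataclass, field
from typing import List

SENSITIVE_MARKERS = (".env", "id_rsa", ".aws/credentials")  # illustrative only
NETWORK_COMMANDS = ("curl", "wget", "nc")                    # illustrative only


@dataclass
class SessionMonitor:
    """Remembers sensitive reads for the lifetime of one agent session."""

    sensitive_reads: List[str] = field(default_factory=list)

    def observe(self, tool_name: str, argument: str) -> str:
        """Return "allow" or "flag" for one tool-call event."""
        if tool_name == "Read" and any(m in argument for m in SENSITIVE_MARKERS):
            self.sensitive_reads.append(argument)
            return "allow"  # a read on its own does not complete the sequence
        if tool_name == "Bash" and self.sensitive_reads:
            # A network command after a sensitive read completes the sequence.
            if any(word in NETWORK_COMMANDS for word in argument.split()):
                return "flag"
        return "allow"
```

With this sketch, reading .env is allowed, a following curl command in the same session is flagged, and a curl command in a session with no prior sensitive read is allowed.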

When a detection fires, CloneGuard can report it (default), constrain the tool call via OS-level sandbox, or block it outright -- configurable per-rule and per-severity via YAML policy.

False positive rates were validated against 208,127 real coding-agent sessions from published SWE-bench datasets (SEQ-001 FPR: 0.0024%).
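
To put the SEQ-001 figure in absolute terms, the arithmetic can be worked through directly. This assumes the rate is measured per session, which the text does not state explicitly.

```python
sessions = 208_127      # sessions in the validation set, from the text above
fpr_percent = 0.0024    # reported SEQ-001 false positive rate, in percent

# Assumption: the rate is per session, so the expected count is a simple product.
expected_false_positives = sessions * fpr_percent / 100
print(round(expected_false_positives))
```

Under that assumption the rate corresponds to roughly 5 flagged sessions out of the 208,127.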

Platform Support

| Platform | Hook Integration | Standalone Scan | Status |
|---|---|---|---|
| Claude Code | Tested | Yes | cloneguard init configures hooks |
| Gemini CLI | Tested | Yes | Manual hook config, auto-normalizes format |
| Cursor | Untested | Yes | Same hook protocol, manual config required |
| Windsurf | Untested | Yes | Same hook protocol, manual config required |
| GitHub Actions | -- | Yes | cloneguard scan --sarif for Security tab |
| Any agent | -- | Yes | cloneguard scan /path/to/repo |

Hook integration tested with Claude Code and Gemini CLI 0.37. Cursor and Windsurf use the same hook protocol and are expected to work with manual configuration but have not been tested. Feedback welcome.

Enforcement

CloneGuard defaults to detection-only mode (dry-run). When enforcement is enabled, tool calls receive one of three verdicts:

| Verdict | Meaning | Default action |
|---|---|---|
| SAFE | No signals fired | Allow |
| SUSPICIOUS | Low-confidence match | Constrain (sandbox) |
| MALICIOUS | High-confidence match | Block |
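
The verdict table, combined with the detection-only default, amounts to a small lookup. The sketch below shows one way to express it. The enum names, the resolve_action function, and the enforce flag are hypothetical and do not reflect CloneGuard's actual API or policy format.

```python
from enum import Enum


class Verdict(Enum):
    SAFE = "safe"
    SUSPICIOUS = "suspicious"
    MALICIOUS = "malicious"


class Action(Enum):
    ALLOW = "allow"
    CONSTRAIN = "constrain"
    BLOCK = "block"


# Default actions, taken from the verdict table above.
DEFAULT_ACTIONS = {
    Verdict.SAFE: Action.ALLOW,
    Verdict.SUSPICIOUS: Action.CONSTRAIN,
    Verdict.MALICIOUS: Action.BLOCK,
}


def resolve_action(verdict: Verdict, enforce: bool = False) -> Action:
    """Map a verdict to an action; detection-only mode never interferes."""
    if not enforce:
        return Action.ALLOW  # dry-run: the detection is reported, the call proceeds
    return DEFAULT_ACTIONS[verdict]
```

Keeping the mapping in a plain dictionary mirrors the idea of a per-severity policy: overriding one entry changes the behavior for that verdict without touching the detection code.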

Constraint uses OS-level sandboxing -- Landlock on Linux, Seatbelt on macOS -- to restrict filesystem and network access for the tool call subprocess without affecting CloneGuard itself. Additional adapters are available for Docker, gVisor, Firecracker, and WASM isolation.

Configure via ~/.cloneguard/policy.yaml. See the policy engine docs for details.

Development Status

CloneGuard is in active development. The core detection engine comprises 240 rules and is covered by 1,677 automated tests and adversarial evaluations, including multi-model red teaming. False positive rates were calibrated against 208,127 real coding-agent sessions from published SWE-bench datasets.

Enterprise features (OPA/Cedar policy backends, SIEM connectors, fleet deployment tooling) are early-stage and should be considered experimental.

Known limitations are documented in the evaluation section of the docs site.

Development

git clone https://github.com/prodnull/cloneguard.git
cd cloneguard
uv venv .venv && source .venv/bin/activate
uv pip install -e ".[dev,mini]"
pytest

Documentation

Full documentation at prodnull.github.io/cloneguard.

License

Apache 2.0
