A deterministic proof gate for agentic software development.
Signum helps teams use AI coding agents more safely by turning every change into a contract-driven, auditable workflow.
Instead of trusting that an agent “probably did the right thing”, Signum asks for evidence:
- What was the contract?
- What changed?
- Which checks passed?
- What risks were found?
- What artifacts prove the result?
- Is this safe enough to merge?
At the end, Signum produces a proofpack: a structured evidence bundle that CI and humans can inspect.
AI agents can move fast, but fast changes need reliable boundaries.
Signum adds a release-style verification layer around agentic work:
- Contract-first execution — define intent, scope, and acceptance criteria before implementation.
- Deterministic checks — validate what can be checked without relying on model judgment.
- Policy scanning — catch risky code patterns, dependency changes, secrets, and incomplete work.
- Proofpack output — package evidence into a structured artifact.
- GitHub-ready CI gate — make merge decisions easier to review.
Signum is not a replacement for engineering judgment. It is a guardrail system that makes agentic changes easier to inspect, reproduce, and trust.
Signum follows a simple flow:
Contract → Execute → Audit → Pack → CI Gate
A change starts with a contract: the requested outcome, boundaries, risks, and acceptance criteria.
The implementation runs against the contract. Signum keeps the work tied to the original intent.
Signum runs deterministic checks and policy scans. Optional reviewer tools can contribute further review signals.
Signum creates a proofpack: a structured evidence bundle containing the contract, diff, checks, audit summary, and decision metadata.
GitHub Actions can validate the proofpack and expose the result as a merge gate.
The .signum/ directory is a structured registry, state, and archive namespace.
- Normal runs do not create artifact files or runtime dirs such as .signum/reviews/ directly in the root .signum/ folder.
- Resume checks use the registry first, with root .signum/contract.json only as a legacy import signal.
A Signum run writes canonical artifacts under:
.signum/contracts/<contractId>/
Typical artifacts include:
contract.json
contract-engineer.json
contract-policy.json
combined.patch
execute_log.json
mechanic_report.json
policy_scan.json
holdout_report.json
audit_summary.json
proofpack.json
The most important output is:
proofpack.json
This is the evidence bundle used by CI and reviewers.
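The artifact layout above can be spot-checked with a few lines of Python. This is a minimal sketch, not part of Signum: the function name `missing_artifacts` is hypothetical, and it assumes every name in the "typical artifacts" list is expected for a given run, which may not hold for every contract.

```python
from pathlib import Path

# File names copied from the list of typical artifacts above. Whether every
# run emits all of them is an assumption of this sketch.
TYPICAL_ARTIFACTS = [
    "contract.json",
    "contract-engineer.json",
    "contract-policy.json",
    "combined.patch",
    "execute_log.json",
    "mechanic_report.json",
    "policy_scan.json",
    "holdout_report.json",
    "audit_summary.json",
    "proofpack.json",
]


def missing_artifacts(repo_root: str, contract_id: str) -> list[str]:
    """Return the typical artifact names not found for the given contract."""
    contract_dir = Path(repo_root) / ".signum" / "contracts" / contract_id
    return [
        name for name in TYPICAL_ARTIFACTS
        if not (contract_dir / name).is_file()
    ]
```

Calling `missing_artifacts(".", "<contractId>")` with a real contract ID returns an empty list when all ten files are present.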
Signum expects a minimal local toolchain (>= v4.18.0):
bash
git
jq
python3
Use the canonical init command:
/signum:init --harness
For Claude Code usage, install the Claude Code CLI according to your environment.
Optional reviewer tools may be used when available, but Signum keeps deterministic checks separate from model-based review.
bash scripts/run-deterministic-tests.sh
bash scripts/run-cleanroom-smoke.sh
For a deeper pre-publish check:
SIGNUM_CLEANROOM_FULL=1 bash scripts/run-cleanroom-smoke.sh

python3 scripts/validate_proofpack.py \
.signum/contracts/<contractId>/proofpack.json \
--repo-root .
Signum includes a GitHub Actions template for validating proofpacks in CI.
The CI path is intentionally deterministic:
- no hidden background work;
- no required external AI reviewer;
- no secrets needed for deterministic tests;
- pinned GitHub Actions refs;
- fixed Ubuntu runner label;
- clean-room smoke coverage.
The high-risk PR intake gate is intentionally strict. PRs touching sensitive paths such as workflows, scripts, command orchestration, or policy logic may require maintainer review or override.
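To illustrate the idea of a path-based intake gate, here is a rough sketch in Python. The glob patterns and the `sensitive_paths` helper are hypothetical stand-ins for the categories named above; the path list the real gate uses is not shown in this document.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for "workflows, scripts, command orchestration".
# Signum's actual sensitive-path list may differ.
SENSITIVE_PATTERNS = [
    ".github/workflows/*",
    "scripts/*",
    "commands/*",
]


def sensitive_paths(changed_files):
    """Return the changed files that match any sensitive pattern, in order."""
    return [
        path for path in changed_files
        if any(fnmatchcase(path, pattern) for pattern in SENSITIVE_PATTERNS)
    ]
```

A non-empty result would be the signal to request maintainer review. Note that `*` in `fnmatchcase` also matches `/`, so nested files under these directories are included.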
Signum separates three kinds of evidence.
Checks that can run without model judgment:
- proofpack validation;
- policy scanner;
- DSL runner validation;
- artifact path guards;
- command renderer parity;
- clean-room smoke tests.
Optional reviewer outputs can be included when available, but they are treated as review signals, not as the only source of truth.
Large or high-risk changes still require human judgment. Signum makes that review easier by packaging the relevant evidence.
Signum includes a deterministic policy scanner with stable rule IDs.
It can detect patterns such as:
- dynamic code execution;
- XSS sinks;
- SQL injection patterns;
- shell injection risks;
- weak crypto;
- suspicious incomplete code markers;
- dependency additions.
False positives can be explicitly suppressed with a visible rule-based marker:
SIGNUM_POLICY_ALLOW:<RULE_ID>:<reason>
Critical findings are not suppressible by default.
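As an illustration of the marker format, the sketch below extracts the rule ID and reason from a source line. It is an assumption-laden example, not the scanner's own parser: the allowed characters for rule IDs, the `parse_suppression` name, and the `DYN_EXEC` rule ID in the usage note are all hypothetical.

```python
import re

# Assumed shape: the rule ID uses letters, digits, underscores, or hyphens,
# and the reason runs to the end of the line. The real scanner may be stricter.
MARKER = re.compile(
    r"SIGNUM_POLICY_ALLOW:(?P<rule_id>[A-Za-z0-9_-]+):(?P<reason>.+)"
)


def parse_suppression(line):
    """Return (rule_id, reason) if the line carries a marker, else None."""
    match = MARKER.search(line)
    if match is None:
        return None
    reason = match.group("reason").strip()
    if not reason:
        # A marker without a stated reason is treated as invalid here.
        return None
    return match.group("rule_id"), reason
```

For example, a line ending in `# SIGNUM_POLICY_ALLOW:DYN_EXEC:trusted input` would yield the pair `("DYN_EXEC", "trusted input")`. Requiring a non-empty reason keeps each suppression visible and explained in review.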
Proofpacks are validated before CI consumes their result.
The validator checks:
- required fields;
- schema and Signum version;
- decision metadata;
- artifact references;
- safe relative paths;
- optional removal evidence shape when present.
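The "safe relative paths" check can be pictured roughly as follows. This is a sketch of one common approach, not the logic in scripts/validate_proofpack.py: the `is_safe_relative` name is hypothetical, and it assumes artifact references use forward slashes.

```python
from pathlib import Path, PurePosixPath


def is_safe_relative(ref: str, repo_root: str) -> bool:
    """Return True if ref is a relative path that stays inside repo_root."""
    path = PurePosixPath(ref)
    # Reject empty references, absolute paths, and any parent traversal.
    if not ref or path.is_absolute() or ".." in path.parts:
        return False
    root = Path(repo_root).resolve()
    # Resolving also follows symlinks, so a link pointing outside the
    # repository is rejected as well.
    resolved = (root / ref).resolve()
    return root in resolved.parents
```

With this sketch, `.signum/contracts/<contractId>/proofpack.json` is accepted, while `../secrets.json` and `/etc/passwd` are rejected.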
Run the validator directly:
python3 scripts/validate_proofpack.py path/to/proofpack.json --repo-root .
The main Signum command is generated from fragments.
Runtime command files remain checked in, but renderer checks ensure fragments reproduce them byte-for-byte:
python3 scripts/render_signum_command.py \
--manifest commands/signum.fragments/manifest.json \
--output commands/signum.md \
--check
Claude Code overlay rendering is checked separately:
python3 platforms/claude-code/scripts/render_signum_command.py \
--manifest platforms/claude-code/commands/signum.fragments/manifest.json \
--output platforms/claude-code/commands/signum.md \
--check
Use Signum when:
- AI agents are modifying important code;
- changes need auditability;
- PRs should include structured evidence;
- you want deterministic gates before merge;
- you need a repeatable contract-first workflow.
Signum is especially useful for:
- AI coding agent workflows;
- internal developer tools;
- CI/CD guardrails;
- security-sensitive automation;
- multi-agent development experiments.
Signum may be too heavy if:
- you only need a simple one-off script;
- there is no CI or review process;
- you do not need audit artifacts;
- you want fully autonomous merging without human oversight.
Signum is designed to make agentic work safer, not invisible.
Signum is a stabilized baseline, not a full production certification system.
Known limitations:
- policy scanning is still regex-based, not a full semantic parser;
- optional reviewer tools depend on external CLI availability and authentication;
- GitHub-hosted runner images can still receive upstream patch updates;
- clean-room smoke is not a real package publish/install test;
- remote Emporium push is not tested by the local smoke path;
- high-risk PRs may still require maintainer review or override.
Run the deterministic suite:
bash scripts/run-deterministic-tests.sh
Run clean-room smoke:
bash scripts/run-cleanroom-smoke.sh
Run evals:
python3 evals/run.py
Run renderer checks:
python3 scripts/render_signum_command.py \
--manifest commands/signum.fragments/manifest.json \
--output commands/signum.md \
--check
python3 platforms/claude-code/scripts/render_signum_command.py \
--manifest platforms/claude-code/commands/signum.fragments/manifest.json \
--output platforms/claude-code/commands/signum.md \
--check
Signum includes a maintainer release path for syncing the plugin entry with the Emporium marketplace.
- Release smoke test: run bash lib/release-smoke.sh before publishing release metadata.
- Marketplace sync: the Sync Emporium marketplace entry workflow updates heurema/emporium/.claude-plugin/marketplace.json.
- Automation secret: non-dry-run cross-repo sync requires EMPORIUM_SSH_KEY.
- Manual trigger: the workflow supports workflow_dispatch so maintainers can run a controlled release dry-run or sync.
- Release trigger: the workflow also runs on release publication so marketplace metadata stays aligned with Signum releases.
This is maintainer-facing release documentation. It does not change the user-facing Signum runtime flow.
Signum is built around a simple principle:
AI-generated changes should be easy to inspect, reproduce, and verify.
A good agentic workflow should not ask reviewers to trust invisible reasoning.
It should produce a clear contract, deterministic checks, and a verifiable proofpack.
Signum is the seal on that process.
