feat(exec): detect and recover from zero-commit agent runs 🕵️#54

Closed
ivy wants to merge 44 commits into main from hive/675e291e-3009-416a-a9c9-e2ef3247b69f

@ivy ivy commented Feb 10, 2026

Why

When agents analyze design proposals or get stuck, they can exit successfully without producing any commits. Previously, this failure was only discovered at publish time when gh pr create crashed with "No commits between main and branch". This violated Principle 3 (Single Responsibility) — publish shouldn't discover that exec didn't do its job.

What

Two-layer approach: proactive prompt + reactive validation.

Proactive: Updated agent system prompt to explicitly state that Ready items should produce code changes and commits.

Reactive: After agent execution completes, exec now:

  1. Checks for commits using git rev-list --count main..HEAD (fast, happy-path check)
  2. If there are zero commits, asks the agent for a structured completion report via --json-schema
  3. If the agent claims completion without commits, resumes it with a nudge to actually implement the change
  4. If the agent reports blockers, fails the workspace with the blocker message
  5. Retries up to 3 times

Publish gets a belt-and-suspenders safety guard: checks for commits and skips PR creation if none exist.
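
A minimal sketch of such a commit check, assuming it shells out to the `git rev-list --count main..HEAD` command mentioned above. The function names, the `dir` and `base` parameters, and the messages printed in `main` are hypothetical; the real `workspace.HasNewCommits` may be shaped differently.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// parseCommitCount converts the output of `git rev-list --count` to an int.
func parseCommitCount(out string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(out))
}

// hasNewCommits reports whether HEAD in dir has commits that base lacks.
func hasNewCommits(dir, base string) (bool, error) {
	cmd := exec.Command("git", "rev-list", "--count", base+"..HEAD")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return false, fmt.Errorf("git rev-list: %w", err)
	}
	n, err := parseCommitCount(string(out))
	if err != nil {
		return false, fmt.Errorf("parse commit count %q: %w", out, err)
	}
	return n > 0, nil
}

func main() {
	ok, err := hasNewCommits(".", "main")
	switch {
	case err != nil:
		fmt.Println("could not check commits:", err)
	case !ok:
		fmt.Println("no commits on branch; skipping PR creation")
	default:
		fmt.Println("branch has commits; safe to open a PR")
	}
}
```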

Pattern: Follows prdraft's approach — RunCapture with --json-schema for structured output, retry loop with nudge messages. Preserves real-time logging during main agent run by only using RunCapture for validation.

Implementation details

  • workspace.HasNewCommits: Helper that counts commits on branch vs main
  • validateCompletion: Orchestrates the validation retry loop
  • getCompletionReport: Uses --resume + --json-schema to get structured {completed, summary, blockers}
  • nudgeForImplementation: Resumes agent with message to implement changes when it claims completion without commits
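
For illustration, the structured {completed, summary, blockers} report could be decoded along these lines. The JSON key names follow the shape given above; the Go types (in particular `blockers` as a list of strings) and the `parseCompletionReport` helper are assumptions, not the PR's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CompletionReport is a hypothetical Go view of the agent's structured reply.
type CompletionReport struct {
	Completed bool     `json:"completed"`
	Summary   string   `json:"summary"`
	Blockers  []string `json:"blockers"` // assumed to be a list of strings
}

// parseCompletionReport decodes the raw JSON captured from the agent.
func parseCompletionReport(raw []byte) (CompletionReport, error) {
	var r CompletionReport
	if err := json.Unmarshal(raw, &r); err != nil {
		return CompletionReport{}, fmt.Errorf("decode completion report: %w", err)
	}
	return r, nil
}

func main() {
	raw := []byte(`{"completed": true, "summary": "Reviewed the proposal", "blockers": []}`)
	r, err := parseCompletionReport(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("completed=%t blockers=%d\n", r.Completed, len(r.Blockers))
}
```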

Test coverage

  • Specs for HasNewCommits (fresh worktree → false, after commit → true)
  • Fixed test helper to use git init -b main for Git 2.53.0 compatibility
  • All existing tests pass

Generated with Hive | Closes #48

ivy and others added 30 commits February 10, 2026 01:31
…ths ✨

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New commands: `hive ls` lists sessions, `hive cd` spawns a shell in a
session's workspace. Shared resolveSession logic handles both UUID
direct lookup and ref-based resolution via claims. `hive attach` now
resolves by ref or UUID. `hive list` becomes a deprecated alias for ls.

Replaces manual `hive cleanup` with session-aware reaping. Scans for
terminal sessions past retention, removes workspaces/sessions/claims.
Detects stale claims with no active systemd unit and marks them failed.
Deprecates the cleanup command.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract reapSessions with injectable deps (isUnitActive, releaseOnSource,
removeWorkspace) so core reap logic is testable without systemd or GitHub
API. Tests cover: expired published/failed reaping, retention-window
preservation, stale session detection, and active unit skipping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Manual runs (hive run owner/repo#N) don't go through poll so sessions
lack board_item_id in SourceMetadata. Skip the source release for these
instead of hitting "unknown ref" from the ghprojects adapter's empty
cache. Also populate SourceMetadata when --board-item-id flag is passed.

Add poll.max-concurrent, poll.instance, reap section with retention
settings, and github.ready-option-id. Improve comments to document
each key's purpose and per-instance config pattern.

Repo map, pipeline, command table, and tech stack now reflect the
session/claim/source packages, systemd dispatch, reap, and new CLI
surface (ls, cd, attach).

Reflects the new dispatch model (systemd units, claims, sessions),
source abstraction, XDG data layout, session status lifecycle, and
updated CLI surface (ls, cd, reap).

The end-state specification that guided the 8-phase lifecycle
migration is now reality. Track it alongside architecture.md as
the reference for dispatch, sessions, claims, and systemd units.

Explains threat model (helpful-not-malicious), trust boundaries across
pipeline stages, credential isolation via systemd-run mounts, the .git
write trade-off, network access stance, author authorization, and
what's explicitly out of scope.

Guided tutorial taking the operator from a freshly built hive binary
through manually dispatching a single issue to a completed pull request.
Covers prepare, exec, publish, and the hive run shortcut.

Extract CLI examples (tutorial/how-to material), session status lifecycle
and poll loop detail (now in lifecycle.md). Add conceptual "why" framing
to each component, cross-links to security model and ADRs, data flow
diagram, and authz module.

Step-by-step: build from source, install binary, install systemd
units, verify toolchain.

Finding GraphQL IDs, setting up config.toml, per-instance config
for multi-project setups, poll and reap tuning.

Systemd user service setup, loginctl enable-linger, log checking,
zero-downtime binary upgrades, multi-instance deployment.

hive ls, hive cd, hive attach, reading .hive/ metadata, journald
logs, manual resume with hive exec --resume.

Writing GitHub issues that work as agent prompts, based on real
learnings from prototype usage.

Complete reference for every hive command with synopsis, flags,
config keys, credentials, exit codes, and examples.

All config keys organized by section, environment variables,
resolution order, and named instance support.

Session JSON schema, status state machine, claim file format,
workspace directory layout, and metadata files.

Source interface contract, WorkItem struct, ref format conventions,
and GitHub Projects adapter specifics.

Jail interface contract, RunOpts struct, systemd-run backend details,
sandbox properties, mount strategy, and credential isolation.
ivy and others added 14 commits February 10, 2026 02:31
All unit templates, instance specifiers, unit relationships,
and useful systemctl/journalctl commands.

Trim duplicated content that now lives in docs/, add Diátaxis-structured
documentation navigation, streamline quick start, and add commands table.
Link out to tutorial, how-to, reference, and explanation docs instead of
inlining setup/usage/sandboxing/workspace details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set version via -ldflags at build time so `hive --version` shows the
build timestamp. Logs version on startup for debugging. Defaults to
"dev" for unset builds.
Remove board-item-id (not written by prepare), replace metadata table
with inline mention + cross-link, cut flags table, trim reap details,
and shorten sandbox/retry explanations to single sentences with links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns with Diátaxis convention for how-to guide naming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
VERSION defaults to "dev" and is overridable via env var
(e.g. VERSION=v1.0.0 make build). Build timestamp is always
injected separately so version and build time are independent.
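
One common way to get this behavior in Go is the linker's `-X` flag, which overwrites package-level string variables at build time. The sketch below assumes variables named `version` and `buildTime` in package `main` and an output format of its own invention; the project's actual names and Makefile wiring are not shown in this PR.

```go
package main

import "fmt"

// Both values are overridable at build time, for example:
//
//	go build -ldflags "-X main.version=v1.0.0 -X main.buildTime=2026-02-10T02:31:00Z" .
//
// The variable names and the timestamp format are assumptions.
var (
	version   = "dev"     // stays "dev" when no version is injected
	buildTime = "unknown" // injected separately from version
)

// versionString formats what a --version flag might print.
func versionString() string {
	return fmt.Sprintf("hive %s (built %s)", version, buildTime)
}

func main() {
	fmt.Println(versionString())
}
```

Keeping the two variables separate means a tagged release and a local dev build can both report when they were built.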
debug-session: move status table and metadata listing to session ref links.
configure: move config search order to config reference link.
write-issues: remove "Why issue quality matters" explanation and
"Authorization"/"Board workflow" reference sections, add cross-links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jail-interface: stdin is not connected (only stdout/stderr are set).
source-interface: Complete is not called by publish — publish calls
gh.MoveToInReview() directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
architecture: replace inline Data Layout tree with cross-link to
reference/session.md, add links to jail and source interface refs.
lifecycle: update data layout links to reference/session.md, make
"reference docs" mention link to specific docs.
security-model: fix "silently fail" to "rejected with permission denied".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Repo map now includes reference/, how-to/, explanation/, tutorial/,
and prototype/ directories added during docs reorganization. Corrects
logging entry from nonexistent slog-journal to actual TextHandler,
and adds Key Docs entries for the full Diátaxis structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When an agent produces no commits (e.g. analyzing a design proposal
instead of implementing), exec now:

1. Proactively tells agents that Ready items should produce code
2. Detects zero-commit runs after exec completes
3. Asks for structured completion report via JSON schema
4. Retries with nudge if agent claims completion without commits
5. Fails with blocker message if agent reports impediments

Publish gets a safety guard (belt-and-suspenders) to skip PR creation
if no commits exist, preventing the GraphQL error from #28.

Follows the prdraft pattern: git check first (happy path), structured
validation on failure. Preserves real-time logging during main run.
ivy pushed a commit that referenced this pull request Feb 10, 2026
Prevents publish crashes when agents produce analysis without code changes.

Two-layer approach:
1. Proactive: Updated agent system prompt to require implementation
2. Reactive: Post-exec validation using structured output

When zero commits detected, exec requests completion report via
--json-schema. If agent claims completion (but no commits), retries with
nudge to implement. If agent reports blockers, fails with reason.

Publish now guards against zero-commit branches as safety net.

Closes #54
@ivy ivy closed this Feb 10, 2026

Development

Successfully merging this pull request may close these issues.

exec should validate agent produced commits
