feat(exec): detect and recover from zero-commit agent executions 😵‍💫 by ivy · Pull Request #55 · ivy/hive

ivy · 2026-02-10T11:13:06Z

Why

`hive exec` silently succeeded when an agent produced zero commits, which pushed failure detection to `publish`. There, PR creation crashed with `GraphQL: No commits between main and <branch>`. This violated Principle 3 (Single Responsibility): `publish` shouldn't be responsible for detecting that `exec` didn't produce work.

The blind spot: exec only checked for uncommitted changes after execution, but had no validation for "no work at all." An agent could analyze a design issue, exit cleanly with zero commits and zero uncommitted changes, and exec would report success.

What

Implemented a two-layer approach:

Proactive (prevention): Updated the agent system prompt to explicitly require implementation and commits. If an issue is not implementable (e.g., a design discussion), the agent should report completion as false with blockers.

Reactive (detection & recovery): After exec completes:

1. Check for commits with `git rev-list --count main..HEAD` (happy path is free)
2. If zero commits, request a structured completion report via `--json-schema` (following the prdraft pattern)
3. If agent claims completion but no commits exist, retry with a nudge to actually implement (up to 3 attempts)
4. If agent reports blockers, fail with the blocker reason

Safety net: Added a guard in publish that checks for commits before attempting to push/PR. This is belt-and-suspenders — exec should catch it first, but publish won't crash if something slips through.
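
A commit check like the one `exec` and the `publish` guard rely on might look like the sketch below. The PR names `HasNewCommits` in the workspace package but does not show its signature, so `hasNewCommits(dir, base)` and `parseCount` here are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// parseCount converts the stdout of `git rev-list --count` into an int.
func parseCount(out string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(out))
}

// hasNewCommits reports whether HEAD in dir has commits that base lacks.
// Hypothetical signature; the real helper may differ.
func hasNewCommits(dir, base string) (bool, error) {
	cmd := exec.Command("git", "rev-list", "--count", base+"..HEAD")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return false, fmt.Errorf("git rev-list: %w", err)
	}
	n, err := parseCount(string(out))
	if err != nil {
		return false, fmt.Errorf("parse commit count %q: %w", out, err)
	}
	return n > 0, nil
}

func main() {
	ok, err := hasNewCommits(".", "main")
	if err != nil {
		fmt.Println("check failed:", err)
		return
	}
	fmt.Println("new commits:", ok)
}
```

A guard in `publish` would call this before pushing and return an error instead of attempting to open a PR on an empty branch.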

Notes for reviewers

- The validation uses `j.RunCapture()` (not `j.Run()`) to parse structured output without losing real-time logs during the main execution
- The completion schema mirrors the pattern from prdraft: same retry loop, same structured output parsing
- Test coverage includes specs for `HasNewCommits` in the workspace package
- Also includes an unrelated fix: changed `workspace.Create` to branch from main instead of HEAD (was causing test failures)

Generated with Hive | Closes #48

New commands: `hive ls` lists sessions, `hive cd` spawns a shell in a session's workspace. Shared resolveSession logic handles both UUID direct lookup and ref-based resolution via claims. `hive attach` now resolves by ref or UUID. `hive list` becomes a deprecated alias for ls.

Replaces manual `hive cleanup` with session-aware reaping. Scans for terminal sessions past retention, removes workspaces/sessions/claims. Detects stale claims with no active systemd unit and marks them failed. Deprecates the cleanup command.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Extract reapSessions with injectable deps (isUnitActive, releaseOnSource, removeWorkspace) so core reap logic is testable without systemd or GitHub API. Tests cover: expired published/failed reaping, retention-window preservation, stale session detection, and active unit skipping.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Manual runs (hive run owner/repo#N) don't go through poll so sessions lack board_item_id in SourceMetadata. Skip the source release for these instead of hitting "unknown ref" from the ghprojects adapter's empty cache. Also populate SourceMetadata when --board-item-id flag is passed.

Extract CLI examples (tutorial/how-to material), session status lifecycle and poll loop detail (now in lifecycle.md). Add conceptual "why" framing to each component, cross-links to security model and ADRs, data flow diagram, and authz module.

Trim duplicated content that now lives in docs/, add Diátaxis-structured documentation navigation, streamline quick start, and add commands table. Link out to tutorial, how-to, reference, and explanation docs instead of inlining setup/usage/sandboxing/workspace details.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Remove board-item-id (not written by prepare), replace metadata table with inline mention + cross-link, cut flags table, trim reap details, and shorten sandbox/retry explanations to single sentences with links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

debug-session: move status table and metadata listing to session ref links.
configure: move config search order to config reference link.
write-issues: remove "Why issue quality matters" explanation and "Authorization"/"Board workflow" reference sections, add cross-links.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

jail-interface: stdin is not connected (only stdout/stderr are set).
source-interface: Complete is not called by publish; publish calls gh.MoveToInReview() directly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

architecture: replace inline Data Layout tree with cross-link to reference/session.md, add links to jail and source interface refs.
lifecycle: update data layout links to reference/session.md, make "reference docs" mention link to specific docs.
security-model: fix "silently fail" to "rejected with permission denied".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Repo map now includes reference/, how-to/, explanation/, tutorial/, and prototype/ directories added during docs reorganization. Corrects logging entry from nonexistent slog-journal to actual TextHandler, and adds Key Docs entries for the full Diátaxis structure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use coreos/go-systemd daemon.SdNotify to report poll loop and run pipeline status via STATUS= strings, visible in systemctl status. Poll service upgraded to Type=notify with READY=1 on startup. Run service gets NotifyAccess=all for status passthrough. No-op when not running under systemd.

Prevents publish crashes when agents produce analysis without code changes.

Two-layer approach:
1. Proactive: Updated agent system prompt to require implementation
2. Reactive: Post-exec validation using structured output

When zero commits detected, exec requests completion report via --json-schema. If agent claims completion (but no commits), retries with nudge to implement. If agent reports blockers, fails with reason.

Publish now guards against zero-commit branches as safety net.

Closes #54

ivy and others added 30 commits February 10, 2026 01:31

feat(session): add session metadata package with JSON CRUD and XDG paths ✨

3b73020

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(claim): add atomic claim files for dispatch dedup 🔒

6c4896b

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(source): add Source interface and GitHub Projects adapter 🔌

82e288a

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(systemd): add unit templates and Makefile install targets ⚙️

ac4b4d7

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(workspace): migrate paths to XDG with UUID-based naming 🏡

a3aa229

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(run): accept UUID argument and read session metadata 🔗

d9e44e3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(poll): rewrite dispatch with Source, claims, and systemd units 🚀

7b2053d

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: remove dead code and release claims on run completion 🧹

fedf7e8

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

test: add concurrent claim dedup and RegisterItem coverage 🧪

73573b3

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

docs: update hive.example.toml with all config keys ✨

72e021d

Add poll.max-concurrent, poll.instance, reap section with retention settings, and github.ready-option-id. Improve comments to document each key's purpose and per-instance config pattern.

docs: update AGENTS.md for lifecycle architecture 💅

ad4db54

Repo map, pipeline, command table, and tech stack now reflect the session/claim/source packages, systemd dispatch, reap, and new CLI surface (ls, cd, attach).

docs(architecture): rewrite for lifecycle architecture 💅

a25825b

Reflects the new dispatch model (systemd units, claims, sessions), source abstraction, XDG data layout, session status lifecycle, and updated CLI surface (ls, cd, reap).

docs: track lifecycle spec as permanent documentation 🎉

7146b32

The end-state specification that guided the 8-phase lifecycle migration is now reality. Track it alongside architecture.md as the reference for dispatch, sessions, claims, and systemd units.

docs(explanation): add security model 🔒

4800f93

Explains threat model (helpful-not-malicious), trust boundaries across pipeline stages, credential isolation via systemd-run mounts, the .git write trade-off, network access stance, author authorization, and what's explicitly out of scope.

docs(tutorial): add first-run walkthrough 📖

8715afa

Guided tutorial taking the operator from a freshly built hive binary through manually dispatching a single issue to a completed pull request. Covers prepare, exec, publish, and the hive run shortcut.

docs(how-to): add installation guide 🔧

aad0485

Step-by-step: build from source, install binary, install systemd units, verify toolchain.

docs(how-to): add configuration guide ⚙️

cffdda3

Finding GraphQL IDs, setting up config.toml, per-instance config for multi-project setups, poll and reap tuning.

docs(how-to): add deployment guide 🚀

d9950fa

Systemd user service setup, loginctl enable-linger, log checking, zero-downtime binary upgrades, multi-instance deployment.

docs(how-to): add session debugging guide 🔍

5aa273e

hive ls, hive cd, hive attach, reading .hive/ metadata, journald logs, manual resume with hive exec --resume.

docs(how-to): add issue writing guide ✏️

417682a

Writing GitHub issues that work as agent prompts, based on real learnings from prototype usage.

docs(reference): add CLI reference 📚

825bd3c

Complete reference for every hive command with synopsis, flags, config keys, credentials, exit codes, and examples.

docs(reference): add configuration reference ⚙️

1d06d8e

All config keys organized by section, environment variables, resolution order, and named instance support.

docs(reference): add session and data reference 📁

4e00ff3

Session JSON schema, status state machine, claim file format, workspace directory layout, and metadata files.

docs(reference): add source interface reference 🔌

faa0edf

Source interface contract, WorkItem struct, ref format conventions, and GitHub Projects adapter specifics.

docs(reference): add jail interface reference 🔒

d8e93ed

Jail interface contract, RunOpts struct, systemd-run backend details, sandbox properties, mount strategy, and credential isolation.

ivy and others added 15 commits February 10, 2026 02:31

docs(reference): add systemd units reference ⚙️

7bb7b25

All unit templates, instance specifiers, unit relationships, and useful systemctl/journalctl commands.

feat(cli): add --version flag with build timestamp 🚀

1e47bed

Set version via -ldflags at build time so `hive --version` shows the build timestamp. Logs version on startup for debugging. Defaults to "dev" for unset builds.

docs(how-to): add "How to" prefix to all guide titles 🧹

0b85aa7

Aligns with Diátaxis convention for how-to guide naming.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

feat(cli): separate version and build timestamp in --version 🔧

e2d8bc0

VERSION defaults to "dev" and is overridable via env var (e.g. VERSION=v1.0.0 make build). Build timestamp is always injected separately so version and build time are independent.
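
Build-time injection of this kind is usually done with the Go linker's `-X` flag. The sketch below assumes two package-level variables named `version` and `buildTime` in package `main` and an output format chosen for illustration; the actual variable names, Makefile wiring, and `--version` output in hive are not shown in this PR.

```go
package main

import "fmt"

// Defaults used when nothing is injected. A build could override them with, e.g.:
//
//	go build -ldflags "-X main.version=v1.0.0 -X main.buildTime=2026-02-10T11:13:06Z"
//
// The variable names and the timestamp value are hypothetical.
var (
	version   = "dev"
	buildTime = "unknown"
)

// versionString keeps version and build time independent, as the commit describes.
func versionString() string {
	return fmt.Sprintf("hive %s (built %s)", version, buildTime)
}

func main() {
	fmt.Println(versionString())
}
```

`-X` only works on string variables, which is why both values are declared as plain strings rather than constants.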

fix(cli): use ISO 8601 format for build timestamp 💅

e6352f5

fix(poll): use --no-block for systemd dispatch to avoid blocking poll loop 🐛

7ff2e5c

feat(poll): log configuration at startup for observability 👀

bd0224f

ivy closed this Feb 10, 2026

ivy wants to merge 45 commits into main from hive/ca7be0e4-afb4-40e0-89a4-506c7f8bebe8
