feat(exec): detect and recover from zero-commit agent executions 😵‍💫 #55
Closed
…ths ✨ Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New commands: `hive ls` lists sessions, `hive cd` spawns a shell in a session's workspace. Shared resolveSession logic handles both UUID direct lookup and ref-based resolution via claims. `hive attach` now resolves by ref or UUID. `hive list` becomes a deprecated alias for ls.
Replaces manual `hive cleanup` with session-aware reaping. Scans for terminal sessions past retention, removes workspaces/sessions/claims. Detects stale claims with no active systemd unit and marks them failed. Deprecates the cleanup command.
Extract reapSessions with injectable deps (isUnitActive, releaseOnSource, removeWorkspace) so core reap logic is testable without systemd or GitHub API. Tests cover: expired published/failed reaping, retention-window preservation, stale session detection, and active unit skipping. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Manual runs (hive run owner/repo#N) don't go through poll so sessions lack board_item_id in SourceMetadata. Skip the source release for these instead of hitting "unknown ref" from the ghprojects adapter's empty cache. Also populate SourceMetadata when --board-item-id flag is passed.
Add poll.max-concurrent, poll.instance, reap section with retention settings, and github.ready-option-id. Improve comments to document each key's purpose and per-instance config pattern.
Repo map, pipeline, command table, and tech stack now reflect the session/claim/source packages, systemd dispatch, reap, and new CLI surface (ls, cd, attach).
Reflects the new dispatch model (systemd units, claims, sessions), source abstraction, XDG data layout, session status lifecycle, and updated CLI surface (ls, cd, reap).
The end-state specification that guided the 8-phase lifecycle migration is now reality. Track it alongside architecture.md as the reference for dispatch, sessions, claims, and systemd units.
Explains threat model (helpful-not-malicious), trust boundaries across pipeline stages, credential isolation via systemd-run mounts, the .git write trade-off, network access stance, author authorization, and what's explicitly out of scope.
Guided tutorial taking the operator from a freshly built hive binary through manually dispatching a single issue to a completed pull request. Covers prepare, exec, publish, and the hive run shortcut.
Extract CLI examples (tutorial/how-to material), session status lifecycle and poll loop detail (now in lifecycle.md). Add conceptual "why" framing to each component, cross-links to security model and ADRs, data flow diagram, and authz module.
Step-by-step: build from source, install binary, install systemd units, verify toolchain.
Finding GraphQL IDs, setting up config.toml, per-instance config for multi-project setups, poll and reap tuning.
Systemd user service setup, loginctl enable-linger, log checking, zero-downtime binary upgrades, multi-instance deployment.
hive ls, hive cd, hive attach, reading .hive/ metadata, journald logs, manual resume with hive exec --resume.
Writing GitHub issues that work as agent prompts, based on real learnings from prototype usage.
Complete reference for every hive command with synopsis, flags, config keys, credentials, exit codes, and examples.
All config keys organized by section, environment variables, resolution order, and named instance support.
Session JSON schema, status state machine, claim file format, workspace directory layout, and metadata files.
Source interface contract, WorkItem struct, ref format conventions, and GitHub Projects adapter specifics.
Jail interface contract, RunOpts struct, systemd-run backend details, sandbox properties, mount strategy, and credential isolation.
All unit templates, instance specifiers, unit relationships, and useful systemctl/journalctl commands.
Trim duplicated content that now lives in docs/, add Diátaxis-structured documentation navigation, streamline quick start, and add commands table. Link out to tutorial, how-to, reference, and explanation docs instead of inlining setup/usage/sandboxing/workspace details. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Set version via -ldflags at build time so `hive --version` shows the build timestamp. Logs version on startup for debugging. Defaults to "dev" for unset builds.
Remove board-item-id (not written by prepare), replace metadata table with inline mention + cross-link, cut flags table, trim reap details, and shorten sandbox/retry explanations to single sentences with links. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns with Diátaxis convention for how-to guide naming. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
VERSION defaults to "dev" and is overridable via env var (e.g. VERSION=v1.0.0 make build). Build timestamp is always injected separately so version and build time are independent.
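A sketch of how independent version and build-time values can be injected with the Go linker's `-X` flag. The variable names `version` and `buildTime` and the output format are assumptions; only the "dev" default is taken from the commit message.

```go
package main

import "fmt"

// Both values can be overridden at link time, for example:
//
//	go build -ldflags "-X main.version=v1.0.0 -X main.buildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
//
// The variable names are hypothetical; "dev" mirrors the documented default.
var (
	version   = "dev"
	buildTime = "unknown"
)

// versionString formats the two values separately, so changing the
// version does not affect the reported build time and vice versa.
func versionString() string {
	return fmt.Sprintf("hive %s (built %s)", version, buildTime)
}

func main() {
	fmt.Println(versionString())
}
```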
debug-session: move status table and metadata listing to session ref links. configure: move config search order to config reference link. write-issues: remove "Why issue quality matters" explanation and "Authorization"/"Board workflow" reference sections, add cross-links. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
jail-interface: stdin is not connected (only stdout/stderr are set). source-interface: Complete is not called by publish — publish calls gh.MoveToInReview() directly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
architecture: replace inline Data Layout tree with cross-link to reference/session.md, add links to jail and source interface refs. lifecycle: update data layout links to reference/session.md, make "reference docs" mention link to specific docs. security-model: fix "silently fail" to "rejected with permission denied". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Repo map now includes reference/, how-to/, explanation/, tutorial/, and prototype/ directories added during docs reorganization. Corrects logging entry from nonexistent slog-journal to actual TextHandler, and adds Key Docs entries for the full Diátaxis structure. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Use coreos/go-systemd daemon.SdNotify to report poll loop and run pipeline status via STATUS= strings, visible in systemctl status. Poll service upgraded to Type=notify with READY=1 on startup. Run service gets NotifyAccess=all for status passthrough. No-op when not running under systemd.
Prevents publish crashes when agents produce analysis without code changes. Two-layer approach:
1. Proactive: Updated agent system prompt to require implementation
2. Reactive: Post-exec validation using structured output

When zero commits are detected, exec requests a completion report via --json-schema. If the agent claims completion (but made no commits), exec retries with a nudge to implement. If the agent reports blockers, exec fails with the reason. Publish now guards against zero-commit branches as a safety net. Closes #54
Why
`hive exec` was silently succeeding when agents produced zero commits, pushing failure detection to `publish`, which would crash with `GraphQL: No commits between main and <branch>` when attempting to create a PR. This violated Principle 3 (Single Responsibility): publish shouldn't be responsible for detecting that exec didn't produce work.

The blind spot: exec only checked for uncommitted changes after execution, but had no validation for "no work at all." An agent could analyze a design issue, exit cleanly with zero commits and zero uncommitted changes, and exec would report success.
What
Implemented a two-layer approach:

1. Proactive (prevention): Updated the agent system prompt to explicitly require implementation and commits. If an issue is not implementable (e.g., a design discussion), the agent should report completion as false with blockers.
2. Reactive (detection & recovery): After exec completes:
   - Count new commits with `git rev-list --count main..HEAD` (happy path is free)
   - If there are none, request a completion report via `--json-schema` (following the `prdraft` pattern)
   - If the agent claims completion despite zero commits, retry with a nudge to implement
   - If the agent reports blockers, fail with the reason

Safety net: Added a guard in `publish` that checks for commits before attempting to push/PR. This is belt-and-suspenders: exec should catch it first, but publish won't crash if something slips through.

Notes for reviewers
- The completion check uses `j.RunCapture()` (not `j.Run()`) to parse structured output without losing real-time logs during the main execution
- Follows `prdraft`: same retry loop, same structured output parsing
- Adds `HasNewCommits` in the workspace package
- Changes `workspace.Create` to branch from `main` instead of `HEAD` (was causing test failures)

Generated with Hive | Closes #48
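For reference, a commit check built on `git rev-list --count main..HEAD` might look like the sketch below. The PR names `HasNewCommits` but does not show its signature; the lowercase `hasNewCommits`, its parameters, and the `parseCommitCount` helper are assumptions.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// parseCommitCount parses the single integer printed by
// `git rev-list --count`.
func parseCommitCount(out string) (int, error) {
	return strconv.Atoi(strings.TrimSpace(out))
}

// hasNewCommits reports whether HEAD in dir has commits that are not
// reachable from base. Signature is hypothetical.
func hasNewCommits(dir, base string) (bool, error) {
	cmd := exec.Command("git", "rev-list", "--count", base+"..HEAD")
	cmd.Dir = dir
	out, err := cmd.Output()
	if err != nil {
		return false, fmt.Errorf("git rev-list: %w", err)
	}
	n, err := parseCommitCount(string(out))
	if err != nil {
		return false, fmt.Errorf("parse commit count %q: %w", out, err)
	}
	return n > 0, nil
}

func main() {
	ok, err := hasNewCommits(".", "main")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("new commits:", ok)
}
```

Because the check is a single local git invocation, a run that did produce commits pays almost nothing for it, which matches the "happy path is free" note.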