feat: Workflow Boards — kanban state machines that drive coding agents #3135

Draft
ccdwyer wants to merge 1 commit into pingdotgg:main from ccdwyer:ft/hyperion

ccdwyer commented Jun 18, 2026

Per-project boards as state machines: lanes hold pipelines of agent/approval/script/PR steps; one git worktree per ticket; event-sourced engine with durable sagas, lane-entry tokens, and durable approvals.

Includes: board creation + visual editor (form/canvas/history+revert), script steps, smart routing, WIP enforcement, retention, delete-board cascade, route colors, lane drag, per-ticket status, rename, ticket collaboration; external events/webhooks, aging+digest, dry-run simulator; GitHub PR loop; board notifications + mobile; one-way GitHub Issues / Asana sync + curated import picker; flexible work sources; collaborating agents (per-agent session resume + inter-agent handoff); markdown + editable ticket comments; design-board templates; provider-aware ticket-description spill + intake braindump guard.

Single consolidated migration (033_WorkflowSchema).

What Changed

Adds Workflow Boards — per-project kanban boards that are executable state machines. Each lane holds a pipeline of typed steps (agent / approval / script / merge / pull-request), tickets flow lane-to-lane under JSON-Logic routing rules, and every ticket runs in its own isolated git worktree.
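
To make the routing idea concrete, here is a minimal sketch of predicate routing, assuming rules use a small subset of JSON Logic (`var`, `===`, `>`, `and`, `or`). The names `evalRule`, `RouteSketch`, and `nextLane`, and the fact keys `approved` and `testsPassed`, are hypothetical and not taken from this PR. The evaluator is hand-rolled for illustration and simplifies the spec: no dotted `var` paths, and `and`/`or` return booleans.

```typescript
// Illustrative only: a hand-rolled subset of JSON Logic, not the PR's evaluator.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Facts = { [key: string]: Json };

function evalRule(rule: Json, facts: Facts): Json {
  // Primitives and arrays are treated as literals in this sketch.
  if (rule === null || typeof rule !== "object" || Array.isArray(rule)) return rule;
  const keys = Object.keys(rule);
  const op = keys[0];
  if (keys.length !== 1 || op === undefined) throw new Error("a rule needs exactly one operator");
  const raw: Json = rule[op] ?? null;
  const args: Json[] = Array.isArray(raw) ? raw : [raw];
  const values = (): Json[] => args.map((arg) => evalRule(arg, facts));
  switch (op) {
    case "var": {
      const name = args[0];
      return typeof name === "string" ? (facts[name] ?? null) : null;
    }
    case "===": {
      const [a, b] = values();
      return a === b;
    }
    case ">": {
      const [a, b] = values();
      return typeof a === "number" && typeof b === "number" && a > b;
    }
    case "and":
      return values().every((v) => Boolean(v));
    case "or":
      return values().some((v) => Boolean(v));
    default:
      throw new Error(`unsupported operator: ${op}`);
  }
}

// Hypothetical route shape: the first rule that evaluates to true picks the next lane.
type RouteSketch = { when: Json; toLane: string };

function nextLane(routes: RouteSketch[], facts: Facts): string | null {
  for (const route of routes) {
    if (evalRule(route.when, facts) === true) return route.toLane;
  }
  return null;
}

const routes: RouteSketch[] = [
  {
    when: { and: [{ "===": [{ var: "approved" }, true] }, { ">": [{ var: "testsPassed" }, 0] }] },
    toLane: "merge",
  },
  { when: true, toLane: "rework" }, // catch-all fallback
];
```

Ordering the routes and ending with a catch-all keeps the decision deterministic: a ticket always has exactly one next lane for a given set of facts.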

Engine (server)

  • Event-sourced core: append-only event store, projection pipeline, and read model, with durable sagas, lane-entry tokens, and durable approvals so in-flight work survives restarts. Startup is gated on workflowRecovery.recover() (retried with backoff) before mutating RPCs unblock.
  • Step executors: agent steps (drive coding agents in a worktree), approval gates, script steps, merge, and a GitHub pull-request loop.
  • Smart routing via predicate rules, WIP limits, retention/aging sweeps + digests, and a dry-run simulator for evaluating routing without side effects.
  • One consolidated migration, 033_WorkflowSchema, creates every workflow table in a single step.
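
As a rough illustration of the event-sourced core described in the list above, the sketch below pairs an append-only log with a projection that folds events into a read model. It is an in-memory toy under stated assumptions: the event names, their fields, and the `InMemoryEventStore` and `projectTicketLanes` identifiers are hypothetical, and the PR's actual store is SQL-backed.

```typescript
// Illustrative only: event names and shapes are assumptions, not the PR's schema.
type WorkflowEventSketch =
  | { type: "TicketCreated"; ticketId: string; laneId: string }
  | { type: "TicketMoved"; ticketId: string; toLaneId: string }
  | { type: "TicketArchived"; ticketId: string };

// Append-only log: events are never updated or deleted, only added.
class InMemoryEventStore {
  private readonly log: WorkflowEventSketch[] = [];

  // Returns the 1-based sequence number assigned to the event.
  append(event: WorkflowEventSketch): number {
    this.log.push(event);
    return this.log.length;
  }

  readAll(): readonly WorkflowEventSketch[] {
    return this.log;
  }
}

// Projection: fold the log into a read model mapping ticket id -> current lane.
function projectTicketLanes(events: readonly WorkflowEventSketch[]): Map<string, string> {
  const lanes = new Map<string, string>();
  for (const event of events) {
    switch (event.type) {
      case "TicketCreated":
        lanes.set(event.ticketId, event.laneId);
        break;
      case "TicketMoved":
        // Ignore moves for tickets the projection has never seen.
        if (lanes.has(event.ticketId)) lanes.set(event.ticketId, event.toLaneId);
        break;
      case "TicketArchived":
        lanes.delete(event.ticketId);
        break;
    }
  }
  return lanes;
}

const store = new InMemoryEventStore();
store.append({ type: "TicketCreated", ticketId: "T-1", laneId: "backlog" });
store.append({ type: "TicketMoved", ticketId: "T-1", toLaneId: "in-progress" });
store.append({ type: "TicketCreated", ticketId: "T-2", laneId: "backlog" });
store.append({ type: "TicketArchived", ticketId: "T-2" });
const readModel = projectTicketLanes(store.readAll());
```

Replaying the same log always rebuilds the same read model, which is the property that makes the history replayable after a restart.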

RPC + contracts

  • Full WebSocket RPC surface for board/ticket CRUD, intake proposals, dry-run, diffs, metrics, webhooks, and work-source / outbound connections, with typed contracts in packages/contracts.

Web UI

  • Drag-and-drop board view and a fullscreen canvas editor (lane/step/routing editing, version history + revert, dry-run panel, self-improve dialog), plus board creation, rename, delete-with-cascade, route colors, lane drag, and per-ticket status.

Mobile

  • A "Needs You" inbox and ticket action-sheet screen, push notifications for attention events (per-device preference gating), and deep-link routing into the right ticket.

Integrations & ingest

  • One-way GitHub Issues / Asana → ticket sync plus a curated work-item import picker; external events/webhooks (SSRF-safe URL validation, Slack/generic formatters); collaborating agents (per-agent session resume + inter-agent handoff); markdown + editable ticket comments; design-board templates; and provider-aware ticket-description spill with an intake braindump guard so long inputs stay within provider limits.
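
The SSRF-safe validation mentioned above could look roughly like the following. This is a sketch under assumptions, not the contents of OutboundUrlValidator.ts: the function names, the https-only policy, and the blocked ranges are all illustrative. It only inspects the URL string; a production check would also need to validate the addresses the hostname resolves to.

```typescript
// Illustrative only: names and policy are assumptions, not the PR's validator.
type UrlCheck = { ok: true; url: URL } | { ok: false; reason: string };

// True for IPv4 literals in unspecified, loopback, private, or link-local ranges.
function isBlockedIpv4(host: string): boolean {
  const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (match === null) return false;
  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

function validateOutboundUrl(raw: string): UrlCheck {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  if (url.protocol !== "https:") {
    return { ok: false, reason: "only https is allowed" };
  }
  const host = url.hostname.toLowerCase();
  if (host === "localhost" || host.endsWith(".localhost")) {
    return { ok: false, reason: "localhost is not allowed" };
  }
  // The URL parser keeps brackets around IPv6 literals; this sketch rejects them all.
  if (host.startsWith("[")) {
    return { ok: false, reason: "IPv6 literals are not allowed in this sketch" };
  }
  if (isBlockedIpv4(host)) {
    return { ok: false, reason: "private or loopback address" };
  }
  return { ok: true, url };
}
```

Parsing with `URL` first matters because the parser normalizes the host before the range checks run, so the checks see the canonical form rather than the raw input.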

Why

Driving coding agents by hand doesn't scale past a couple of tickets: there's no durable record of where each piece of work is, no enforced gates between "agent proposed" and "merged," and no isolation between concurrent tasks. Modeling a project as a state machine makes the pipeline explicit and inspectable — each lane is a stage, each step is a typed unit of work, and routing rules decide what moves next.

The engine is event-sourced rather than mutating rows in place so that the full history is replayable and crashes mid-pipeline recover cleanly (durable sagas + lane-entry tokens), which is essential when steps spawn long-running agents and external side effects (PRs, webhooks, merges). Each ticket gets its own worktree so parallel agents never clobber each other's working tree. Routing, WIP, and approvals are first-class instead of conventions, so the board enforces the process rather than relying on the operator to remember it.

UI Changes

[3 screenshots]

Checklist

  • This PR is small and focused (lol, oh no)
  • I explained what changed and why
  • I included before/after screenshots for any UI changes
  • I included a video for animation/interaction changes
WorkflowsDemo.mov

Note

Add kanban workflow boards with agent-driven pipelines, RPC surface, and mobile inbox

  • Introduces a full workflow board system: boards are kanban state machines with lanes, agent/script/approval/merge/pull-request steps, and predicate-based routing driven by JSON Logic rules
  • Adds a SQL-backed event store, projection pipeline, read model, and version history for boards and tickets; migration 033_WorkflowSchema.ts creates all workflow tables in one step
  • Wires a comprehensive WebSocket RPC surface (rpc.ts) covering board/ticket CRUD, intake proposals, dry-run simulation, diffs, metrics, webhooks, and work-source/outbound connections; exposed via WorkflowRpcHandlers.ts
  • Adds a drag-and-drop board UI in BoardView.tsx and a fullscreen canvas workflow editor (CanvasView.tsx) with lane/step/routing editing, version history, dry-run panel, and self-improve dialog
  • Implements outbound notification dispatch with Slack/generic formatters, SSRF-safe URL validation, and relay push notifications for board ticket attention events with per-device preference gating
  • Adds a mobile 'Needs You' inbox screen and ticket action sheet screen, plus deep-link routing for ticket notifications
  • Risk: the WorkflowEngine requires successful recovery (workflowRecovery.recover()) on startup before mutating RPCs are unblocked; recovery failure gates all workflow operations and retries with exponential backoff up to 3 times
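
The recovery gate in the last item can be pictured as a retry loop with exponential backoff. The sketch below is an assumption-laden illustration: `recoverWithBackoff`, `backoffDelayMs`, the 250 ms base, and the 5 s cap are hypothetical, and "up to 3 times" is read here as one initial attempt plus three retries, which the PR text does not confirm.

```typescript
// Illustrative only: delays and names are assumptions, not values from the PR.
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 5000): number {
  // attempt 0 -> base, attempt 1 -> 2x base, attempt 2 -> 4x base, capped at capMs.
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Resolves to true once recovery succeeds, or false after the retries are exhausted.
// Callers would keep mutating RPCs blocked until this resolves to true.
async function recoverWithBackoff(
  recover: () => Promise<void>,
  maxRetries = 3,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<boolean> {
  for (let attempt = 0; ; attempt++) {
    try {
      await recover();
      return true;
    } catch {
      if (attempt >= maxRetries) return false;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Injecting `sleep` keeps the loop easy to exercise in tests without real timers.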
📊 Macroscope summarized 5fe0ecf. 122 files reviewed, 0 issues evaluated, 0 issues filtered, 0 comments posted

🗂️ Filtered Issues

No issues evaluated.

coderabbitai Bot commented Jun 18, 2026

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ee564629-645a-472c-ad38-71ca1fb8bbf6

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions Bot added the vouch:unvouched (PR author is not yet trusted in the VOUCHED list) and size:XXL (1,000+ changed lines, additions + deletions) labels Jun 18, 2026
Comment thread apps/server/src/workflow/Layers/WorkflowFileLoader.ts
Comment thread apps/server/src/workflow/outbound/OutboundUrlValidator.ts
Comment thread apps/web/src/components/board/editor/canvas/RoutingEdges.tsx
Comment thread apps/server/src/workflow/Layers/SetupRunService.ts
Comment thread apps/mobile/src/features/board/NeedsYouInboxScreen.tsx Outdated
Comment on lines +153 to +160
  const onRefresh = useCallback(() => {
    setRefreshing(true);
    void load().finally(() => {
      if (mountedRef.current) {
        setRefreshing(false);
      }
    });
  }, [load]);

🟡 Medium board/NeedsYouInboxScreen.tsx:153

During pull-to-refresh from the error state, load() clears error at line 91 before the async work completes, and onRefresh() never sets loading to true. This produces the state {loading: false, refreshing: true, rows: [], error: null}, which causes deriveInboxViewState to return {kind: "empty"} — so the screen briefly flashes "You're all caught up" instead of showing a loading indicator or preserving the error display until new data arrives.

  const onRefresh = useCallback(() => {
    setRefreshing(true);
+   setLoading(true);
    void load().finally(() => {
      if (mountedRef.current) {
        setRefreshing(false);
      }
    });
  }, [load]);

Evidence trail:
apps/mobile/src/features/board/NeedsYouInboxScreen.tsx lines 83-93 (load() clears error before await), lines 153-160 (onRefresh never sets loading to true), line 168 (deriveInboxViewState call). apps/mobile/src/features/board/inboxViewState.ts lines 33-45 (deriveInboxViewState does not check refreshing in the empty case at line 43, causing it to return {kind: 'empty'} when refreshing=true, loading=false, rows=[], error=null).
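
An alternative to setting loading in onRefresh is to make the derivation itself aware of refreshing. A minimal sketch, assuming a state shape like the one described above; the type names, the `loading`, `error`, and `list` kinds, and the order of the checks are guesses, since inboxViewState.ts is not shown in this thread:

```typescript
// Illustrative only: the real deriveInboxViewState signature may differ.
type InboxInputSketch = {
  loading: boolean;
  refreshing: boolean;
  rows: readonly unknown[];
  error: string | null;
};

type InboxViewStateSketch =
  | { kind: "list"; count: number }
  | { kind: "error"; message: string }
  | { kind: "loading" }
  | { kind: "empty" };

function deriveInboxViewStateSketch(input: InboxInputSketch): InboxViewStateSketch {
  // Existing rows stay visible, even while a refresh is in flight.
  if (input.rows.length > 0) return { kind: "list", count: input.rows.length };
  if (input.error !== null) return { kind: "error", message: input.error };
  // With no rows, an in-flight refresh counts as loading rather than "empty".
  if (input.loading || input.refreshing) return { kind: "loading" };
  return { kind: "empty" };
}
```

With this ordering, the state `{loading: false, refreshing: true, rows: [], error: null}` maps to `loading`, so the "all caught up" message only appears once a fetch has actually finished with no rows.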

Fix

This comment was marked as off-topic.

Comment on lines +150 to +154
    if (currentSnapshot !== null && currentSnapshot.status === "exited") {
      yield* Deferred.succeed(done, { exitCode: currentSnapshot.exitCode ?? 1 }).pipe(
        Effect.asVoid,
      );
    }

🟡 Medium Layers/SetupRunService.ts:150

When terminals.getSnapshot returns null (no session exists, e.g., terminal was cleaned up), the code proceeds to Deferred.await(done) and waits for events that will never arrive. With timeoutMs set, this waits the full timeout; without it, the effect hangs indefinitely. Consider treating null the same as the exited case—resolve immediately with an error since the terminal no longer exists.

-    if (currentSnapshot !== null && currentSnapshot.status === "exited") {
+    if (currentSnapshot === null) {
+      yield* Deferred.succeed(done, { exitCode: 1 }).pipe(Effect.asVoid);
+    } else if (currentSnapshot.status === "exited") {
       yield* Deferred.succeed(done, { exitCode: currentSnapshot.exitCode ?? 1 }).pipe(
         Effect.asVoid,
       );
     }

Evidence trail:
apps/server/src/workflow/Layers/SetupRunService.ts lines 146-155 (REVIEWED_COMMIT): null check condition on line 150, fallthrough to Deferred.await on line 155. apps/server/src/terminal/Services/Manager.ts lines 138-144: getSnapshot documentation 'Returns null if no session exists for the given ids' and return type `TerminalSessionSnapshot | null`. apps/server/src/terminal/Layers/Manager.ts lines 2218-2221: getSnapshot implementation returns null when session is None.
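
Stripped of the Effect plumbing, the suggested control flow reduces to a small decision function. The sketch below uses hypothetical names (`SnapshotSketch`, `immediateExit`) and plain TypeScript rather than the Deferred-based code in SetupRunService.ts; the snapshot fields are assumed from the snippet under review.

```typescript
// Illustrative only: a plain-TypeScript restatement of the suggested branch logic.
type SnapshotSketch = { status: "running" | "exited"; exitCode: number | null };

// Returns an exit result when the wait can be resolved right away,
// or null when the caller should keep waiting for terminal events.
function immediateExit(snapshot: SnapshotSketch | null): { exitCode: number } | null {
  // No session exists: no exit event will ever arrive, so fail fast.
  if (snapshot === null) return { exitCode: 1 };
  // Already exited: report the recorded code, defaulting to 1 when it is missing.
  if (snapshot.status === "exited") return { exitCode: snapshot.exitCode ?? 1 };
  return null;
}
```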
