
DM-54689: GitHub push webhook + PushEventProcessor #264

Merged
jonathansick merged 17 commits into main from
tickets/DM-54689-github-webhook
May 1, 2026

Conversation

@jonathansick (Member) commented Apr 25, 2026

Summary

  • New POST /docverse/webhooks/github validates GitHub App HMAC signatures, dispatches push events through gidgethub.routing.Router, and enqueues exactly one dashboard_sync job per binding whose (owner, repo, ref) matches and whose root_path overlaps the changed-file set, falling back to the compare API when GitHub truncates the in-payload commit list. Unrelated event types get a 200 no-op response so GitHub does not redeliver them.
  • Webhook lookup and binding storage now agree on a canonical bare ref form: a Pydantic field_validator(mode="before") strips the refs/heads/ and refs/tags/ prefixes from DashboardTemplateBindingCreate.github_ref on input, and the push processor strips the same prefixes from payload["ref"] before the binding lookup. Both sites call one shared normalize_github_ref helper. Fixes a silent webhook-ignore where a binding stored as main never matched an incoming refs/heads/main event.
  • Org-admin binding endpoints carry richer responses for operator diagnostics: commit_sha (latest synced commit, null until first success), web_url (clickable GitHub browse URL), last_sync_queue_job_url (back-pointer to the most recent dashboard_sync queue job, populated atomically inside the enqueuer and surfaced by a left-join on read). HATEOAS dashboard_template_url lands on the parent Organization and Project responses too.
  • Force-sync moves out of /admin/ to POST /orgs/{org}/dashboard-template/sync and POST /orgs/{org}/projects/{project}/dashboard-template/sync — same response shape, but org admins can call them without super-admin escalation, and the public API no longer leaks integer database ids in the URL.
  • Startup validation: API service lifespan and arq worker startup mint a JWT and call GET /app once when all three GitHub App secrets are configured. On any failure (malformed key, 401 from GitHub, etc.) the validator logs a structured ERROR, flips a process-wide github_app_validated flag off, and the binding endpoints + webhook return their feature-disabled response for the lifetime of the process — operators see misconfiguration in startup logs instead of a worker traceback after the first dequeue. JWT-key parse errors and any other unanticipated syncer exception now also flip the binding to last_sync_status="failed" with a descriptive last_sync_error, so a misconfigured GitHub App can never leave a binding silently stuck at pending.
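The shared helper named above can be sketched minimally. Its module location is an assumption; the name and the prefix-stripping behavior come from the summary:

```python
def normalize_github_ref(ref: str) -> str:
    """Return the bare branch or tag name for a fully qualified Git ref."""
    for prefix in ("refs/heads/", "refs/tags/"):
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref  # already bare: normalization is idempotent
```

Because both the Pydantic validator and the push processor call the same function, a re-PUT with either form stores and matches the same bare ref.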

Validation steps

  • With GitHub App secrets unset, POST /docverse/webhooks/github responds 404.
  • With secrets set, an unsigned POST → 401; a POST signed with the wrong secret → 401; a correctly signed push event → 200 and lands a dashboard_sync job on the queue when at least one binding's root_path overlaps the changed paths.
  • A push that only touches paths outside every binding's root_path is accepted (200) but enqueues nothing — verify by counting dashboard_sync jobs before/after.
  • A truncated push payload (size > len(commits)) triggers the compare-API fallback and still correctly enqueues a sync when the compare-derived path set intersects a binding's root_path.
  • A signed event for a type we do not subscribe to (e.g. ping) returns 200 with no enqueue.
  • On API service startup with all three GitHub App secrets set and a valid GitHub App: the structured INFO log "GitHub App config validated" is emitted with app_id, and the binding endpoints + webhook continue to behave as enabled.
  • On API service startup with all three secrets set but a malformed private key (or an app_id that GitHub rejects with 401 on GET /app): the structured ERROR log "GitHub App config invalid; disabling feature" is emitted, the service still starts, and POST /docverse/webhooks/github thereafter responds 404.
  • On API service startup with any of the three secrets unset: no validator log is emitted (no GET /app is issued).
  • On arq worker startup, the same four scenarios produce the same observable log + feature-disabled behaviour.
  • PUT /docverse/orgs/{org}/dashboard-template returns a body with last_sync_queue_job_url pointing at the enqueued dashboard_sync job (URL ends with /queue/jobs/{base32_public_id}); a follow-up GET on the same endpoint returns the same URL until the next sync attempt overwrites it; a re-PUT that changes any source field flips the URL to the new job's id.
  • When the fire-and-forget enqueue raises (simulated by monkeypatching DashboardSyncEnqueuer.enqueue), the PUT still succeeds (201/200), the response's last_sync_queue_job_url is null, and the binding row's last_sync_status is failed with a descriptive last_sync_error.
  • PUT /docverse/orgs/{org}/dashboard-template with github_ref = "refs/heads/main" returns 201 and the response body shows github_ref = "main"; a follow-up GET returns the same canonical form. The same holds for refs/tags/v1.0 → v1.0. A re-PUT with the bare form is idempotent.
  • A signed push webhook with payload.ref = "refs/heads/main" enqueues exactly one dashboard_sync for a binding stored with the bare github_ref = "main" (the asymmetric case the prior fixtures missed). Same for refs/tags/<tag> → <tag>.
  • PUT /orgs/{org}/dashboard-template and GET /orgs/{org}/dashboard-template (and the project-override variant) return commit_sha (null before first sync, set after), web_url (correct for both root_path == "/" — no path segment — and a non-root path — appended without double slashes), and last_sync_queue_job_url.
  • POST /docverse/orgs/{org}/dashboard-template/sync and POST /docverse/orgs/{org}/projects/{project}/dashboard-template/sync respond 202 with binding_id, queue_job_id, and queue_job_url (URL ends in the job's id). An org admin (non-super-admin) can call both successfully; a super-admin can call both successfully (existing resolve_role() escalation path); a non-admin caller receives 403; calling either route on an org/project with no binding configured returns 404. The legacy POST /admin/dashboard-templates/{binding_id}/sync is gone (verify via the OpenAPI schema).
  • In the rendered /docverse/openapi.json (or /docverse/docs), GET/PUT/DELETE/POST(sync) /orgs/{org}/dashboard-template appear under the orgs tag; GET/PUT/DELETE/POST(sync) /orgs/{org}/projects/{project}/dashboard-template appear under projects.
  • GET /orgs/{org} and GET /orgs/{org}/projects/{project} (plus the org-list and project-list variants) return absolute dashboard_template_url values resolving to the binding endpoints.
  • Alembic migration n2o3p4q5r6s7 runs both upgrade head and downgrade -1 cleanly against a populated test database — the last_sync_queue_job_id column, FK constraint, and FK index appear after upgrade and disappear after downgrade.
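The signed/unsigned cases above can be driven by hand: GitHub's X-Hub-Signature-256 header is an HMAC-SHA256 of the raw request body keyed by the webhook secret. A small signing helper (the function name is hypothetical) makes the 200-vs-401 checks reproducible:

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Compute the value GitHub sends in X-Hub-Signature-256."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"
```

Posting the body with this header and the deployment's real secret should yield 200; the same body signed with any other secret should yield 401.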

References

Close the live-update path for GitHub-backed dashboard templates: a
new POST /docverse/webhooks/github endpoint verifies the HMAC, parses
the push payload, and enqueues one dashboard_sync job per binding
whose root_path was actually touched by the push. Truncated payloads
fall back to the GitHub compare API; mis-signed deliveries get 401;
deployments without GitHub App secrets configured see a 404.

Key decisions:
- Route HMAC validation through gidgethub.sansio.Event.from_http and
  dispatch through gidgethub.routing.Router rather than safir.github
  (which only exposes Pydantic webhook payload models — no router or
  HMAC helper, despite the issue body's hopeful "or the equivalent
  registration API" wording). The gidgethub router is held at module
  scope: registering a fresh router per request would defeat its
  registration model. Keeps the handler aligned with Times Square /
  Semaphore and avoids hand-rolling HMAC, satisfying user story 23.
- Feature-disabled (no GitHub App config) returns 404, not 503.
  Caught at handler entry by routing both
  ``get_github_webhook_secret`` and ``create_push_event_processor``
  through ``GitHubAppNotConfiguredError``. The 503 used by the
  org-admin binding endpoints (#235) would surface as a permanent
  GitHub redelivery loop on a misconfigured deployment; 404 hides
  the URL until secrets are wired and lets GitHub stop retrying.
- Added ``Factory.get_github_webhook_secret`` (public accessor) so
  the webhook handler does not reach into ``_github_webhook_secret``
  directly. Mirrors ``create_github_app_client``: validates that all
  three secrets are set and raises the same
  ``GitHubAppNotConfiguredError`` so the handler's single
  except-clause covers both paths.
- ``PushEventProcessor.process`` returns ``[]`` rather than raising
  on no-op cases (no bindings match repo/ref; no binding root_paths
  intersect changed paths; truncated payload with no before/after).
  The webhook handler's surrounding ``session.begin()`` + commit
  still runs on the empty path because logging the no-op outcome is
  cheap and the empty enqueue list is the natural success result —
  the Slack-integration loop in ``main.py`` makes the same trade.
- ``_root_path_matches`` normalises ``root_path`` exactly the way
  ``GitHubTreeFetcher._normalize_root`` does: ``"/"`` strips to ``""``
  and matches any non-empty changed-path set, while ``"templates/blue"``
  matches paths that start with ``"templates/blue/"`` or equal the
  bare directory entry. Keeps the webhook filter and the syncer's
  filter in lockstep so a binding never gets a sync enqueued for a
  path the syncer would then have ignored as out-of-scope.
- ``DashboardGitHubTemplateBindingStore.list_by_repo_ref`` is keyed
  by name (``github_owner``/``github_repo``/``github_ref``), not by
  numeric ID. Numeric-ID lookup is task #240's job; this slice
  matches by the same composite name index
  (``idx_dashboard_github_template_bindings_repo_ref``) that the PRD
  identifies as the push-lookup index for v1, with the rename-
  robustness ID-preferential lookup deferred per the PRD's "GitHub
  identity stability" section.
- The webhook handler's gidgethub router is registered with a single
  ``"push"`` callback. ``ping`` and any other event types arrive,
  signature-verify, and return 200 with no enqueue — gidgethub's
  ``dispatch`` is a fan-out to a possibly empty callback set, so
  unrelated events naturally no-op. This satisfies user story 23
  ("does not hand-roll HMAC or event parsing") and keeps GitHub's
  redelivery machinery quiet for events we deliberately ignore.
- Test fixture ``github_app_enabled`` toggles the three GitHub App
  secrets on the autouse ``context_dependency`` for the lifetime of
  one test and restores them on teardown. The default-disabled state
  preserved between tests is what makes the 404-feature-disabled
  test trivial: it just hits the URL with no fixture and asserts on
  the absence of GitHub config.

Next-iteration notes:
- Slice #240/#242 (rename-robustness) will add an ID-preferential
  binding lookup alongside ``list_by_repo_ref``. The push processor
  can then consult ``list_by_repo_id_ref`` first and fall back to
  the name-based path for un-synced bindings. The handler's plumbing
  does not need to change — only the processor's binding-lookup step.
- The handler currently registers only ``push`` events. Adding
  ``installation`` (created/deleted/suspend/unsuspend) and
  ``repository.renamed`` / ``repository.transferred`` /
  ``organization.renamed`` event handlers is task #242's scope; they
  plug into the same ``_event_router.register(...)`` decorator, so
  no further wiring change is needed. The
  ``create_push_event_processor`` factory method will likely sprout
  sibling ``create_installation_event_processor`` /
  ``create_rename_event_processor`` methods.
- Squarebot/Kafka event delivery (PRD §"Out of Scope") still goes
  through this same ``PushEventProcessor`` if it ever lands — the
  only thing the Squarebot path would change is the request entry
  point (a Kafka consumer instead of a FastAPI endpoint). The
  processor itself takes a raw payload dict, so swapping the input
  layer is a tracer-bullet exercise.
- Payload field extraction is duck-typed off the raw push dict
  (``payload["repository"]["owner"]["login"]`` etc.) rather than
  going through ``safir.github.GitHubPushEventModel``. The model
  only carries ``repository`` / ``installation`` / ``ref`` /
  ``before`` / ``after`` — it doesn't include the ``commits`` array
  that ``extract_changed_paths_from_push`` reads — so once we have
  to drop into the dict for ``commits`` and ``size`` anyway, parsing
  the rest off the dict is the simpler shape. A future refactor
  could parse a richer model and pass it through; the dict path is
  the minimum needed today.
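The duck-typed extraction described above can be sketched as follows. The function name appears in the source; the return shape and exact truncation test are assumptions reconstructed from the validation notes (``size > len(commits)``, plus GitHub's 20-commit cap on push payloads):

```python
def extract_changed_paths_from_push(payload: dict) -> tuple[set[str], bool]:
    """Collect changed paths from the in-payload commit list.

    Returns (paths, truncated); a truncated payload signals the caller
    to fall back to the GitHub compare API.
    """
    commits = payload.get("commits") or []
    size = payload.get("size", len(commits))
    truncated = size > len(commits) or len(commits) >= 20
    paths: set[str] = set()
    for commit in commits:
        for key in ("added", "modified", "removed"):
            paths.update(commit.get(key) or [])
    return paths, truncated
```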

Closes #238
Three review-driven polishes on the GitHub webhook entry point:
(1) type ``_lowercase_headers`` against ``Mapping[str, str]``
(FastAPI's ``Request.headers`` already satisfies it, so the
``hasattr`` guard was dead code); (2) collapse
``Factory.get_github_webhook_secret`` and
``create_push_event_processor`` into a single
``create_webhook_dispatch`` accessor so the handler's
``except GitHubAppNotConfiguredError`` wraps one call site; and
(3) drop the redundant ``delivery_id`` field from the ``Processed
push webhook`` log line — ``github_delivery_id`` is already bound on
the context logger and the duplicate key fragmented log greppability.

Key decisions:
- ``create_webhook_dispatch`` returns the bare ``(secret, processor)``
  tuple rather than a small dataclass / namedtuple. The handler is
  the only caller and immediately destructures the pair, so giving
  the bundle a name would be ceremony for one call site. If a
  second caller materializes (e.g. a Squarebot-mediated webhook
  path), promoting to a typed container is the cheap follow-up.
- The new accessor performs its own three-secret presence check up
  front rather than letting ``create_github_app_client`` raise from
  inside the processor build. This keeps the single
  ``GitHubAppNotConfiguredError`` raise the handler observes
  cheap (no transitive imports / object construction before we
  decide the feature is off) and matches the pattern the displaced
  ``get_github_webhook_secret`` already used.
- Both replaced methods (``get_github_webhook_secret`` and
  ``create_push_event_processor``) had no other call sites, so they
  are deleted outright rather than left as deprecated shims. CLAUDE.md
  guidance is to skip backwards-compatibility hacks when the codebase
  has the only callers in scope.

Next-iteration notes:
- None. The rename-robustness handlers (#240/#242) will add sibling
  ``create_installation_event_processor`` /
  ``create_rename_event_processor`` factory methods that follow the
  same single-accessor shape — they don't need to thread through
  ``create_webhook_dispatch``.

Closes #266
Two test-infrastructure improvements removing direct private-attr
access from the webhook test suite (and the broader test suite that
copied the same shape). ``ContextDependency.set_github_secrets`` lets
the ``github_app_enabled`` fixture (and any future caller) toggle the
three GitHub-App secret slots without poking the singleton's private
attributes, and a new ``tests/support/arq_testing.py`` module
centralizes ``MockArqQueue._job_metadata`` reads behind
``count_jobs_by_name`` / ``get_jobs_by_name`` / ``queue_names``
helpers.

Key decisions:
- ``set_github_secrets(*, app_id, private_key, webhook_secret)`` is
  the supported test-fixture toggle, not ``initialize``. The existing
  async ``initialize`` covers the broader process-lifetime
  configuration (eight optional kwargs, including
  ``user_info_store`` / ``credential_encryptor`` / ``discovery``) and
  has the "only honor non-None values" semantics that the autouse
  ``app`` fixture needs at startup. A test fixture toggling the
  feature on for one test and off again on teardown wants the
  opposite contract: ``None`` should clear the slot. The new method
  is sync, scoped to the three GitHub-App fields, and treats every
  argument as authoritative.
- The migration scope went past the three call sites the issue
  body listed (``webhooks_github_test``, ``queue_backend_test``,
  ``build_processing_test``) and swept up the other six tests that
  had copy-pasted the same ``[j for queue in mock._job_metadata.values()
  for j in queue.values() if j.name == NAME]`` shape (``publish_edition_test``,
  ``edition_patch_override_test``, ``edition_rollback_test``,
  ``admin_dashboard_template_test``, ``dashboard_template_test``,
  ``dashboard_enqueue_triggers_test``). Without that the acceptance
  criterion's ``grep -R "_job_metadata" tests/`` would still hit the
  copies and the centralization would be only nominally complete. The
  remaining grep matches in ``tests/storage/queue_backend_test.py``
  are calls to the *public* ``ArqQueueBackend.get_job_metadata``
  method, not private-attr reads — they are unrelated to the
  centralization the issue is asking for.
- Added a third helper ``queue_names(arq_queue) -> set[str]`` so
  ``build_processing_test``'s ``"arq:queue" not in
  mock_arq._job_metadata`` invariant (jobs landed under the
  configured queue name, not arq's default) can be expressed without
  reaching for the private attribute. The issue body listed only
  ``count_jobs_by_name`` and ``get_jobs_by_name``; the third helper
  is the cheapest way to satisfy the acceptance grep without
  inventing a workaround at one call site.
- Added a fourth helper ``register_queue(arq_queue, name)`` purely so
  ``arq_testing_test.py`` can exercise the multi-queue branch of
  ``get_jobs_by_name`` / ``count_jobs_by_name`` without needing to
  touch ``_job_metadata`` itself. ``MockArqQueue`` only auto-creates
  the slot for its ``default_queue_name``; enqueueing into any other
  queue raises ``KeyError`` until the slot is added. This keeps the
  helper's own test on the public-helper side of the centralization
  line.
- The ``github_app_enabled`` fixture still reads
  ``context_dependency._github_app_id`` (etc.) when capturing the
  prior state for restoration on teardown. The acceptance criterion
  is scoped to "private-attr *assignment*"; the issue's
  out-of-scope follow-up note covers the broader read-of-private-
  attr cleanup. Adding read accessors today would be API-creep
  without a paying caller.
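The helper shapes can be sketched from the list comprehension quoted above. The three names come from the issue; the bodies are assumptions, and ``_job_metadata`` (queue name → {job id: metadata}) is safir's internal shape that the module deliberately centralizes:

```python
def get_jobs_by_name(arq_queue, name: str) -> list:
    """All job-metadata records whose job name matches, across queues."""
    return [
        job
        for queue in arq_queue._job_metadata.values()
        for job in queue.values()
        if job.name == name
    ]


def count_jobs_by_name(arq_queue, name: str) -> int:
    """Number of enqueued jobs with the given name."""
    return len(get_jobs_by_name(arq_queue, name))


def queue_names(arq_queue) -> set[str]:
    """Names of all queues the mock has slots for."""
    return set(arq_queue._job_metadata)
```

Callers assert on counts (``count_jobs_by_name(mock, "dashboard_sync") == 1``) instead of reaching into the private attribute themselves.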

Next-iteration notes:
- ``tests/conftest.py:119-125`` still pokes
  ``_superadmin_usernames`` and ``_user_info_store`` directly on the
  singleton. The issue body called this out explicitly as a parallel
  follow-up. The pattern needed is the same — a public
  ``set_admin_overrides(...)`` (or two narrow setters) on
  ``ContextDependency`` and a fixture migration. Out of scope here.
- The ``register_queue`` helper exists today only because the
  helper's own test needs to enqueue into a non-default queue. If
  a future production-side test materializes that legitimately
  needs more than one queue (rather than just inspecting an
  off-default-queue assertion), promote ``register_queue`` to a
  documented part of the helper API; otherwise it can stay as the
  small one-liner it is.
- Six adjacent test files now import from
  ``tests.support.arq_testing``. If ``safir.arq`` ever exposes a
  public iteration API, the four ``# noqa: SLF001`` lines inside
  ``arq_testing.py`` are the single place to swap the implementation
  — no caller-side change needed.

Closes #269
Both create_github_app_client and create_webhook_dispatch hand-rolled
the same three-way "is None" check on the GitHub App secrets and
raised an identical GitHubAppNotConfiguredError. Collapse the two
sites onto a single private _require_github_app_config helper that
returns the resolved (app_id, private_key, webhook_secret) tuple
typed as non-None, so a future configuration-shape change cannot
drift between the two callers.

Key decisions:
- The helper returns a bare tuple rather than a dataclass / NamedTuple.
  Both call sites destructure once and ignore the slots they don't
  need (``_, _, webhook_secret`` and ``app_id, private_key, _``). A
  named container would be ceremony for two call sites whose use of
  the unused slots is a single underscore. If a third caller
  materializes (e.g. installation/rename event processors per task
  #242) and starts paying attention to a third field, promoting to a
  named bundle is the cheap follow-up.
- The helper is private (``_require_*``) because it leaks the bare
  ``SecretStr`` instances rather than their decoded values. The two
  callers each handle ``.get_secret_value()`` themselves at the seam
  where the secret crosses into the GitHub-App / HMAC-verifier
  boundary, which keeps the helper safe to call from any future
  factory method without deciding the secret's lifetime.
- The HTTP-client check stays inline in each caller. Only
  ``create_webhook_dispatch`` and ``create_github_app_client`` need
  the shared client, and each raises a method-specific
  ``RuntimeError`` message; folding that into the helper would force
  every future caller to also need an HTTP client (e.g. an
  HMAC-only verifier doesn't), so the responsibility stays scoped.

Next-iteration notes:
- Slice #242's rename / installation event processors will need the
  same three secrets; they can call ``_require_github_app_config``
  directly. Promoting the helper to public (drop the underscore) is
  not necessary unless a non-Factory caller wants the same
  precondition check.

Closes #267
Two style fixes in push_processor.py from PR #264 review: narrow
_root_path_matches's changed_paths parameter from Iterable[str] to
Sequence[str] so the empty-check can use bool(changed_paths), which is
sound on a sequence but always True on a generator that any() would
have consumed; and collapse process()'s payload.get("repository") or
{} / repo.get("owner") or {} chain to the cleaner two-arg
.get(..., {}) form.
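The hazard is easy to reproduce in isolation: ``any()`` consumes a generator (at least partially), after which ``bool()`` on the same object is unconditionally True, whereas both operations are sound and non-destructive on a sequence:

```python
# An empty generator is still a truthy object, so bool() is unsound
# as an empty-check on anything typed Iterable:
empty = (p for p in [])
assert not any(True for _ in empty)  # correctly reports "no items"
assert bool(empty)                   # the exhausted generator is truthy

# On a Sequence, bool() is sound and does not consume anything:
paths: list[str] = []
assert not bool(paths)
assert not any(True for _ in paths)  # a list can be re-iterated safely
```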

Key decisions:
- Sequence (not list) is the narrowest type that supports bool() and
  still accepts both call sites — both pass the materialized list[str]
  returned by _resolve_changed_paths, so no caller change is needed.
  Using list[str] would have over-constrained the contract for no
  caller benefit.
- The any(True for _ in ...) pattern is correct on the current
  list[str] callers (a list is iterable and re-iterable), but the
  Iterable[str] type signature advertised support for generators,
  where any(True for _ in g) consumes the generator and bool(g) on
  the same generator is unconditionally True. Narrowing the type +
  switching to bool() forecloses that latent bug rather than
  documenting around it.

Next-iteration notes:
- None. The two-arg .get(..., {}) form is now the convention in this
  module; future processors added under dashboard_templates/ should
  follow the same shape.

Closes #268
The tests/** ruff per-file-ignores already cover SLF001, so the
inline # noqa: SLF001 markers next to MockArqQueue._job_metadata
accesses were unused. Removing them lets ruff format reflow two
short call sites onto one line.
@jonathansick force-pushed the tickets/DM-54689-github-webhook branch from d2bd6ef to ec9eb27 on April 27, 2026 at 16:43
Replace ``_split_full_name`` and ``_repo_from_full_name`` (each
returning one half of a parsed ``owner/repo`` string) with a single
``_split_full_name_tuple`` returning ``tuple[str, str] | None``. The
two ``process()`` call sites destructure the tuple once with a
``(None, None)`` fallback so the existing ``or``-chain semantics
(prefer ``repository.owner.login``/``repository.name``, fall back to
the parsed full_name) are preserved without parsing the same
``full_name`` twice.

Key decisions:
- Destructure once at the top of the block into ``fallback_owner`` /
  ``fallback_repo`` rather than calling the helper inline twice. The
  inline form would have re-parsed the same ``"full_name"`` string
  for each half — the duplication that motivated the original two
  helpers and still motivates them under one name. The
  ``or (None, None)`` fallback keeps the call sites' ``or``-chain
  truthy-test on a plain ``str | None`` rather than an
  ``Optional[tuple]`` membership check.
- The helper signature stays ``full_name: object`` (not ``str | None``)
  to match the duck-typed payload-dict access pattern that the rest
  of ``process()`` uses on ``payload.get(...)`` /
  ``repo.get("full_name")``. Tightening to ``str | None`` would push
  ``isinstance``/cast noise into the caller for no caller benefit.
- The collapsed helper still validates ``"/"`` membership via
  ``isinstance(full_name, str) and "/" in full_name`` before
  splitting, so a missing or malformed ``full_name`` returns ``None``
  exactly as the prior pair did. ``str.split("/", 1)`` on a string
  containing ``"/"`` is guaranteed to yield two items, so the
  ``owner, repo = ...`` unpack is total.

Next-iteration notes:
- None. The module-level helper count drops from two to one and the
  call site shrinks by one line; no follow-up scaffolding is left.

Closes #270
PR #264 has unit-level coverage for both the truncation signal in
extract_changed_paths_from_push and the compare-API fallback in the
push processor, but neither path is exercised end-to-end through the
HMAC-validated webhook handler. Add two handler tests: a truncated
push (size=30, one in-payload commit limited to docs/) seeds a compare
response that does land inside the binding's root_path and asserts
exactly one dashboard_sync enqueue, and an empty-commits push
(size=0, commits=[]) asserts zero enqueues plus zero compare-API
requests via mock_github.router.calls.

Key decisions:
- Extended the existing _push_payload helper with explicit ``commits``
  and ``size`` overrides rather than constructing two ad-hoc payload
  dicts. The in-tree style (PushEventProcessor's own service tests)
  already exposes both knobs on a similar helper, and the helper
  caller still defaults to the one-commit / size=1 happy path that
  every existing test relies on. Mutating the existing keyword surface
  preserves the four older callers without churn.
- The empty-commits assertion checks ``mock_github.router.calls``
  instead of registering and asserting on a respx route handle. The
  ``seed_compare`` helper does not return its Route, and the issue's
  acceptance criterion is "compare route saw zero requests" — a
  call-list filter on ``"/compare/"`` in the URL path satisfies that
  with no signature change to ``seed_compare``. The assertion is
  scoped to compare-shaped URLs so it tolerates the always-mocked
  Repertoire-discovery and installation-token traffic that shares the
  same router.
- Truncated-push payload uses ``size=30`` (>20) with a single
  in-payload commit. Either ``size > len(commits)`` or
  ``len(commits) >= 20`` would trigger the fallback; ``size=30`` is
  the more readable signal and matches the shape used by the existing
  push_processor service tests so the two layers stay legible
  side-by-side. The in-payload commit deliberately reports only
  ``docs/index.md`` (with ``root_path="/"`` the inside/outside
  distinction does not matter today, but under a future path
  restriction this asymmetry guarantees the compare call is what
  actually surfaces the template path).
- The empty-commits test uses ``root_path="/"`` (not a nested root)
  so the assertion isolates the empty-changed-path branch of
  ``_root_path_matches`` (``bool(changed_paths)`` is the gate) rather
  than the prefix branch. A nested root would also pass but would
  conflate two reasons for not enqueueing.

Next-iteration notes:
- None. The handler-level coverage now matches the service-level
  coverage for both truncation paths; rename-handler / installation-
  event tests landing under #240/#242 will follow the same pattern
  (``github_app_enabled`` fixture + ``count_jobs_by_name`` +
  ``mock_github.router.calls`` filter) without needing further
  helper plumbing.

Closes #271
A misconfigured GitHub App private key surfaced as
``jwt.exceptions.InvalidKeyError`` from
``GitHubAppClientFactory.get_app_jwt`` and propagated past the syncer's
narrow ``except`` (``httpx.HTTPError | gidgethub.GitHubException``) into
the worker's outer ``except Exception``. That branch failed the queue
job but never touched the binding, so ``last_sync_status`` stayed at its
server-default ``"pending"`` forever — silently lying about the
deployment's GitHub-App config. Two layered fixes: add
``jwt.exceptions.InvalidKeyError`` to the syncer's auth catch (so the
existing ``_record_failure`` path writes a friendly ``last_sync_error``)
and extend the worker's outer ``except Exception`` to also call
``binding_store.update_sync_state(status="failed", error=...)`` before
failing the queue job.

Key decisions:
- Broadened the auth catch's failure message from "GitHub App
  installation lookup failed" to "GitHub App authentication failed".
  All three exception types now caught (httpx, gidgethub, JWT key
  errors) are auth-time failures — the JWT-key branch never reaches
  the installation lookup, so the older wording would have been
  misleading for one of the three. The existing two callers
  (``httpx.HTTPError``, ``gidgethub.GitHubException``) read more
  accurately under the broader phrasing too, since auth failures
  before the lookup also count.
- Two separate ``session.begin()`` blocks in the worker's safety net
  (binding update first, queue-job fail second). The user-visible
  state on the binding is what operators read to diagnose; flipping
  it before the queue-job write means a transient
  ``queue_job_store.fail`` failure cannot leave the binding still
  reporting ``"pending"``. Matches the approach sketch in #276.
- Worker error message uses ``f"{type(exc).__name__}: {exc}"`` (e.g.
  ``"RuntimeError: unexpected syncer bug"``) rather than ``str(exc)``
  alone. The syncer already writes friendly per-cause messages for
  the cases it anticipates; this branch only fires for the cases it
  doesn't, so a programming-error class name is the most useful piece
  of context an operator can read out of ``last_sync_error``.
- Test for the JWT path drives a real ``GitHubAppClientFactory`` with
  a malformed PEM rather than monkeypatching ``get_app_jwt`` to raise
  ``InvalidKeyError``. The point of the test is that the real pyjwt
  → gidgethub → safir.github call chain surfaces the exception type
  the syncer's tuple expects; mocking the raise would have asserted
  only on the catch tuple, not on the real propagation path that
  motivated this fix.
- Test for the worker safety net monkeypatches ``DashboardTemplateSyncer.sync``
  directly to raise ``RuntimeError``. The point is "what happens when
  the syncer raises *anything* unexpected" — going through a
  realistic exception-producing dependency (e.g. a bad key, plus
  reverting the sync.py fix) would conflate two things and break
  symmetry with the existing
  ``test_sync_unexpected_*_error_propagates`` tests, which already
  use the same pattern at the service-test layer.

Next-iteration notes:
- None. The two layers (sync.py narrow catch, dashboard_sync.py
  safety net) cover both the specific JWT case and any future
  programming error / library exception the syncer doesn't catch.
  Future per-cause messages (e.g. distinguishing
  ``InvalidKeyError`` from ``DecodeError``) can refine the syncer's
  error string in place without touching the worker.

Closes #276
Split ``handlers.orgs.dashboard_template`` into two routers — one for
the org-default ``/orgs/{org}/dashboard-template`` endpoints and one
for the project-override
``/orgs/{org}/projects/{project}/dashboard-template`` endpoints —
and include each on ``orgs_router`` with its own ``tags=`` argument
so the org-default routes carry ``orgs`` and the project-override
routes keep ``projects``. This brings the dashboard-template routes
in line with the convention the rest of ``handlers/orgs/__init__.py``
already follows (members / credentials / services tagged ``orgs``;
project-nested routes tagged ``projects``).

Key decisions:
- Split the module-level router rather than applying per-route
  ``tags=[...]`` overrides on each org-default endpoint. The PRD's
  approach sketch flagged the per-route alternative as cleaner for
  this single endpoint pair but more brittle for the future
  ``POST /orgs/{org}/dashboard-template/sync`` endpoint, which will
  also need ``tags=["orgs"]``: the split-router approach gives that
  next slice a no-decision home (``org_default_router.post(...)``)
  whereas the per-route approach forces every new contributor to
  remember to repeat the override. Cost is one extra ``include_router``
  call in ``__init__.py``; benefit is the handler module has the
  tag-grouping shape baked into its structure rather than spread
  across six decorators.
- Route names (``get_org_dashboard_template`` etc.) are unchanged —
  the issue's acceptance criterion explicitly calls out
  ``request.url_for(...)`` callers, and FastAPI keys named routes by
  the ``name=`` kwarg on the decorator (independent of which
  ``APIRouter`` registered them) so the rename of the router objects
  has no caller-visible effect. Verified by running the full suite —
  ``DashboardTemplateBindingResponse.from_domain`` and the admin-side
  ``request.url_for("get_org_dashboard_template", ...)`` calls all
  resolve through the same router-name registry.
- The new test reads the live ``/docverse/openapi.json`` schema rather
  than introspecting ``app.openapi()`` directly. The HTTP path
  exercises the same FastAPI app the real clients hit and matches the
  existing ``AsyncClient``-based test conventions in this file; the
  in-process ``app.openapi()`` call would have been slightly cheaper
  but would have required either a new fixture or a direct import of
  ``docverse.main.app`` (which the test file currently avoids).

Next-iteration notes:
- The next slice's ``POST /orgs/{org}/dashboard-template/sync`` route
  attaches to ``org_default_router`` and will inherit ``tags=["orgs"]``
  with no further wiring. The project-override sync route, when it
  lands, attaches to ``project_override_router`` and inherits
  ``tags=["projects"]``. No follow-up needed in this module for tag
  bookkeeping.

Closes #279
A misconfigured GitHub App (bad private key, wrong app_id, etc.)
previously surfaced only on the first dashboard_sync dequeue — operators
read the failure out of a worker traceback. Shift the failure left:
mint a JWT and call GET /app once at API service lifespan startup and
arq worker startup; on any exception, log a structured error and flip
a process-wide github_app_validated flag off so the binding endpoints
and webhook return their feature-disabled response for the lifetime
of the process. The service still starts.

Key decisions:
- Shared validator helper in storage/github/startup.py duck-typed
  against a GitHubAppValidationState protocol (github_app_enabled,
  github_app_id, set_github_app_validated). Both ContextDependency and
  WorkerFactoryBuilder implement the protocol so one helper covers the
  API lifespan and the worker startup hook. Building a sessionless
  Factory just to call create_github_app_client() would have required
  a stub session; the helper builds the GitHubAppClientFactory and
  GitHubAppClient directly from the same secrets the Factory pathway
  uses.
- The validator catches bare Exception. The two known failure modes
  (jwt.exceptions.InvalidKeyError on parse, gidgethub.GitHubException
  on a non-2xx /app response) are both subclasses of Exception, but
  the safir/pyjwt/gidgethub stack also raises raw httpx errors
  (connection refused, DNS failure) and a future safir update could
  surface a fourth exception type. The acceptance criterion is "log
  and disable, don't crash the service" — narrowing the catch would
  trade a clear behaviour for a brittle list of types.
- Factory._require_github_app_config gained a second raise leg that
  fires when github_app_validated is False. Single check site keeps
  the binding endpoints + webhook in sync, and lets the gate raise
  the same GitHubAppNotConfiguredError type the existing
  secrets-unset gate raises — handlers don't need a second except
  clause.
- set_github_secrets resets github_app_validated to True. A rotated
  secret bundle has not yet been rejected by the startup validator;
  the prior False state belongs to the prior bundle. This keeps the
  test fixture pattern (set_github_secrets then run a request)
  honest without a separate set_github_app_validated(value=True)
  call at every fixture seam.
- The lifespan + worker startup integration tests run their own
  LifespanManager / WorkerFactoryBuilder against a monkey-patched
  config. They cannot reuse the conftest ``app`` fixture because that
  one runs the lifespan with the real (unset) config. tests/main_test.py
  adds an autouse fixture that snapshots and restores the
  context_dependency singleton's secrets + validated flag so a
  passing-secrets test cannot pollute a later-running
  unset-secrets test.
- The lifespan test for the malformed-key case asserts on the
  webhook endpoint's 404 (not the binding PUT endpoint's response).
  Today's binding endpoints don't actually consult
  _require_github_app_config — try_enqueue_dashboard_sync would
  enqueue the job and the worker would surface the failure. The 404
  on the webhook endpoint is the observable user-visible
  acceptance signal; the binding endpoints' behaviour matches the
  secrets-unset case (both succeed at enqueue, both fail at the
  worker) by construction.
- Boolean setters (set_github_app_validated) are keyword-only
  (``*, value: bool``). Ruff's FBT001/FBT002 rules apply across the
  codebase and this matches the existing set_github_secrets pattern
  on ContextDependency.

Next-iteration notes:
- The binding endpoints (PUT /orgs/{org}/dashboard-template etc.)
  do not yet consult _require_github_app_config at the handler
  layer — the design doc's 503 feature-disabled response is not
  implemented for them today. Adding that gate is a separate slice;
  this validator already feeds a False decision through the same
  Factory checkpoint, so a future binding-endpoint gate would
  inherit the validated-False behaviour with no further
  context_dependency or worker plumbing.
- The worker startup test file (tests/worker/worker_startup_test.py)
  drives the validator against a real WorkerFactoryBuilder rather
  than running docverse.worker.main.startup end-to-end. The full
  startup requires a live Redis (RedisArqQueue.initialize); a
  Redis-backed worker startup test would belong in tests/integration
  if it ever materialises.

Closes #278
Follow-up to the prior commit's startup validator: the GitHubAppValidationState
protocol's three abstract members needed docstrings (D102), the bare-Exception
catch needed an explicit BLE001/TRY400 waiver (per the prior commit's reasoning
about not narrowing the catch list), and the test fixture's default
``webhook_secret`` needed an S107 waiver. Also drops drive-by SLF001 # noqa
comments ruff flagged as unused (the tests/** per-file-ignores already cover
SLF001).

Key decisions:
- Three # noqa waivers (BLE001, TRY400, PLR0913) on the validator carry the
  reasoning inline. BLE001 is the load-bearing one — narrowing the catch to
  ``(InvalidKeyError, GitHubException, httpx.HTTPError)`` would miss any
  future safir/pyjwt update that surfaces a fourth type, regressing the
  acceptance criterion ("don't crash the service"). TRY400 (use logging.exception)
  would attach a traceback that would clutter operator logs without adding
  context — error_type and str(exc) already carry the diagnostically useful
  fields. PLR0913 is on a startup-only function called from two sites; the
  six kwargs are all distinct values that the call sites need to thread
  through, so packing them into a dataclass would be ceremony without a
  payoff.
- The S107 waiver on the ``webhook_secret`` fixture default covers a
  test-only placeholder value, not a real password. The fixture is in tests/main_test.py
  where the standard tests/** S106 ignore already applies; S107 (function
  default) is a separate code that needs its own waiver.

Next-iteration notes:
- None. This is purely a lint cleanup follow-up to the prior commit; no
  behaviour change.

Closes #278
Add a nullable last_sync_queue_job_id FK on
dashboard_github_template_bindings, populate it inside
DashboardSyncEnqueuer.enqueue, and surface the queue-job URL on every
binding response (PUT/GET org-default and project-override) plus the
super-admin force-sync response. An operator who reads
last_sync_status="failed" can now click straight through to the
queue-job traceback without correlating the binding with a separate
job ID.

Key decisions:
- The FK uses ondelete=SET NULL so queue-job retention pruning never
  breaks the binding row. The column is nullable permanently — pre-
  existing rows have no job, a freshly-created binding has no job
  until the enqueuer runs, and a pruned-job row legitimately reverts
  to NULL. No backfill is needed and the migration is reversible.
- The binding store's read methods left-join queue_jobs and
  materialize public_id in the same SELECT, so the response layer
  builds last_sync_queue_job_url without a second round-trip. The
  ``Select[...]`` return type on the join helper is widened to
  ``Any`` because SQLAlchemy's typed-Select claims public_id is
  ``int`` (the column is non-nullable on the queue_jobs side); a
  LEFT OUTER JOIN can yield ``int | None`` and the type stub cannot
  express that distinction without overloads on outerjoin.
- ``set_last_sync_queue_job`` issues a Core UPDATE that explicitly
  preserves date_updated by passing
  ``date_updated=SqlBinding.date_updated`` in the values dict. The
  column's onupdate=now() server default fires on every ORM mutation,
  but operator-visible date_updated tracks source-coordinate changes
  (owner/repo/ref/root_path) — a sync-bookkeeping write should not
  bump it. Without this, the existing PUT-idempotency invariant
  (test_*_put_is_idempotent_no_op) regresses: the first PUT's
  in-memory result.binding shows the pre-enqueue date_updated, while
  the second PUT's re-fetch from DB shows the post-enqueue
  date_updated, and the two responses no longer match.
- ``try_enqueue_dashboard_sync`` returns ``QueueJob | None`` instead
  of ``None``: success surfaces the just-created job so the PUT
  handler can attach its public_id to the response without a second
  fetch. Failure returns None and the handler keeps the prior FK
  visible (or None) — matching the existing
  preserve-last-good pattern that github_template_id already uses.
  Existing test_*_put_enqueue_failure_marks_binding_failed callers
  still see status="failed".
- ``_attach_queue_job`` lives in the handler module rather than as a
  ``from_domain`` kwarg because the response model takes a domain
  object and reads ``last_sync_queue_job_public_id`` off it; threading
  a separate queue-job kwarg through ``from_domain`` would have made
  every caller (admin GET-by-id-style sites included) decide whether
  it has a queue job to attach. Keeping the override scoped to the
  two PUT handlers that own the enqueue flow is narrower.
- The admin force-sync response gains queue_job_url constructed the
  same way as DashboardRebuildResponse.from_queue_job — same
  ``request.url_for("get_queue_job", ...)`` route name. Operators can
  click through without parsing queue_job_id by hand, matching the
  org-binding endpoints' shape.
- The new migration n2o3p4q5r6s7 lands a separate FK index
  (idx_dashboard_github_template_bindings_last_sync_queue_job_id) so
  queue-job retention's ``DELETE FROM queue_jobs WHERE …`` can use
  the FK index to find dependent binding rows during the SET NULL
  cascade. PostgreSQL doesn't auto-index FK columns, and a binding
  table with thousands of rows would otherwise force a sequential
  scan on every queue-job delete.
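
The date_updated-preserving UPDATE and the read-side left join can be sketched against a toy schema (SQLAlchemy Core; the table names and the counter-valued ``onupdate`` are test stand-ins for the real PostgreSQL ``now()`` columns):

```python
import sqlalchemy as sa

metadata = sa.MetaData()

queue_jobs = sa.Table(
    "queue_jobs",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("public_id", sa.String, nullable=False),
)

bindings = sa.Table(
    "bindings",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    # Counter stand-in for an onupdate=now() timestamp column.
    sa.Column("date_updated", sa.Integer, default=0, onupdate=lambda: 999),
    sa.Column(
        "last_sync_queue_job_id",
        sa.Integer,
        # SET NULL so queue-job retention pruning never breaks the row.
        sa.ForeignKey("queue_jobs.id", ondelete="SET NULL"),
        nullable=True,
    ),
)


def set_last_sync_queue_job(conn, binding_id: int, job_id: int) -> None:
    """Attach the queue-job FK without bumping date_updated."""
    conn.execute(
        sa.update(bindings)
        .where(bindings.c.id == binding_id)
        .values(
            last_sync_queue_job_id=job_id,
            # Re-assert the current value so the column's onupdate
            # default does not fire for this bookkeeping write.
            date_updated=bindings.c.date_updated,
        )
    )


def select_with_queue_job():
    """Left-join queue_jobs so public_id arrives in the same SELECT;
    after the outer join it is int | None despite the NOT NULL stub."""
    return sa.select(
        bindings.c.id, bindings.c.date_updated, queue_jobs.c.public_id
    ).select_from(bindings.outerjoin(queue_jobs))
```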

Next-iteration notes:
- ``update_sync_state`` (called by the dashboard_sync worker) still
  bumps date_updated on every status flip via the column's
  onupdate=now() default. That is pre-existing behaviour — this
  ticket is scoped to the enqueue-time bookkeeping write — but if a
  future task wants strict source-only date_updated semantics, the
  same pattern (``date_updated=SqlBinding.date_updated`` in the
  Core UPDATE values dict) extends naturally.
- Sync-failure semantics deliberately preserve the prior FK (mirroring
  github_template_id's preserve-last-good pattern). The acceptance
  criterion "last_sync_queue_job_url remains null" is exact for newly-
  created bindings; for an existing binding whose prior sync succeeded
  and whose next enqueue then fails, the URL still points at the
  prior queue_job. If a future task wants the FK cleared on failure
  too, the failure handler in ``try_enqueue_dashboard_sync`` is the
  single place to extend.
- ``last_sync_queue_job_public_id`` is materialized via a left join
  on every read. If a future workload turns out to be read-heavy
  enough that the join is a measurable cost, the column could
  alternatively be denormalized onto the binding row
  (``last_sync_queue_job_public_id`` stored directly). The current shape — FK +
  join — is preferred because it keeps the public_id authoritative
  on queue_jobs and lets the SET NULL cascade naturally invalidate
  pruned URLs without a second write path.

Closes #277
The dashboard-template binding response previously exposed neither the
commit SHA of the most recent successful sync nor a clickable URL into
the GitHub browse UI. Add both fields to the response model. ``web_url``
is derived from the binding's source coordinates (always present once
a binding exists, regardless of sync state); ``commit_sha`` is
materialized via a new left join from
``dashboard_github_template_bindings`` to ``dashboard_github_templates``
on ``github_template_id`` and stays ``None`` until the first successful
sync links a content row.

Key decisions:
- ``build_github_browse_url`` lives in ``storage/github/web_url.py``
  next to ``tree_fetcher.py`` so all GitHub-URL knowledge stays in one
  place. The helper ``strip("/")`` normalises the root path: ``"/"``,
  ``""``, and ``"//"`` all collapse to a bare ``/tree/{ref}`` URL with
  no trailing slash, while ``"/themes/blue"``, ``"themes/blue"``, and
  ``"/themes/blue/"`` all produce the same ``…/tree/{ref}/themes/blue``
  output. Refs containing ``/`` (``release/1.0``) pass through
  verbatim — GitHub's browse UI accepts embedded slashes in the
  ``/tree/...`` route without URL-encoding, and encoding the slash
  would break the click-through.
- ``commit_sha`` is exposed via a left join rather than a denormalised
  column on the binding row. Mirrors the existing
  ``last_sync_queue_job_public_id`` join pattern: the read methods
  materialise the joined column in the same SELECT so the response
  layer builds the field without a second round-trip, and the SET NULL
  cascade on ``github_template_id`` naturally invalidates a stale SHA
  when a content row is pruned. The single read-side join helper
  ``_select_with_joins`` replaces the prior ``_select_with_queue_job``
  and adds the second outer join; ``_to_domain`` now takes both joined
  columns and uses ``model_copy(update=…)`` only when at least one is
  non-NULL so the existing
  ``test_get_by_id_returns_none_public_id_when_no_queue_job``-style
  invariants still hold.
- The domain model gains ``commit_sha: str | None`` with
  ``default=None`` so ``DashboardGitHubTemplateBinding.model_validate(row)`` on a
  raw ``SqlDashboardGitHubTemplateBinding`` (the
  ``BindingStore.create`` path, which does not go through the join
  helper) keeps working unchanged: the SQL row has no ``commit_sha``
  attribute, but the field's default fills the gap.
- ``DashboardTemplateBindingResponse.from_domain`` reads
  ``domain.commit_sha`` (not the raw SQL row) and computes ``web_url``
  by calling ``build_github_browse_url`` directly; both org-default
  and project-override responses go through the same ``from_domain``
  classmethod so neither caller needs special-casing. The admin-side
  force-sync handler returns its own response shape and does not
  serialise the binding, so it is unaffected.
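
The helper's normalisation rules can be illustrated with a plausible reconstruction (the function name matches the description above; the body is a sketch, not the actual source):

```python
def build_github_browse_url(owner: str, repo: str, ref: str, root_path: str) -> str:
    """Build a GitHub browse URL from a binding's source coordinates.

    Sketch: "/", "", and "//" root paths all collapse to a bare
    /tree/{ref} URL with no trailing slash; refs with embedded
    slashes (release/1.0) pass through verbatim, un-encoded.
    """
    base = f"https://github.com/{owner}/{repo}/tree/{ref}"
    path = root_path.strip("/")
    return f"{base}/{path}" if path else base
```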

Next-iteration notes:
- The follow-up rename / installation event processors in #240 / #242
  do not touch the read-side join — they only mutate display strings
  and reachability flags on existing rows. No further plumbing is
  needed for this slice.
- ``commit_sha`` is exact for the synced commit, not for any later
  ref movement. If a future task wants "last successful sync commit"
  to remain pinned even after a binding is rebound to a different
  ref, the commit_sha column on ``dashboard_github_templates`` is
  already keyed by ``(owner, repo, ref, root_path)`` so the join
  follows the binding's current coordinates, not its sync history.
  This matches the design doc's overwrite-in-place semantics.

Closes #281
The dashboard-template binding endpoints were not discoverable from the
parent ``Organization`` or ``Project`` resource. Add a
``dashboard_template_url`` HATEOAS field on both response models so a
client browsing the org/project resource can follow the link straight to
the binding configuration. The field is populated from the existing
``get_org_dashboard_template`` / ``get_project_dashboard_template`` route
names — no new plumbing.

Key decisions:
- Field lands on the client base ``Organization`` and ``Project`` models
  (``client/src/docverse/client/models/{organizations,projects}.py``)
  rather than on the user-facing handler subclass only. The acceptance
  criterion explicitly enumerates four test surfaces (org-list, org-get,
  project-list, project-get) and the only org-list endpoint is the
  super-admin ``GET /admin/orgs`` which uses the admin
  ``Organization`` subclass — also extending ``_OrganizationBase``.
  Adding the field on the base means both subclasses inherit it
  uniformly and the URL is always present (the route exists regardless
  of whether the binding has been configured), so the contract matches
  the issue's "the URL itself is stable" framing. Cost: the admin
  ``from_domain`` needs a one-line populate; benefit: a uniform
  navigation link from any org response.
- The URL is computed via ``request.url_for(...)`` exactly like the
  sibling ``services_url`` / ``credentials_url`` / ``editions_url``
  fields. Because all three call-sites (user-facing org, admin org,
  project) live in the same FastAPI app and the binding route names
  (``get_org_dashboard_template``, ``get_project_dashboard_template``)
  are already registered by ``handlers/orgs/dashboard_template.py``,
  no router-include changes were needed.
- The client base ``Organization`` model uses
  ``ConfigDict(from_attributes=True)`` (no ``extra="forbid"``), so
  adding a required field does not break clients that deserialize
  server responses — they pick up the new field. ``OrganizationCreate``
  / ``OrganizationUpdate`` (which *do* use ``extra="forbid"``) are
  request-only models and do not gain the field, so callers cannot
  accidentally smuggle a HATEOAS URL into a write payload. Same
  reasoning applies to ``Project`` vs. ``ProjectCreate`` /
  ``ProjectUpdate``.
- Existing handler tests for the four surfaces gained one assertion
  each (``dashboard_template_url`` ends with the expected path and
  starts with ``http``), keeping the new coverage co-located with
  the existing per-endpoint URL assertions instead of fragmenting
  into a separate test module.
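
The response-model vs. request-model split can be sketched like this (field shapes are illustrative; only the two config choices matter):

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class Organization(BaseModel):
    """Response model: from_attributes, no extra='forbid', so adding
    a required field stays compatible with deserializing clients."""

    model_config = ConfigDict(from_attributes=True)

    slug: str
    dashboard_template_url: str  # HATEOAS link, always present


class OrganizationCreate(BaseModel):
    """Request model: extra='forbid' rejects a smuggled HATEOAS URL."""

    model_config = ConfigDict(extra="forbid")

    slug: str
```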

Next-iteration notes:
- None. The field is plumbed through the three response sites that
  return an ``Organization`` or ``Project`` model. Future
  ``OrganizationServiceSummary`` or similar embedded representations
  do not need this field — the link belongs on the top-level resource.

Closes #282
Replace ``POST /admin/dashboard-templates/{binding_id}/sync`` with two
slug-keyed routes scoped under the org/project the binding belongs to:
``POST /orgs/{org}/dashboard-template/sync`` re-syncs the org-default
binding and
``POST /orgs/{org}/projects/{project}/dashboard-template/sync``
re-syncs the project-override binding. Two motivations: the
public API must not expose database identifiers, and force-sync is a
routine operation that belongs with org admins, not super-admin only.

Key decisions:
- Both new handlers use the existing ``require_admin`` dependency.
  ``AuthorizationService.resolve_role()`` (services/authorization.py:31)
  already grants super-admins ``OrgRole.admin`` automatically, so a
  single ``require_admin`` covers "super-admin OR org-admin" without
  bespoke wiring — no second handler variant or dual-dep is needed.
  The slug-keyed URL already scopes the action to one org, so the
  super-admin escalation flows through the same code path the binding
  CRUD endpoints already use.
- The sync handlers do their own slug → binding resolution via
  ``DashboardTemplateBindingService.get_org_default`` /
  ``get_project_override`` (which raise ``NotFoundError`` → 404 when
  the binding is unconfigured), then hand the resulting binding id to
  ``DashboardSyncEnqueuer``. The double load (service + enqueuer) is
  intentional — keeping enqueue payloads keyed by binding id matches
  the worker contract and avoids leaking slug strings into the queue
  message — and the cost is one extra index hit per call.
- ``DashboardTemplateSyncEnqueuedResponse`` moved from
  ``handlers/admin/endpoints.py`` to ``handlers/orgs/models.py`` since
  the model is no longer admin-scoped. Added a
  ``from_queue_job(binding_id, queue_job, request)`` classmethod that
  mirrors the existing ``DashboardRebuildResponse.from_queue_job``
  shape so both POST handlers stay one line. The ``binding_id``
  field is preserved per the issue's "same response shape" acceptance
  criterion — the slug context is implicit in the URL but the integer
  id is still the authoritative pointer the worker logs against.
- The handlers attach to the same ``org_default_router`` /
  ``project_override_router`` split that #279 introduced, so the
  org-default sync POST inherits ``tags=["orgs"]`` and the project-
  override sync POST inherits ``tags=["projects"]`` with no per-route
  override needed. Matches the convention #279 established for the
  existing GET/PUT/DELETE endpoints on the same paths.
- The admin route + its tests are removed in the same change — the
  PRD has not shipped, so a deprecation alias would be churn for
  zero benefit. ``tests/handlers/admin_dashboard_template_test.py``
  is deleted and the equivalent coverage lands inline in
  ``tests/handlers/dashboard_template_test.py`` next to the binding
  CRUD tests it shares fixtures with.
- The OpenAPI tag test in ``test_dashboard_template_openapi_tags`` now
  also asserts the legacy admin path is gone (no
  ``/docverse/admin/dashboard-templates/{binding_id}/sync`` entry in
  ``paths``). Catches a future regression where a migration alias
  accidentally re-registers the old route.
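
The relocated response model's classmethod shape might look like this (``QueueJob`` and the URL resolver are stand-ins; the real handlers pass ``request.url_for``):

```python
from dataclasses import dataclass
from typing import Callable

from pydantic import BaseModel


@dataclass
class QueueJob:
    """Stand-in for the real queue-job record."""

    public_id: str


class DashboardTemplateSyncEnqueuedResponse(BaseModel):
    binding_id: int  # still the authoritative pointer worker logs use
    queue_job_url: str

    @classmethod
    def from_queue_job(
        cls,
        binding_id: int,
        queue_job: QueueJob,
        url_for: Callable[..., str],
    ) -> "DashboardTemplateSyncEnqueuedResponse":
        # Real code calls request.url_for("get_queue_job", ...); a
        # plain callable keeps this sketch self-contained.
        return cls(
            binding_id=binding_id,
            queue_job_url=url_for("get_queue_job", job_id=queue_job.public_id),
        )
```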

Next-iteration notes:
- The response still carries the integer ``binding_id`` field. If a
  later slice wants the public API fully purged of database ids, the
  field can drop with no further plumbing — the URL identifies the
  binding unambiguously and the queue-job public id is a separate
  base32-encoded value already.
- ``try_enqueue_dashboard_sync`` (services/dashboard_templates/
  enqueue.py:92) is the fire-and-forget variant the binding PUT
  handlers use; the new sync handlers call ``enqueuer.enqueue``
  directly because a force-sync's failure should bubble as a 500 and
  surface in the response, not be silently logged like a PUT-time
  enqueue. That asymmetry is deliberate and matches the prior admin
  route's behaviour.

Closes #280
…dings

The push webhook silently ignored real GitHub events because the binding
lookup compared refs as exact strings. GitHub push payloads always carry
``ref = "refs/heads/<branch>"``, but the binding store kept whatever the
operator typed — so a binding stored as ``main`` never matched the
incoming ``refs/heads/main`` event. Normalize at both write and read
seams: a Pydantic ``BeforeValidator``-style
``field_validator(mode="before")`` on
``DashboardTemplateBindingCreate.github_ref`` strips a
leading ``refs/heads/`` or ``refs/tags/`` prefix on input, and the push
processor strips the same prefixes from ``payload["ref"]`` before the
binding store lookup. Both sites call one shared
``normalize_github_ref`` helper.

Key decisions:
- The shared helper ``normalize_github_ref`` lives in
  ``client/src/docverse/client/models/dashboard_template.py`` (next
  to the model whose validator uses it), not in
  ``services/dashboard_templates/`` as the issue's approach sketch
  proposed. The client package is published independently of the
  server and its dependencies are restricted to ``pydantic``,
  ``httpx``, and ``click`` — importing from
  ``docverse.services.dashboard_templates`` into a client model
  would invert the package boundary and pull the entire server
  graph into client distributions. The server's ``push_processor``
  already imports ``DashboardTemplateBindingCreate`` from
  ``docverse.client.models``, so importing the helper from the same
  module is a no-cost extension. Mirror of the
  ``normalize_base_domain`` pattern in
  ``client/.../models/organizations.py``.
- ``mode="before"`` on the validator runs the prefix-strip *before*
  Pydantic applies the ``min_length=1`` / ``max_length=256``
  constraints from the ``Field(...)`` annotation, so
  ``refs/heads/`` alone (a fully-qualified ref with an empty branch
  name) normalizes to ``""`` and then fails ``min_length=1`` with
  the standard pydantic error message. Same for ``refs/tags/``.
  Length bounds stay 1–256.
- The push processor now logs ``"No bindings match push event"`` at
  ``info`` (was ``debug``) and includes both the raw and the
  normalized refs (``github_ref_raw`` + ``github_ref``). When a
  future operator hits a mismatch — different repo, mismatched
  branch, etc. — the ref pair shows up at the default verbosity
  instead of requiring a logging-level dial-up. The other two
  push-processor info lines stay on raw-ref logging because by the
  time they fire the lookup has already succeeded; the
  diagnostically interesting comparison is the no-match case.
- Existing test fixtures that seeded bindings with
  ``github_ref="refs/heads/main"`` via the storage-level
  ``DashboardGitHubTemplateBindingCreate`` dataclass switch to the
  bare ``"main"`` form. Those storage-direct creates bypass the
  Pydantic validator, so they were testing a state that cannot
  exist after the validator: in production every PUT runs through
  the API, every API write normalizes, and stored refs are always
  canonical. Updating the fixtures to canonical form keeps them
  faithful to production data and avoids adding a third
  normalization site at the storage layer.
- The two new ``test_process_matches_bare_*_against_refs_*_payload``
  tests in ``push_processor_test.py`` are the asymmetric coverage
  the prior fixtures didn't have: every existing fixture used
  ``refs/heads/main`` on both sides, which is why the bug shipped
  without a failing test. A handler-level mirror in
  ``webhooks_github_test.py`` covers the same path through the
  HMAC-validated webhook seam, and a parametrized round-trip test
  in ``dashboard_template_test.py`` exercises both ``refs/heads/``
  and ``refs/tags/`` from the org-admin PUT side, asserting that
  the response body shows the bare form and that a re-PUT with the
  bare form is idempotent.
- The validator unit test in ``client/tests/`` covers the five
  cases the issue's acceptance criterion enumerates: bare branch,
  ``refs/heads/<branch>``, ``refs/tags/<tag>``, a 40-char hex SHA
  (must pass through untouched), and the empty string (must still
  fail ``min_length``). Plus two extras: a ref with an embedded
  slash (``release/1.0`` and ``refs/tags/release/1.0``) to lock in
  that only the leading prefix is stripped, and a non-tracked ref
  type (``refs/pull/123/head``) to confirm the strip list is
  bounded to the two prefixes operators care about.
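
The shared helper and the before-mode validator can be sketched as follows (the helper name is real per the description; the model is trimmed to the one field for illustration):

```python
from pydantic import BaseModel, Field, field_validator

_REF_PREFIXES = ("refs/heads/", "refs/tags/")


def normalize_github_ref(ref: str) -> str:
    """Strip one leading refs/heads/ or refs/tags/ prefix, if present."""
    for prefix in _REF_PREFIXES:
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref


class DashboardTemplateBindingCreate(BaseModel):
    github_ref: str = Field(min_length=1, max_length=256)

    @field_validator("github_ref", mode="before")
    @classmethod
    def _normalize_ref(cls, v: object) -> object:
        # mode="before" runs ahead of the min_length check, so
        # "refs/heads/" alone normalizes to "" and then fails
        # min_length=1 with the standard pydantic message.
        return normalize_github_ref(v) if isinstance(v, str) else v
```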

Next-iteration notes:
- No data backfill is needed. This branch is unmerged and the only
  known production binding (jsc-test-20260409) was already in bare
  form; any future writes go through the new validator. If a
  prefixed row ever does sneak in via a pre-validator code path,
  it will start matching as soon as the webhook side normalizes
  the lookup ref, but the canonical-form invariant on the row
  itself would be off — a future task could add a one-shot
  migration that rewrites any stored ``refs/heads/...`` /
  ``refs/tags/...`` rows to bare form for tidiness.
- The ``DashboardGitHubTemplateBindingCreate`` storage-level
  dataclass intentionally does *not* normalize. Its only callers
  are the binding service (which forwards a validated
  ``DashboardTemplateBindingCreate``) and tests that bypass the
  API. Adding a normalize at this layer would be belt-and-
  suspenders, but it would also let test fixtures lie about what
  production actually stores. If a future direct-storage code path
  ever lands (e.g. a CLI seeder), normalizing in
  ``binding_store.create`` is the right place to add it — calling
  the same shared helper.
- The other two push-processor ``info`` log lines (``"Push event
  affected no binding root_paths"`` and ``"Enqueued
  dashboard_sync jobs from push event"``) still log
  ``github_ref=ref`` (the raw value). They fire after the lookup
  has already matched, so the raw value is enough; switching them
  to also include ``github_ref_raw`` would be log-line churn for
  no diagnostic gain.

Closes #283
@jonathansick jonathansick merged commit be3c5fd into main May 1, 2026
22 checks passed
@jonathansick jonathansick deleted the tickets/DM-54689-github-webhook branch May 1, 2026 19:01
jonathansick added a commit that referenced this pull request May 1, 2026
Capture GitHub's stable repo / owner / installation IDs on first
successful sync, write them to bindings + content rows, and have the
push event processor prefer ID-keyed binding lookup with a name-based
fallback for un-synced rows. Internal matching now survives a GitHub
repo or org rename without operator intervention; the public API
remains name-keyed and exposes the captured IDs in the binding GET
response for debugging.

Key decisions:
- ``InstallationAuth`` grew a required ``installation_id: int`` field
  rather than a separate ``get_installation_id`` round-trip from the
  syncer. The app client already resolves the installation id inside
  ``get_installation_auth`` and was discarding it; threading it onto
  the auth record is a no-cost change for production callers and lets
  the syncer write ``github_installation_id`` to the binding without
  doubling the GitHub round-trip count. The dataclass field is
  required (no default) so production code cannot accidentally
  construct an auth record without it; the test fixture's
  ``installation_auth`` helper sources it from the mock's seeded
  ``(owner, repo) -> id`` map (or ``default_installation_id``) so
  existing direct-construction tests keep their ergonomic shape.
- The tree fetcher gained a ``_fetch_repo`` step that calls
  ``GET /repos/{owner}/{repo}`` rather than overloading the existing
  ``commits/{ref}`` call, because the commits response does not
  include the repository's numeric ID and adding the field to its
  schema is GitHub's prerogative. Repo + owner IDs land on
  ``FetchedTree`` as required ``int`` fields rather than
  ``int | None`` because the syncer is the only consumer in this
  slice and it always wants both — leaving them ``Optional`` would
  have leaked ``None`` into the store-layer guard for no benefit.
  The ``raise_for_status`` chain catches a missing repo
  deterministically, so a real GitHub anomaly (e.g. installation lost
  mid-sync)
  surfaces as a recorded sync failure rather than a silent no-id
  capture.
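As a hedged sketch of the ``_fetch_repo`` step: call ``GET /repos/{owner}/{repo}`` and lift GitHub's numeric IDs onto a fetched-tree-style record as required ints. The client protocol (a gidgethub-like ``getitem``) and the ``RepoIds`` record are assumptions for illustration, not the project's real API.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class GitHubClient(Protocol):
    """Minimal gidgethub-style client surface assumed for this sketch."""

    async def getitem(self, url: str) -> dict[str, Any]: ...


@dataclass(frozen=True)
class RepoIds:
    # Required ints, never Optional: the syncer is the only consumer and
    # always wants both, so ``None`` never leaks into the store layer.
    repo_id: int
    owner_id: int


async def fetch_repo_ids(client: GitHubClient, owner: str, repo: str) -> RepoIds:
    # gidgethub-style clients raise on a 404, so a missing repo surfaces
    # deterministically instead of yielding a silent no-id capture.
    data = await client.getitem(f"/repos/{owner}/{repo}")
    return RepoIds(repo_id=data["id"], owner_id=data["owner"]["id"])
```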
- ``DashboardGitHubTemplateBindingStore`` exposes two new methods
  rather than re-purposing ``list_by_repo_ref``:
  ``list_by_repo_id_and_ref`` for the ID-keyed primary lookup
  (backed by ``idx_dashboard_github_template_bindings_repo_id_ref``
  from #240) and ``list_unsynced_by_repo_ref`` whose
  ``github_repo_id IS NULL`` filter encodes the rename-safety
  invariant directly. Without that filter, a binding whose name
  coincidentally matches a *different* repo now sitting at the old
  string would re-sync against the wrong upstream; pushing the
  filter into the store query (vs. dedup-after-the-fact) keeps the
  invariant load-bearing in code rather than only in tests.
  ``list_by_repo_ref`` stays in the API surface even though the
  push processor no longer calls it, because the binding store is
  a public storage primitive and external callers (admin tooling,
  future debug consoles) may legitimately want the unfiltered
  name-only view.
- ``PushEventProcessor._lookup_bindings`` unions the two store calls
  with a ``seen: set[int]`` dedup guard. The two queries are
  disjoint by construction (the unsynced filter excludes IDs the
  primary lookup might find), so the dedup is defense-in-depth — but
  the ``test_process_dedupes_id_and_name_matches`` test pins the
  guard so a future loosening of either query (e.g. dropping the
  ``IS NULL`` filter "for symmetry") cannot regress to
  double-enqueueing the same ``dashboard_sync`` job. ``_coerce_int``
  rejects
  ``bool`` explicitly because ``isinstance(True, int)`` is ``True``
  in Python and a malformed payload that sent ``"id": true`` would
  otherwise leak ``1`` through as a real repo id.
- The store-layer "only assign if non-None" guard on
  ``update_sync_state`` and ``upsert`` (already pinned by tests in
  #240) is what protects populated IDs from being overwritten on a
  failed re-sync. The syncer now always passes populated IDs on the
  success path, so the guard's job is to fence the *failure* path
  (where ``_record_failure`` calls ``update_sync_state`` without ID
  kwargs). The new
  ``test_sync_does_not_overwrite_populated_ids_on_resync`` test pins
  the contract end-to-end so a future syncer refactor that started
  passing ``None`` explicitly cannot zero a binding's IDs.
- ``GitHubMock.seed_tree`` now auto-seeds the
  ``GET /repos/{owner}/{repo}`` endpoint with default ``(repo_id=1,
  owner_id=1)`` so existing tree-fetcher tests that do not care
  about ID capture keep working with a single ``seed_tree(...)``
  call. Tests that round-trip captured IDs into the database pass
  ``repo_id=`` / ``owner_id=`` to ``seed_tree`` and the auto-call
  uses those instead. ``seed_repo`` is exposed separately for tests
  that bypass ``seed_tree`` (the syncer's GitHub-error fixture
  needs the repo endpoint mocked but the ref endpoint to return
  404, so it calls ``seed_repo`` then mocks ``commits/{ref}``
  directly).
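The fixture behavior described above can be sketched minimally; ``seed_tree`` and ``seed_repo`` are the names from the text, while the route-dict internals and response shapes are illustrative stand-ins for the real mock.

```python
class GitHubMock:
    """Illustrative sketch of the test mock's auto-seeding behavior."""

    def __init__(self) -> None:
        self.routes: dict[str, dict] = {}

    def seed_repo(
        self, owner: str, repo: str, *, repo_id: int = 1, owner_id: int = 1
    ) -> None:
        # Standalone repo-endpoint seeding, for tests that bypass seed_tree.
        self.routes[f"/repos/{owner}/{repo}"] = {
            "id": repo_id,
            "owner": {"id": owner_id},
        }

    def seed_tree(
        self,
        owner: str,
        repo: str,
        ref: str,
        tree: list[str],
        *,
        repo_id: int = 1,
        owner_id: int = 1,
    ) -> None:
        # Auto-seed the repo endpoint so tests that do not care about ID
        # capture keep working with a single seed_tree(...) call.
        self.seed_repo(owner, repo, repo_id=repo_id, owner_id=owner_id)
        self.routes[f"/repos/{owner}/{repo}/commits/{ref}"] = {"files": tree}
```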

Next-iteration notes:
- Slice #242 ("rename + installation webhooks") will consume the
  ``github_owner_id`` / ``github_repo_id`` columns this slice now
  populates. When a ``repository.renamed`` event arrives, the
  handler can match the binding by stable ID (the rewrite target)
  and update ``github_owner`` / ``github_repo`` strings without
  re-syncing content. This slice intentionally does not register
  any rename-event handlers — name-keyed pushes still match
  un-synced bindings and ID-keyed pushes still match synced ones,
  so the system is functional without rename handlers; they're a
  staleness-fix, not a correctness requirement.
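As a hedged sketch of what the slice #242 handler could look like (explicitly not part of this slice): match the binding by stable repo ID and rewrite only the name strings, with no content re-sync. The ``update_repo_names`` store call is hypothetical.

```python
from typing import Any


async def on_repository_renamed(payload: dict[str, Any], store: Any) -> None:
    # The stable numeric ID is the match key; the owner/repo strings are
    # just labels to refresh, so no dashboard_sync job is enqueued.
    repo = payload["repository"]
    await store.update_repo_names(
        github_repo_id=repo["id"],
        github_owner=repo["owner"]["login"],
        github_repo=repo["name"],
    )
```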
- The blocker merge of ``tickets/DM-54689-github-webhook`` (PR
  #264) into this branch is a no-op once #264 lands on main first.
  GitHub auto-merge will flatten the merge commit. If #264 is
  rebased instead of merged cleanly, this branch needs a manual
  re-merge.
- ``InstallationAuth.installation_id`` is now required, but the
  GitHub App secrets remain optional on ``Factory`` /
  ``WorkerFactoryBuilder``. The optionality is a feature-flag for
  the dashboard-template-from-github feature; once the feature
  graduates from optional, the three GitHub App fields can be
  tightened to non-Optional and the ``startup`` failure can be raised
  earlier — the ``InstallationAuth`` change has no bearing on that
  decision.
- The ``DashboardTemplateBinding`` client model now exposes the
  three numeric ID fields as informational. If a future client
  wants to assert on rename-robustness from the outside, it can
  observe ``github_repo_id`` flipping from ``null`` to populated
  after the first successful sync — no new endpoint needed.

Closes #241