feat(client+core): CLI side of direct bucket upload for Percy resources (PPLT-5303)#2223

Draft
Manoj-Katta wants to merge 6 commits into master from PPLT-5303/direct-bucket-upload-cli


What this changes

CLI side of PPLT-5303 — GCS network cost reduction. Implements the client end of the Direct Bucket Upload tech spec.

Today every Percy resource (HTML / CSS / JS / fonts / inline images) goes CLI → base64 JSON → percy-api → GCS. percy-api decodes, re-hashes, streams to GCS — pure byte brokerage on the hot path. This PR teaches the CLI to upload resources directly to GCS via a per-resource V4 signed URL minted by percy-api in the missing-resources response. Integrity is preserved at the GCS layer via Content-MD5 + Content-Length baked into the signed URL.

Rollout is Strategy A (locked by review comment on the tech spec): per-org LD flag, legacy POST /resources path stays live indefinitely as the per-resource fallback. Nothing in this PR breaks any existing customer — old API ignores the new header and the CLI uses the legacy path; new API + new CLI start using the direct path once the LD flag is on for the org.

Wire contract added

CLI → API on POST /builds/:id/snapshots:

  • Request header: X-Percy-Capabilities: direct-upload-v1
  • Each entry in relationships.resources.data[].attributes gains:
    • content-md5 — base64-encoded MD5 of the resource bytes (RFC 1864)
    • content-length — byte count
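As an illustration of the two new attributes, here is a minimal sketch using Node's built-in crypto module. The names md5base64 and integrityAttributes are used for illustration only and are not necessarily the helpers in this PR; the key point is that the digest is base64 (as the Content-MD5 header expects), not hex, and the length is counted in bytes, not characters.

```javascript
const { createHash } = require('node:crypto');

// Illustrative helper: base64-encoded MD5 digest (the Content-MD5 encoding).
function md5base64(content) {
  return createHash('md5').update(content).digest('base64');
}

// Illustrative helper: the two attributes declared per resource.
// Strings are converted to UTF-8 bytes first so the length is a byte count.
function integrityAttributes(content) {
  const bytes = Buffer.isBuffer(content) ? content : Buffer.from(content, 'utf8');
  return {
    'content-md5': md5base64(bytes),
    'content-length': bytes.length
  };
}

console.log(md5base64('hello world')); // XrY7u+Ae7tCTyyK7j1rNww==
```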

API → CLI in the missing-resources response, each entry may now carry:

  • signed-upload-url — V4-signed PUT URL
  • signed-upload-expires — ISO 8601 expiry timestamp

CLI → GCS when a signed URL is returned:

PUT <signed-upload-url>
Content-MD5:    <same base64 we declared>
Content-Length: <same length we declared>
Content-Type:   <resource mimetype>
Body:           <raw bytes — not base64, not JSON>
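The PUT above could be assembled roughly as follows. This is a sketch under assumptions: the resource shape ({ content, md5, contentLength, mimetype }) and the function name buildDirectUploadRequest are hypothetical, and the length guard is an extra safety check added here for illustration, not something the PR is known to do.

```javascript
// Hypothetical builder for the direct-upload PUT described above.
// The headers must repeat exactly what was declared at snapshot creation,
// because those values are bound into the signed URL.
function buildDirectUploadRequest(resource, signedUploadUrl) {
  // Illustrative guard: catch declared-vs-actual length drift before the PUT.
  if (resource.content.length !== resource.contentLength) {
    throw new Error('declared content-length does not match the buffer size');
  }
  return {
    url: signedUploadUrl,
    method: 'PUT',
    headers: {
      'Content-MD5': resource.md5,
      'Content-Length': String(resource.contentLength),
      'Content-Type': resource.mimetype
    },
    // Raw bytes: no base64 encoding and no JSON envelope.
    body: resource.content
  };
}
```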

Failure handling

Per-resource decision, not per-build:

  • Transport-class (network errors, 5xx after retries, 403 expired, 429) → resource falls back to the legacy POST /resources path silently. Build still succeeds.
  • Correctness-class (400 BadDigest, 412 Precondition Failed) → thrown. These indicate a real bug (wrong md5/length declared, or single-use URL replayed). Falling back here would mask the regression; fail-loud surfaces it.
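The two failure classes can be sketched as a small classifier. The error shape (error.response.statusCode) and both function names are assumptions made for this example; the real code may inspect the GCS error body (for example the BadDigest code) rather than the status alone.

```javascript
// Statuses treated as correctness failures in this sketch (400 BadDigest, 412).
const CORRECTNESS_STATUSES = new Set([400, 412]);

// Hypothetical classifier: everything that is not a correctness failure
// (network errors, 5xx after retries, 403 expired, 429) counts as transport.
function classifyDirectUploadFailure(error) {
  const status = error && error.response && error.response.statusCode;
  return CORRECTNESS_STATUSES.has(status) ? 'correctness' : 'transport';
}

// Hypothetical handler: rethrow correctness failures, otherwise hand the
// resource back so the caller can send it through legacy POST /resources.
function handleDirectUploadFailure(resource, error) {
  if (classifyDirectUploadFailure(error) === 'correctness') throw error;
  return resource;
}
```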

Customer escape hatch

PERCY_DISABLE_DIRECT_UPLOAD=1 suppresses the capability header and forces every resource through the legacy path. This lets a latency-sensitive customer self-serve back to the old behavior without waiting on an LD flag flip. Parsing is strict: only "1" or "true" (case-insensitive) enable the override; "0" and "false" leave direct upload on, whereas a naive !!process.env.X check would treat any non-empty string, including "0", as enabled.
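A minimal sketch of the strict parsing, assuming a helper along the lines of envBool. The whitespace trimming is an assumption of this example rather than confirmed behavior.

```javascript
// Sketch of strict boolean env parsing: only "1" or "true" (any case) enable.
function envBool(value) {
  if (typeof value !== 'string') return false; // unset variables are undefined
  const normalized = value.trim().toLowerCase(); // trim() is an assumption here
  return normalized === '1' || normalized === 'true';
}

// Contrast with the naive check, which enables on any non-empty string.
console.log(envBool('0'), !!'0'); // false true
```

Usage would look like envBool(process.env.PERCY_DISABLE_DIRECT_UPLOAD) wherever the capability header is decided.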

Commits (6, in a clean bottom-up order)

| # | SHA | What |
|---|-----|------|
| 1 | b09794a | createResource attaches md5 + contentLength to every resource (incl. PERCY_GZIP lockstep) |
| 2 | ebaeeb7 | request() accepts rawBody: true to send raw Buffer bodies |
| 3 | c36c538 | createSnapshot sends capability header + new attrs |
| 4 | 3fc4608 | uploadResourceDirect / uploadResourcesDirect primitives + failure classification |
| 5 | 821c402 | sendSnapshot partitions missing-resources between direct + legacy + strict env parsing |
| 6 | 861047b | End-to-end integration test in @percy/core |

Compatibility

| Scenario | Result |
|----------|--------|
| New CLI + new API, LD flag on | Direct upload happens |
| New CLI + new API, LD flag off | API returns no signed URLs → CLI uses legacy. Transparent. |
| New CLI + old API | API ignores header, returns no signed URLs → CLI uses legacy. Transparent. |
| Old CLI + new API | No capability header → API doesn't mint URLs → legacy. Transparent. |
| New CLI + PERCY_DISABLE_DIRECT_UPLOAD=1 | Capability header suppressed → API treats as old CLI → legacy. |
| New CLI + GCS unreachable for customer | Each PUT fails transport-class → falls back per-resource. Build succeeds. |

What's intentionally NOT in this PR

  • createBuild parity for declared resources. The static-image upload command uses createBuild declared resources, and symmetry suggests it should also carry content-md5/content-length. Held back until the percy-api side confirms it accepts those attrs on POST /builds — gated on O1 in the plan. Easy follow-up commit once confirmed.
  • Docs. PERCY_DISABLE_DIRECT_UPLOAD doesn't appear in any README yet. Punted to a follow-up PR so the doc can describe the end-to-end behavior with the API side landed.

Open items (need API-side confirmation, not coding-blockers)

  • O1: POST /builds accepts content-md5 + content-length on declared resources (re-enables createBuild parity).
  • O2: Wire-attr naming. Kebab-case (content-md5, content-length) as used here, or camelCase? Will mirror whatever the API side ships.
  • O3: Capability string. direct-upload-v1 is the working name. Confirm with API.

Test plan

  • Unit tests: md5base64 (3), createResource & factories (6), request() rawBody (3), createSnapshot header + attrs (3 + 2 strict-parsing), uploadResourceDirect / uploadResourcesDirect (11), sendSnapshot partition + fallback (5)
  • End-to-end in @percy/core with mocked GCS endpoint: happy path, transport failure → fallback, PERCY_DISABLE_DIRECT_UPLOAD forces legacy, Content-MD5 integrity matches declared md5 (4 tests)
  • Existing regressions: full @percy/client suite (283/283 minus 1 pre-existing PAC proxy flake); existing createSnapshot / sendSnapshot tests updated to assert the new attrs; legacy POST /resources path still exercised by every test that doesn't return signed URLs.
  • Staging end-to-end against real percy-api once the matching API PR lands (cross-repo, tracked separately).
  • Manual smoke on a real production build with PERCY_DISABLE_DIRECT_UPLOAD=1 to confirm legacy still works for customers on the escape hatch.

Plan doc

Full implementation plan (both CLI and API sides, plus rollout/observability/rollback) is at docs/plans/2026-05-11-003-pplt5303-direct-bucket-upload-cli.md. That doc is gitignored — local reference only.

🤖 Generated with Claude Code

Manoj-Katta and others added 6 commits May 11, 2026 21:13
…ation (PPLT-5303)

Prep for direct bucket upload: percy-api needs to mint Content-MD5 /
Content-Length-bound signed GCS URLs on snapshot creation, so the CLI
must declare both alongside the existing sha-256.

Computed once at createResource() time so every resource factory
(createResource, createRootResource, createPercyCSSResource,
createLogResource) gets the fields for free. PERCY_GZIP path in
discovery.js keeps md5 + contentLength in lockstep with sha when it
mutates the content buffer, so direct upload composes with gzip.
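To illustrate the lockstep requirement, here is a rough sketch of what recomputing the digests after gzip could look like. The function name gzipResource and the resource field names are assumptions for this example; it is not the actual discovery.js code.

```javascript
const { createHash } = require('node:crypto');
const { gzipSync } = require('node:zlib');

// Hypothetical sketch: once the content buffer is swapped for its gzipped
// form, sha, md5 and contentLength must all describe the new bytes, otherwise
// the signed URL would be bound to values the PUT body can never match.
function gzipResource(resource) {
  const content = gzipSync(resource.content);
  return {
    ...resource,
    content,
    sha: createHash('sha256').update(content).digest('hex'),
    md5: createHash('md5').update(content).digest('base64'),
    contentLength: content.length
  };
}
```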

Wire shape is not changed yet; these fields are read by Step 3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…atim (PPLT-5303)

Direct bucket upload needs to PUT raw bytes to a GCS signed URL with
caller-supplied Content-MD5 / Content-Length / Content-Type. The shared
request() helper currently JSON-stringifies any non-string body and
forces Content-Type: application/json — that wraps the bytes and breaks
GCS's Content-MD5 check.

Add an explicit rawBody flag. When set, body and headers pass through
unchanged. Retries, proxy handling, error parsing are unaffected.
Default behavior is unchanged for every existing caller.
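The behavior described above can be sketched as a standalone function. prepareRequestBody is a hypothetical name used for illustration; the real request() helper does considerably more (retries, proxies, error parsing).

```javascript
// Hypothetical sketch of the body/header handling with and without rawBody.
function prepareRequestBody(body, headers = {}, { rawBody = false } = {}) {
  // rawBody: pass the bytes and the caller's headers through untouched.
  if (rawBody) return { body, headers: { ...headers } };

  // Default: JSON-stringify non-string bodies and force the JSON content type.
  if (body !== undefined && typeof body !== 'string') {
    return {
      body: JSON.stringify(body),
      headers: { ...headers, 'Content-Type': 'application/json' }
    };
  }
  return { body, headers: { ...headers } };
}
```

Without the flag, a Buffer would be serialized as a JSON object describing its bytes, so the bytes on the wire would no longer hash to the declared Content-MD5.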

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-md5/length on snapshot creation (PPLT-5303)

Adds the wire contract for direct bucket upload on the snapshot side:

- X-Percy-Capabilities: direct-upload-v1 header on POST /builds/:id/snapshots.
  Old percy-api ignores it. New percy-api uses it as one of the eligibility
  checks before minting Content-MD5-bound GCS signed URLs.

- content-md5 and content-length attrs alongside the existing resource-url /
  is-root / for-widths / mimetype. Prefers caller-supplied r.md5 / r.contentLength
  (set by createResource in Step 1); falls back to computing on the fly when
  resources arrive without them (e.g. filepath-loaded). Null when content is
  absent (sha-only resources), which the server treats as ineligible and
  routes through legacy.

- PERCY_DISABLE_DIRECT_UPLOAD env var suppresses the capability header for
  customers who hit latency issues and want to self-serve back to the legacy
  POST /resources path without waiting on a server-side flag flip.
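The precedence for the two new attrs (caller-supplied, then computed, then null) might look like the sketch below. The function name directUploadAttributes and the resource field names are assumptions based on the description above.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical sketch: prefer r.md5 / r.contentLength, compute them when only
// content is present, and emit null for sha-only resources.
function directUploadAttributes(r) {
  const bytes = r.content != null ? Buffer.from(r.content) : null;
  const computedMd5 = bytes ? createHash('md5').update(bytes).digest('base64') : null;
  return {
    'content-md5': r.md5 ?? computedMd5,
    'content-length': r.contentLength ?? (bytes ? bytes.length : null)
  };
}
```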

createBuild parity is intentionally not touched until the percy-api side
confirms it accepts the new attrs (D3 in the plan).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… for GCS PUT (PPLT-5303)

PUTs a resource's raw bytes to its percy-api-issued GCS signed URL using
request() with rawBody: true. Sends Content-MD5, Content-Length, and
Content-Type headers matching what percy-api baked into the signature.

Failure handling distinguishes two classes:

- Transport (network errors, 5xx after retries, 403 expired, 429
  rate-limited): the resource is returned to the caller so it can be
  routed through the legacy POST /resources path. Build still succeeds.

- Correctness (400 BadDigest, 412 Precondition Failed): thrown. These
  indicate a real bug — either the CLI computed md5/length wrong, or a
  single-use signed URL is being replayed. Silently falling back here
  would mask the regression; the customer's build fails loud instead.

uploadResourcesDirect is a pool-based parallel orchestrator sharing
PERCY_RESOURCE_UPLOAD_CONCURRENCY with the legacy uploader. Returns the
list of transport-failed resources for the caller to retry via legacy.
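A pool-based orchestrator of this kind can be sketched in a few lines. This is an illustration under assumptions, not the PR's implementation: uploadWithPool, the uploadOne callback, and the error.correctness flag are all hypothetical names.

```javascript
// Hypothetical sketch: run uploadOne over resources with bounded concurrency
// and collect transport failures for the legacy fallback.
async function uploadWithPool(resources, uploadOne, concurrency = 2) {
  const fallback = [];
  let next = 0;

  // Each worker pulls the next unclaimed resource until the list is drained.
  async function worker() {
    while (next < resources.length) {
      const resource = resources[next++];
      try {
        await uploadOne(resource);
      } catch (error) {
        // Correctness failures propagate; in this sketch, uploads already
        // in flight on other workers are not cancelled.
        if (error.correctness) throw error;
        fallback.push(resource);
      }
    }
  }

  const size = Math.min(Math.max(1, concurrency), resources.length);
  await Promise.all(Array.from({ length: size }, () => worker()));
  return fallback;
}
```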

Not yet wired into sendSnapshot — Step 5 does that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…t upload and legacy fallback (PPLT-5303)

When percy-api advertises direct upload for a build, missing-resources
entries may carry signed-upload-url + signed-upload-expires attrs. The
new flow:

- per-resource partition: signed URL present → direct PUT to GCS via
  uploadResourcesDirect (Step 4); absent → legacy POST /resources
- transport failures returned from the direct path merge back into the
  legacy bucket, so a partial-failure scenario degrades gracefully and
  the build still finalizes
- correctness failures (400 BadDigest / 412) thrown — these are real
  bugs, not transport noise to mask
- PERCY_DISABLE_DIRECT_UPLOAD forces every resource through legacy,
  giving customers a self-serve escape hatch without needing the
  server-side LD flag flipped
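The per-resource partition could be sketched as below, assuming missing-resources entries are JSON:API-style objects with an attributes map. partitionMissingResources and the disableDirect option are hypothetical names for this example.

```javascript
// Hypothetical sketch: split missing-resources entries by whether the API
// returned a signed URL, honoring the customer escape hatch.
function partitionMissingResources(missing, { disableDirect = false } = {}) {
  const direct = [];
  const legacy = [];
  for (const entry of missing) {
    const signedUrl = entry.attributes && entry.attributes['signed-upload-url'];
    if (signedUrl && !disableDirect) direct.push(entry);
    else legacy.push(entry);
  }
  return { direct, legacy };
}
```

Transport failures returned from the direct path would then be appended to the legacy list before the legacy upload runs.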

Also: replaced `!!process.env.X` truthy check with a strict envBool()
helper that only accepts "1" or "true" (case-insensitive). Without
this, PERCY_DISABLE_DIRECT_UPLOAD=0 would silently disable direct
upload — surprising and wrong.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-5303)

Stands up a fake GCS endpoint in-test via mockRequests('storage.googleapis.com'),
mocks the percy-api snapshot-creation response to embed signed-upload-url in
missing-resources attributes, and drives a real percy.snapshot(...) end-to-end.

Four scenarios covered:

1. Happy path — every missing resource PUT direct to GCS with valid
   Content-MD5/Length headers; legacy POST /resources never touched;
   snapshot finalizes.

2. Transport-class failure — GCS returns 403 on every PUT; every resource
   falls back through legacy and the build still succeeds.

3. Customer escape hatch — PERCY_DISABLE_DIRECT_UPLOAD=1 suppresses the
   capability header and routes everything through legacy even when the
   server returns signed URLs.

4. Integrity — the Content-MD5 each PUT sends matches the content-md5 the
   CLI declared on snapshot creation for that same sha. Guards against
   buffer-mutation drift that would otherwise produce 400 BadDigest in prod.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>