feat(client+core): CLI side of direct bucket upload for Percy resources (PPLT-5303) #2223
Draft
Manoj-Katta wants to merge 6 commits into
…ation (PPLT-5303)
Prep for direct bucket upload: percy-api needs to mint Content-MD5 / Content-Length-bound signed GCS URLs on snapshot creation, so the CLI must declare both alongside the existing sha-256.
Computed once at createResource() time so every resource factory (createResource, createRootResource, createPercyCSSResource, createLogResource) gets the fields for free. The PERCY_GZIP path in discovery.js keeps md5 + contentLength in lockstep with sha when it mutates the content buffer, so direct upload composes with gzip.
Wire shape is not changed yet; these fields are read by Step 3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
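
A minimal sketch of deriving the three integrity fields from a single content buffer, assuming Node's built-in `crypto` module. `describeContent` is a hypothetical name used only for illustration; it is not the PR's actual `createResource` code.

```javascript
const crypto = require('crypto');
const zlib = require('zlib');

// Hypothetical helper: compute sha, md5 and contentLength from the same bytes.
function describeContent(content) {
  // Normalize strings to a Buffer so the length is a byte count, not a character count.
  const buf = Buffer.isBuffer(content) ? content : Buffer.from(content, 'utf8');
  return {
    sha: crypto.createHash('sha256').update(buf).digest('hex'),
    md5: crypto.createHash('md5').update(buf).digest('base64'), // base64, as Content-MD5 expects
    contentLength: buf.length
  };
}

// If the buffer is rewritten later (for example gzipped), recompute all three
// together so sha, md5 and contentLength never describe different bytes.
const html = '<html>héllo</html>';
const original = describeContent(html);
const gzipped = describeContent(zlib.gzipSync(Buffer.from(html, 'utf8')));
console.log(original, gzipped);
```

Recomputing every field from the mutated buffer in one call is one simple way to keep them in lockstep; the PR may do this differently.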
…atim (PPLT-5303)
Direct bucket upload needs to PUT raw bytes to a GCS signed URL with caller-supplied Content-MD5 / Content-Length / Content-Type. The shared request() helper currently JSON-stringifies any non-string body and forces Content-Type: application/json, which wraps the bytes and breaks GCS's Content-MD5 check.
Add an explicit rawBody flag. When set, body and headers pass through unchanged. Retries, proxy handling, and error parsing are unaffected. Default behavior is unchanged for every existing caller.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
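
The body/header decision described above can be sketched in isolation. `prepareBody` is a hypothetical helper invented for this example; the real `request()` helper also handles retries, proxies, and error parsing, none of which is modeled here.

```javascript
// Hypothetical helper: decide how a request body and its headers are sent.
function prepareBody(body, headers = {}, { rawBody = false } = {}) {
  if (rawBody) {
    // Raw mode: pass the bytes and the caller-supplied headers through untouched.
    return { body, headers: { ...headers } };
  }
  if (body !== undefined && typeof body !== 'string') {
    // Default mode as described in the commit message: JSON-encode any
    // non-string body and force the JSON content type.
    return {
      body: JSON.stringify(body),
      headers: { ...headers, 'Content-Type': 'application/json' }
    };
  }
  return { body, headers: { ...headers } };
}

const bytes = Buffer.from('body { color: red }');
const raw = prepareBody(bytes, { 'Content-Type': 'text/css' }, { rawBody: true });
const wrapped = prepareBody(bytes, { 'Content-Type': 'text/css' });
console.log(Buffer.isBuffer(raw.body), typeof wrapped.body);
```

Without the flag, the Buffer is serialized to a JSON string, so the bytes on the wire no longer match the MD5 declared for the original content.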
…-md5/length on snapshot creation (PPLT-5303)
Adds the wire contract for direct bucket upload on the snapshot side:
- X-Percy-Capabilities: direct-upload-v1 header on POST /builds/:id/snapshots. Old percy-api ignores it. New percy-api uses it as one of the eligibility checks before minting Content-MD5-bound GCS signed URLs.
- content-md5 and content-length attrs alongside the existing resource-url / is-root / for-widths / mimetype. Prefers caller-supplied r.md5 / r.contentLength (set by createResource in Step 1); falls back to computing on the fly when resources arrive without them (e.g. filepath-loaded). Null when content is absent (sha-only resources), which the server treats as ineligible and routes through legacy.
- PERCY_DISABLE_DIRECT_UPLOAD env var suppresses the capability header for customers who hit latency issues and want to self-serve back to the legacy POST /resources path without waiting on a server-side flag flip.
createBuild parity is intentionally not touched until the percy-api side confirms it accepts the new attrs (D3 in the plan).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
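
A rough sketch of the "prefer supplied, else compute, else null" rule for the two new attributes. `integrityAttributes` and `capabilityHeaders` are hypothetical names, and this `md5base64` body is an assumed implementation rather than the PR's own.

```javascript
const crypto = require('crypto');

// Assumed implementation: base64-encoded MD5 of a buffer.
const md5base64 = buf => crypto.createHash('md5').update(buf).digest('base64');

// Hypothetical mapper producing the two new attributes for one resource.
function integrityAttributes(resource) {
  const { content } = resource;
  const buf = content == null
    ? null
    : Buffer.isBuffer(content) ? content : Buffer.from(content);
  return {
    // `??` rather than `||` so a legitimate contentLength of 0 is preserved.
    'content-md5': resource.md5 ?? (buf ? md5base64(buf) : null),
    'content-length': resource.contentLength ?? (buf ? buf.length : null)
  };
}

// Hypothetical header builder: omit the capability when direct upload is disabled.
function capabilityHeaders(directUploadDisabled) {
  return directUploadDisabled ? {} : { 'X-Percy-Capabilities': 'direct-upload-v1' };
}

console.log(integrityAttributes({ sha: 'abc' }));           // sha-only: both null
console.log(integrityAttributes({ content: 'a { b: c }' })); // computed on the fly
console.log(capabilityHeaders(false));
```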
… for GCS PUT (PPLT-5303)
PUTs a resource's raw bytes to its percy-api-issued GCS signed URL using request() with rawBody: true. Sends Content-MD5, Content-Length, and Content-Type headers matching what percy-api baked into the signature.
Failure handling distinguishes two classes:
- Transport (network errors, 5xx after retries, 403 expired, 429 rate-limited): the resource is returned to the caller so it can be routed through the legacy POST /resources path. Build still succeeds.
- Correctness (400 BadDigest, 412 Precondition Failed): thrown. These indicate a real bug: either the CLI computed md5/length wrong, or a single-use signed URL is being replayed. Silently falling back here would mask the regression; the customer's build fails loud instead.
uploadResourcesDirect is a pool-based parallel orchestrator sharing PERCY_RESOURCE_UPLOAD_CONCURRENCY with the legacy uploader. Returns the list of transport-failed resources for the caller to retry via legacy. Not yet wired into sendSnapshot; Step 5 does that.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
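
The two failure classes and the pool-style orchestration might be sketched as follows. Everything here is illustrative: `classifyFailure`, `uploadAllDirect`, and the injected `putResource` function are hypothetical names, and treating every status other than 400/412 as transport-class is an assumption of this sketch, not something the commit message states.

```javascript
// Assumption: only 400 and 412 are correctness failures; anything else
// (network error with no status, 403, 429, 5xx) is treated as transport.
function classifyFailure(status) {
  return status === 400 || status === 412 ? 'correctness' : 'transport';
}

// `putResource(resource)` is a hypothetical injected function that resolves on
// success and rejects with an error carrying an optional numeric `status`.
async function uploadAllDirect(resources, putResource, concurrency = 2) {
  const fallback = []; // transport-failed resources for the caller to retry via legacy
  const queue = [...resources];

  async function worker() {
    while (queue.length) {
      const resource = queue.shift();
      try {
        await putResource(resource);
      } catch (error) {
        // Correctness failures propagate and fail the whole upload.
        if (classifyFailure(error.status) === 'correctness') throw error;
        fallback.push(resource);
      }
    }
  }

  // Start up to `concurrency` workers that drain the shared queue.
  const size = Math.min(concurrency, resources.length);
  await Promise.all(Array.from({ length: size }, () => worker()));
  return fallback;
}

// Usage with a fake PUT that fails according to a `fail` field on the resource.
const fakePut = async r => {
  if (r.fail) throw Object.assign(new Error('PUT failed'), { status: r.fail });
};
uploadAllDirect([{ sha: 'a' }, { sha: 'b', fail: 403 }, { sha: 'c', fail: 503 }], fakePut)
  .then(fallback => console.log('retry via legacy:', fallback.map(r => r.sha)));
```

In this sketch a correctness failure rejects the returned promise immediately, while workers already in flight keep running in the background; a production version would likely want to stop them.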
…t upload and legacy fallback (PPLT-5303)
When percy-api advertises direct upload for a build, missing-resources entries may carry signed-upload-url + signed-upload-expires attrs. The new flow:
- per-resource partition: signed URL present → direct PUT to GCS via uploadResourcesDirect (Step 4); absent → legacy POST /resources
- transport failures returned from the direct path merge back into the legacy bucket, so a partial-failure scenario degrades gracefully and the build still finalizes
- correctness failures (400 BadDigest / 412) are thrown: these are real bugs, not transport noise to mask
- PERCY_DISABLE_DIRECT_UPLOAD forces every resource through legacy, giving customers a self-serve escape hatch without needing the server-side LD flag flipped
Also: replaced the `!!process.env.X` truthy check with a strict envBool() helper that only accepts "1" or "true" (case-insensitive). Without this, PERCY_DISABLE_DIRECT_UPLOAD=0 would silently disable direct upload, which is surprising and wrong.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
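
A sketch of the strict parsing and the per-resource partition, under stated assumptions. The `envBool` body below is a guess at the behavior described, `partitionMissing` is a hypothetical name, and the entry shape (`entry.attributes['signed-upload-url']`) is assumed from the attribute names in the commit message.

```javascript
// Strict boolean parsing: only "1" or "true" (any case) count as enabled.
function envBool(value) {
  return typeof value === 'string' && /^(1|true)$/i.test(value);
}

// Hypothetical partition of missing-resources entries into direct and legacy buckets.
function partitionMissing(missing, directDisabled) {
  const direct = [];
  const legacy = [];
  for (const entry of missing) {
    const url = entry.attributes && entry.attributes['signed-upload-url'];
    // A signed URL routes the entry to direct upload unless the escape hatch is on.
    (url && !directDisabled ? direct : legacy).push(entry);
  }
  return { direct, legacy };
}

const missing = [
  { id: 'sha-1', attributes: { 'signed-upload-url': 'https://example.invalid/put-1' } },
  { id: 'sha-2', attributes: {} }
];
console.log(envBool('0'), !!'0'); // strict parse vs naive truthiness
console.log(partitionMissing(missing, envBool(process.env.PERCY_DISABLE_DIRECT_UPLOAD)));
```

The naive `!!'0'` is `true` because any non-empty string is truthy, which is exactly the surprise the strict helper avoids.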
…-5303)
Stands up a fake GCS endpoint in-test via mockRequests('storage.googleapis.com'),
mocks the percy-api snapshot-creation response to embed signed-upload-url in
missing-resources attributes, and drives a real percy.snapshot(...) end-to-end.
Four scenarios covered:
1. Happy path — every missing resource PUT direct to GCS with valid
Content-MD5/Length headers; legacy POST /resources never touched;
snapshot finalizes.
2. Transport-class failure — GCS returns 403 on every PUT; every resource
falls back through legacy and the build still succeeds.
3. Customer escape hatch — PERCY_DISABLE_DIRECT_UPLOAD=1 suppresses the
capability header and routes everything through legacy even when the
server returns signed URLs.
4. Integrity — the Content-MD5 each PUT sends matches the content-md5 the
CLI declared on snapshot creation for that same sha. Guards against
buffer-mutation drift that would otherwise produce 400 BadDigest in prod.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What this changes
CLI side of PPLT-5303 — GCS network cost reduction. Implements the client end of the Direct Bucket Upload tech spec.
Today every Percy resource (HTML / CSS / JS / fonts / inline images) goes CLI → base64 JSON → percy-api → GCS. percy-api decodes, re-hashes, and streams to GCS, which is pure byte brokerage on the hot path. This PR teaches the CLI to upload resources directly to GCS via a per-resource V4 signed URL minted by percy-api in the `missing-resources` response. Integrity is preserved at the GCS layer via `Content-MD5` + `Content-Length` baked into the signed URL.
Rollout is Strategy A (locked by review comment on the tech spec): per-org LD flag, with the legacy `POST /resources` path staying live indefinitely as the per-resource fallback. Nothing in this PR breaks any existing customer: an old API ignores the new header and the CLI uses the legacy path; new API + new CLI start using the direct path once the LD flag is on for the org.
Wire contract added
CLI → API on `POST /builds/:id/snapshots`:
- `X-Percy-Capabilities: direct-upload-v1` header
- `relationships.resources.data[].attributes` gains:
  - `content-md5`: base64-encoded MD5 of the resource bytes (RFC 1864)
  - `content-length`: byte count
API → CLI in the `missing-resources` response, each entry may now carry:
- `signed-upload-url`: V4-signed PUT URL
- `signed-upload-expires`: ISO 8601 expiry timestamp
CLI → GCS when a signed URL is returned: a PUT of the raw resource bytes with `Content-MD5`, `Content-Length`, and `Content-Type` headers matching what percy-api baked into the signature.
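
As an illustration only, the CLI → GCS request could be assembled like this. `buildSignedPut` is a hypothetical helper, the URL is a placeholder, and using the resource's `mimetype` as the `Content-Type` value is an assumption of this sketch.

```javascript
const crypto = require('crypto');

// Hypothetical helper: describe the PUT for one resource and its signed URL.
function buildSignedPut(resource, signedUrl) {
  return {
    url: signedUrl,
    method: 'PUT',
    headers: {
      'Content-MD5': resource.md5,                       // base64 MD5 declared at snapshot creation
      'Content-Length': String(resource.contentLength),  // byte count of the body
      'Content-Type': resource.mimetype                  // assumed to match the signed content type
    },
    body: resource.content // raw bytes, not base64 and not JSON-wrapped
  };
}

const css = Buffer.from('body { color: red }');
const resource = {
  content: css,
  mimetype: 'text/css',
  md5: crypto.createHash('md5').update(css).digest('base64'),
  contentLength: css.length
};
// Placeholder URL; a real one would come from the missing-resources response.
const req = buildSignedPut(resource, 'https://storage.googleapis.com/example-bucket/example-object');
console.log(req.method, req.headers);
```

Because the MD5 and length are bound into the signature, the headers must be byte-for-byte consistent with the body that is actually sent.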
Failure handling
Per-resource decision, not per-build:
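- Transport failures (network errors, 5xx after retries, `403` expired, `429` rate-limited) → the resource falls back to the legacy `POST /resources` path silently. Build still succeeds.
- Correctness failures (`400 BadDigest`, `412 Precondition Failed`) → thrown. These indicate a real bug (wrong md5/length declared, or single-use URL replayed). Falling back here would mask the regression; fail-loud surfaces it.
Customer escape hatch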
`PERCY_DISABLE_DIRECT_UPLOAD=1` suppresses the capability header and forces every resource through the legacy path. Lets a latency-sensitive customer self-serve back to the old behavior without waiting on an LD flag flip. Parsing is strict: only `"1"` or `"true"` (case-insensitive) count as enabled; `"0"` / `"false"` do NOT silently enable it the way a naive `!!process.env.X` would.
Commits (6, in a clean bottom-up order)
1. `createResource` attaches `md5` + `contentLength` to every resource (incl. PERCY_GZIP lockstep)
2. `request()` accepts `rawBody: true` to send raw `Buffer` bodies
3. `createSnapshot` sends capability header + new attrs
4. `uploadResourceDirect` / `uploadResourcesDirect` primitives + failure classification
5. `sendSnapshot` partitions missing-resources between direct + legacy + strict env parsing
6. End-to-end tests in `@percy/core` against a mocked GCS endpoint
Compatibility
`PERCY_DISABLE_DIRECT_UPLOAD=1`
What's intentionally NOT in this PR
- `createBuild` parity for declared resources. The static-image `upload` command uses `createBuild` declared resources, and symmetry suggests it should also carry `content-md5` / `content-length`. Held back until the percy-api side confirms it accepts those attrs on `POST /builds` (gated on O1 in the plan). Easy follow-up commit once confirmed.
- `PERCY_DISABLE_DIRECT_UPLOAD` doesn't appear in any README yet. Punted to a follow-up PR so the doc can describe the end-to-end behavior with the API side landed.
Open items (need API-side confirmation, not coding-blockers)
- `POST /builds` accepts `content-md5` + `content-length` on declared resources (re-enables `createBuild` parity).
- Attribute names (`content-md5`, `content-length`) as used here, or camelCase? Will mirror whatever the API side ships.
- `direct-upload-v1` is the working name. Confirm with API.
Test plan
- `md5base64` (3), `createResource` & factories (6), `request()` `rawBody` (3), `createSnapshot` header + attrs (3 + 2 strict-parsing), `uploadResourceDirect` / `uploadResourcesDirect` (11), `sendSnapshot` partition + fallback (5)
- `@percy/core` with mocked GCS endpoint: happy path, transport failure → fallback, `PERCY_DISABLE_DIRECT_UPLOAD` forces legacy, Content-MD5 integrity matches declared md5 (4 tests)
- `@percy/client` suite (283/283 minus 1 pre-existing PAC proxy flake); existing `createSnapshot` / `sendSnapshot` tests updated to assert the new attrs; legacy `POST /resources` path still exercised by every test that doesn't return signed URLs.
- `PERCY_DISABLE_DIRECT_UPLOAD=1` to confirm legacy still works for customers on the escape hatch.
Plan doc
Full implementation plan (both CLI and API sides, plus rollout/observability/rollback) is at `docs/plans/2026-05-11-003-pplt5303-direct-bucket-upload-cli.md`. That doc is gitignored, so it is a local reference only.
🤖 Generated with Claude Code