chore: serialise tests builds on arm64-linux-16 #6089

Merged
ggreif merged 1 commit into master from gabor/serialize-arm64-tests
May 6, 2026

ggreif commented May 6, 2026

Summary

Extends the same max_jobs=1 workaround used by gc-tests in #6009 to the tests matrix on arm64-linux-16. ubuntu-24-large keeps max-jobs=auto.

Why

tests (arm64-linux-16, release) and tests (arm64-linux-16, debug) have been OOM-killed by the runner whenever nix-build-uncached --max-jobs auto schedules three parallel ~3 GB derivation builds. Most recent observation: https://github.com/caffeinelabs/motoko/actions/runs/25409615273/job/74528169567 (initial CI on the pocket-ic bump PR #6088 — failure was unrelated to the bump itself).

Why max-jobs=1 (not 2)

We probed a higher ceiling but it's not stable enough to ship:

The result depends on which derivations happen to be cachix-cached vs cold-rebuilt at any given moment. An asymmetric gc-tests=1, tests=2 configuration would also be brittle: a future test addition could shift the memory profile of release-systems-go and silently reintroduce OOMs. Going uniform max-jobs=1 is slower but stable.

Test plan

  • CI green
  • No tests (arm64-linux-16, *) OOM on the resulting run

🤖 Generated with Claude Code

Same fix as PR #6009 applied to `gc-tests`. The `tests` matrix on
`arm64-linux-16` runs `nix-build-uncached --max-jobs auto`, which
schedules three parallel ~3 GB derivation builds and OOMs the runner
("Killed" → "The runner has received a shutdown signal").

Use the matrix `include:` pattern to override `max_jobs` per-OS and
plumb it through to `test-blueprint` like `gc-tests` already does.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
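
A minimal sketch of the matrix `include:` pattern the commit message describes, for illustration only. The job layout, the `build_type` axis name, the `test-blueprint.yml` path, and the input names are assumptions and are not taken from the actual workflow; only the runner labels, the `max_jobs` key, and the values `auto` and `1` come from this PR.

```yaml
# Hypothetical caller workflow fragment (not the real motoko workflow).
jobs:
  tests:
    strategy:
      matrix:
        os: [ubuntu-24-large, arm64-linux-16]
        build_type: [release, debug]        # assumed axis name
        include:
          # Each entry is merged into every combination whose `os` matches,
          # adding `max_jobs` without creating new combinations.
          - os: ubuntu-24-large
            max_jobs: 'auto'                # large runner keeps full parallelism
          - os: arm64-linux-16
            max_jobs: '1'                   # serialise builds to avoid the OOM kill
    # Assumed: test-blueprint is a reusable workflow with string inputs.
    uses: ./.github/workflows/test-blueprint.yml
    with:
      os: ${{ matrix.os }}
      build_type: ${{ matrix.build_type }}
      max_jobs: ${{ matrix.max_jobs }}
```

In this sketch the blueprint is assumed to forward `max_jobs` to the `--max-jobs` build flag of `nix-build-uncached`. Quoting the values keeps them strings, so `auto` and `1` travel through the same input.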
ggreif requested a review from a team as a code owner May 6, 2026 00:31
ggreif self-assigned this May 6, 2026

github-actions Bot commented May 6, 2026

Comparing from 53943c3 to 677ddc3:
The produced WebAssembly code seems to be completely unchanged.
In terms of gas, no changes are observed in 5 tests.
In terms of size, no changes are observed in 5 tests.

ggreif commented May 6, 2026

Bumped from max-jobs=1 to max-jobs=2 based on the experiment in PR #6090.

PR #6090 (cache-busted run_test.rs) ran the full tests matrix on arm64-linux-16 with max-jobs=2 against a cold rebuild of test-runner and every downstream test-* derivation. Every job passed:

| Job | Duration |
| --- | --- |
| tests (arm64-linux-16, release) | 18m26s |
| tests (arm64-linux-16, debug) | 18m15s |
| gc-tests (arm64-linux-16, 2) | 6m14s |
| common-tests (arm64-linux-16) (max-jobs=auto) | 18m16s |

So 2 parallel ~3 GB builds fit in 16 GB. Going to 3 was the original failure mode; 2 is the right ceiling.

ggreif changed the title from "ci: serialise tests builds on arm64-linux-16" to "ci: throttle tests builds on arm64-linux-16 to max-jobs=2" May 6, 2026

ggreif commented May 6, 2026

Update: gc-tests can't go to 2

PR #6088 with a fresh cache-buster (lib.rs instead of run_test.rs) forced a real cold rebuild of the GC-variant derivations and hit OOM:

building 'test-drun-compacting-gc.drv'...
building 'test-drun-generational-gc.drv'...
Killed   nix-build-uncached -build-flags "--max-jobs 2"

Run: https://github.com/caffeinelabs/motoko/actions/runs/25411132891/job/74532926605

PR #6090's gc-tests-at-2 only appeared to pass because cachix served the GC variants from a different content hash. Two RTS-variant builds in parallel (each ~3 GB) exceed the 16 GB runner.

So the matrix is asymmetric:

Awaiting tests (arm64, *) results from PR #6088 to confirm the asymmetric stance on tests.

ggreif force-pushed the gabor/serialize-arm64-tests branch from ee68f4f to 677ddc3 May 6, 2026 01:19
ggreif changed the title from "ci: throttle tests builds on arm64-linux-16 to max-jobs=2" to "ci: serialise tests builds on arm64-linux-16" May 6, 2026
ggreif enabled auto-merge May 6, 2026 05:10
ggreif changed the title from "ci: serialise tests builds on arm64-linux-16" to "chore: serialise tests builds on arm64-linux-16" May 6, 2026
ggreif added this pull request to the merge queue May 6, 2026
Merged via the queue into master with commit 6b01a41 May 6, 2026
34 checks passed
ggreif deleted the gabor/serialize-arm64-tests branch May 6, 2026 07:32