ci: migrate primary CI jobs to ARM runners#7232
ci: migrate primary CI jobs to ARM runners#7232thepastaclaw wants to merge 1 commit intodashpay:developfrom
Conversation
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
✅ No Merge Conflicts DetectedThis PR currently has no conflicts with other open PRs. |
WalkthroughCI workflows and test environment scripts were adjusted to remove the aarch64-specific setup and shift many linux64 jobs to run on ARM runners. The GitHub Actions workflow removed aarch64 job blocks, renamed and migrated several linux64 jobs to run on arm64, and changed the bootstrap job runner default. The ci matrix and setup scripts were updated: the aarch64 environment script was deleted, several native setup scripts had HOST detection/export removed or altered, and ci/test/00_setup_env.sh now derives HOST from uname/dpkg before falling back to config.guess. A release helper expanded exact-platform mappings for aarch64/x86_64 GNU triplets. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
PastaPastaPasta
left a comment
There was a problem hiding this comment.
your branch ci fails here: https://github.com/thepastaclaw/dash/actions/runs/23168985036/job/67317597479
please continue tracking this until push ci on your repo is successful
4b72d41 to
cba08c0
Compare
|
The CI failure on the old commit (
The fixes in the force-push:
|
|
✅ Review complete (commit e7f1160) |
d619b9f to
cba08c0
Compare
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
The CI host-triplet cleanup is mostly correct, but it introduces one real regression in previous-release handling for ARM-native linux64 functional jobs. The only confirmed issue is blocking: the new ARM host mapping reaches a legacy release path that has no matching aarch64 artifact, so the job aborts before tests run.
Reviewed commit: cba08c0
🔴 1 blocking
1 additional finding
🔴 blocking: ARM native linux64 jobs fail when downloading the oldest previous-release binary
test/get_previous_releases.py (lines 127-134)
test-linux64 now runs on an ARM runner, 00_setup_env.sh sets HOST=aarch64-linux-gnu for that environment, and the native linux64 test target still enables --previous-releases. In check_host(), that host maps to args.platform = "aarch64-linux-gnu", and the default tag list includes v0.12.1.5. For tags earlier than v0.12.3, download_binary() only remaps arm-linux-gnueabihf to RPi2 and x86_64-linux-gnu to linux64, so v0.12.1.5 is requested as dashcore-0.12.1.5-aarch64-linux-gnu.tar.gz. That artifact is not present in SHA256_SUMS and does not exist in the release assets, so the script returns Binary tag was not found and the functional job fails before running tests.
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/get_previous_releases.py`:
- [BLOCKING] lines 127-134: ARM native linux64 jobs fail when downloading the oldest previous-release binary
`test-linux64` now runs on an ARM runner, `00_setup_env.sh` sets `HOST=aarch64-linux-gnu` for that environment, and the native `linux64` test target still enables `--previous-releases`. In `check_host()`, that host maps to `args.platform = "aarch64-linux-gnu"`, and the default tag list includes `v0.12.1.5`. For tags earlier than `v0.12.3`, `download_binary()` only remaps `arm-linux-gnueabihf` to `RPi2` and `x86_64-linux-gnu` to `linux64`, so `v0.12.1.5` is requested as `dashcore-0.12.1.5-aarch64-linux-gnu.tar.gz`. That artifact is not present in `SHA256_SUMS` and does not exist in the release assets, so the script returns `Binary tag was not found` and the functional job fails before running tests.
cba08c0 to
a9c7f69
Compare
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
The arm64 CI migration is mostly coherent, but one test path still assumes legacy x86_64 release binaries and will break the native linux64 functional job. I also confirmed a cache-key mismatch that will keep the arm64 previous-releases cache cold after the workflow switch, but that is a performance issue rather than a correctness blocker.
Reviewed commit: a9c7f69
🔴 1 blocking | 🟡 1 suggestion(s)
1 additional finding
🟡 suggestion: The releases cache key no longer distinguishes x86_64 and arm64 contents
.github/workflows/test-src.yml (lines 46-52)
PREVIOUS_RELEASES_DIR now includes $HOST, so the migrated linux64 job writes downloaded releases under releases/aarch64-linux-gnu instead of the old x86_64 directory. The cache key here still depends only on file hashes, so an existing cache entry created by the former x86_64 linux64 job can be restored on the new arm64 job under the same key. Because that restore is an exact key hit, the cache action will not publish the newly downloaded arm64 subtree back under a distinct key, which leaves the arm64 job redownloading previous releases on subsequent runs.
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/get_previous_releases.py`:
- [BLOCKING] lines 127-134: Native arm64 `linux64` jobs still fetch the x86_64 v0.12.1.5 tarball
The new `linux64` workflow now runs on arm64 runners and `ci/test/00_setup_env.sh` derives `HOST=aarch64-linux-gnu` on those machines, while the Qt5 functional job still enables `--previous-releases`. In `download_binary()`, tags older than `v0.12.3` still remap both `x86_64-linux-gnu` and `aarch64-linux-gnu` to the legacy `linux64` tarball, which is the x86_64 build. `feature_unsupported_utxo_db.py` explicitly starts `v0.12.1.5`, and the functional harness launches that binary directly from `PREVIOUS_RELEASES_DIR` without any emulator path. On the new native arm64 runner this turns into an exec-format failure, so the `linux64` functional job will fail until old-release handling is made architecture-aware or the test is excluded there.
In `.github/workflows/test-src.yml`:
- [SUGGESTION] lines 46-52: The releases cache key no longer distinguishes x86_64 and arm64 contents
`PREVIOUS_RELEASES_DIR` now includes `$HOST`, so the migrated `linux64` job writes downloaded releases under `releases/aarch64-linux-gnu` instead of the old x86_64 directory. The cache key here still depends only on file hashes, so an existing cache entry created by the former x86_64 `linux64` job can be restored on the new arm64 job under the same key. Because that restore is an exact key hit, the cache action will not publish the newly downloaded arm64 subtree back under a distinct key, which leaves the arm64 job redownloading previous releases on subsequent runs.
| elif platform in ["x86_64-apple-darwin", "arm64-apple-darwin"]: | ||
| print(f"Binaries not available for {tag} on {platform}") | ||
| return 1 | ||
| elif platform in ["x86_64-linux-gnu"]: | ||
| elif platform in ["x86_64-linux-gnu", "aarch64-linux-gnu"]: | ||
| platform = "linux64" |
There was a problem hiding this comment.
🔴 Blocking: Native arm64 linux64 jobs still fetch the x86_64 v0.12.1.5 tarball
The new linux64 workflow now runs on arm64 runners and ci/test/00_setup_env.sh derives HOST=aarch64-linux-gnu on those machines, while the Qt5 functional job still enables --previous-releases. In download_binary(), tags older than v0.12.3 still remap both x86_64-linux-gnu and aarch64-linux-gnu to the legacy linux64 tarball, which is the x86_64 build. feature_unsupported_utxo_db.py explicitly starts v0.12.1.5, and the functional harness launches that binary directly from PREVIOUS_RELEASES_DIR without any emulator path. On the new native arm64 runner this turns into an exec-format failure, so the linux64 functional job will fail until old-release handling is made architecture-aware or the test is excluded there.
source: ['codex']
🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/get_previous_releases.py`:
- [BLOCKING] lines 127-134: Native arm64 `linux64` jobs still fetch the x86_64 v0.12.1.5 tarball
The new `linux64` workflow now runs on arm64 runners and `ci/test/00_setup_env.sh` derives `HOST=aarch64-linux-gnu` on those machines, while the Qt5 functional job still enables `--previous-releases`. In `download_binary()`, tags older than `v0.12.3` still remap both `x86_64-linux-gnu` and `aarch64-linux-gnu` to the legacy `linux64` tarball, which is the x86_64 build. `feature_unsupported_utxo_db.py` explicitly starts `v0.12.1.5`, and the functional harness launches that binary directly from `PREVIOUS_RELEASES_DIR` without any emulator path. On the new native arm64 runner this turns into an exec-format failure, so the `linux64` functional job will fail until old-release handling is made architecture-aware or the test is excluded there.
a9c7f69 to
3841458
Compare
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
The ARM migration is mostly consistent, but it introduces a real previous-release functional test failure on the native linux64 job.
Reviewed commit: 3841458
🔴 1 blocking
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/get_previous_releases.py`:
- [BLOCKING] lines 133-134: ARM linux64 job now downloads an x86_64-only v0.12.1.5 binary for previous-release tests
Validated against the checked-out CI wiring: the migrated `linux64` job now runs on ARM runners, `00_setup_env.sh` sets `HOST=aarch64-linux-gnu` there, and `00_setup_env_native_qt5.sh` still enables `--previous-releases`. The linux64 functional test list still includes `feature_unsupported_utxo_db.py`, which explicitly starts version `120105` (v0.12.1.5). For tags `< v0.12.3`, this code maps `aarch64-linux-gnu` to `linux64`, which resolves to `dashcore-0.12.1.5-linux64.tar.gz` from `SHA256_SUMS` and that artifact is the legacy x86_64 tarball. Unlike the removed cross-build path, the new native ARM job does not configure qemu to run x86_64 binaries, so this test path will fail when it tries to launch the downloaded release.
| elif platform in ["x86_64-linux-gnu", "aarch64-linux-gnu"]: | ||
| platform = "linux64" |
There was a problem hiding this comment.
🔴 Blocking: ARM linux64 job now downloads an x86_64-only v0.12.1.5 binary for previous-release tests
Validated against the checked-out CI wiring: the migrated linux64 job now runs on ARM runners, 00_setup_env.sh sets HOST=aarch64-linux-gnu there, and 00_setup_env_native_qt5.sh still enables --previous-releases. The linux64 functional test list still includes feature_unsupported_utxo_db.py, which explicitly starts version 120105 (v0.12.1.5). For tags < v0.12.3, this code maps aarch64-linux-gnu to linux64, which resolves to dashcore-0.12.1.5-linux64.tar.gz from SHA256_SUMS and that artifact is the legacy x86_64 tarball. Unlike the removed cross-build path, the new native ARM job does not configure qemu to run x86_64 binaries, so this test path will fail when it tries to launch the downloaded release.
source: ['claude']
🤖 Fix this with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `test/get_previous_releases.py`:
- [BLOCKING] lines 133-134: ARM linux64 job now downloads an x86_64-only v0.12.1.5 binary for previous-release tests
Validated against the checked-out CI wiring: the migrated `linux64` job now runs on ARM runners, `00_setup_env.sh` sets `HOST=aarch64-linux-gnu` there, and `00_setup_env_native_qt5.sh` still enables `--previous-releases`. The linux64 functional test list still includes `feature_unsupported_utxo_db.py`, which explicitly starts version `120105` (v0.12.1.5). For tags `< v0.12.3`, this code maps `aarch64-linux-gnu` to `linux64`, which resolves to `dashcore-0.12.1.5-linux64.tar.gz` from `SHA256_SUMS` and that artifact is the legacy x86_64 tarball. Unlike the removed cross-build path, the new native ARM job does not configure qemu to run x86_64 binaries, so this test path will fail when it tries to launch the downloaded release.
18e3dce to
49bf536
Compare
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
The current HEAD still has one real CI regression in the native ARM functional-test path. The new ARM-specific exclusion is injected through TEST_RUNNER_EXTRA, but the workflow passes a second --exclude later on the command line, so test_runner.py drops feature_unsupported_utxo_db from the final exclusion set.
Reviewed commit: 49bf536
🔴 1 blocking
1 additional finding
🔴 blocking: ARM-specific functional-test exclusion is still overridden by workflow arguments
ci/dash/test_integrationtests.sh (line 47)
On native ARM, ci/test/00_setup_env_native_qt5.sh now exports TEST_RUNNER_EXTRA with --exclude feature_pruning,feature_dbcrash,feature_unsupported_utxo_db, because the previous-release binary required by feature_unsupported_utxo_db is unavailable on aarch64. This script still expands ${TEST_RUNNER_EXTRA} before $PASS_ARGS, and test/functional/test_runner.py defines --exclude as a single-value option, so the later workflow-provided --exclude feature_pruning,feature_dbcrash replaces the ARM-specific list instead of extending it. The linux64 ARM job will therefore continue to schedule feature_unsupported_utxo_db and fail in the exact scenario the setup change was meant to avoid.
💡 Suggested change
LD_LIBRARY_PATH="$DEPENDS_DIR/$HOST/lib" ./test/functional/test_runner.py --ci --attempts=3 --ansi --combinedlogslen=4000 --timeout-factor="${TEST_RUNNER_TIMEOUT_FACTOR}" $PASS_ARGS ${TEST_RUNNER_EXTRA} --failfast --nocleanup --tmpdir="$(pwd)/testdatadirs" $EXTRA_ARGS
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
In `ci/dash/test_integrationtests.sh`:
- [BLOCKING] line 47: ARM-specific functional-test exclusion is still overridden by workflow arguments
On native ARM, `ci/test/00_setup_env_native_qt5.sh` now exports `TEST_RUNNER_EXTRA` with `--exclude feature_pruning,feature_dbcrash,feature_unsupported_utxo_db`, because the previous-release binary required by `feature_unsupported_utxo_db` is unavailable on aarch64. This script still expands `${TEST_RUNNER_EXTRA}` before `$PASS_ARGS`, and `test/functional/test_runner.py` defines `--exclude` as a single-value option, so the later workflow-provided `--exclude feature_pruning,feature_dbcrash` replaces the ARM-specific list instead of extending it. The linux64 ARM job will therefore continue to schedule `feature_unsupported_utxo_db` and fail in the exact scenario the setup change was meant to avoid.
Move depends-linux64 and its consumers (linux64, fuzz, sqlite, ubsan) to ARM runners, making aarch64 the primary CI architecture. This includes both build and test stages. Drop the aarch64-linux cross-compile job (depends + src) and its setup env script since we don't ship arm-linux-gnueabihf binaries and now have native aarch64 coverage through the migrated jobs. Jobs remaining on x86: - nowallet (build + test): sole x86_64 native coverage - mac, win64: cross-compile targets (x86 host) - lint: architecture-independent Update 00_setup_env_native_qt5.sh to dynamically detect HOST based on the runner architecture instead of hardcoding x86_64.
e7f1160 to
7b0e40c
Compare
|
This pull request has conflicts, please rebase. |
|
@coderabbitai review |
|
This pull request has conflicts, please rebase. |
✅ Actions performedReview triggered.
|
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
I validated the current HEAD at e7f1160 against the available rerun artifacts and the workflow/script changes in the working tree. The agent findings input was empty, there were no available CodeRabbit findings to react to in .review-e7f11606, and I did not confirm any actionable correctness issue in the current patch.
Reviewed commit: e7f1160
7b0e40c to
e7f1160
Compare
229d466 ci: use cheaper ARM bootstrap runner for check-skip (PastaClaw) Pull request description: # Summary Move only the `check-skip` bootstrap job onto a cheaper ARM runner path. ## Problem `check-skip` currently uses the generic amd64 runner setting, so even the lightweight bootstrap step can consume the more expensive primary runner pool. ## Solution - add a dedicated `RUNNER_CHECK_SKIP` override for the `check-skip` job - default that job to GitHub-hosted ARM (`ubuntu-24.04-arm`) - leave downstream dynamic runner selection unchanged for the heavier CI jobs This keeps the tiny bootstrap step cheap by default while still allowing an org-level Blacksmith ARM label, such as a small 1 vCPU runner, to be configured later without affecting the main build/test jobs. ## Related - split out from <#7232> ## Validation - parsed `.github/workflows/build.yml` successfully with PyYAML after the change - reviewed the workflow context around `check-skip`, `select_dynamic_runner.py`, and `cache-depends-sources.yml` - ran the mandatory code-review gate on `upstream/develop..ci/check-skip-arm-bootstrap` → *ship* Top commit has no ACKs. Tree-SHA512: a0163facade75a60bc69f668225b069660a8cb3faa12da4b9d43fbbd248b7291a0db521873f46ba98f37929031d09d1571f3c020bcaade5bce6c1fadc4614b70
Summary
Migrate the primary CI path from x86 to ARM runners, reducing costs and providing native aarch64 build+test coverage.
Changes
Moved to ARM runners (
runner-arm64)depends-linux64(shared depends for linux64, fuzz, sqlite, ubsan)src-linux64+test-linux64(primary native build+test)src-linux64_fuzz(fuzz build)src-linux64_sqlite+test-linux64_sqlite(SQLite build+test)src-linux64_ubsan+test-linux64_ubsan(UBSan build+test)Removed
depends-aarch64-linux+src-aarch64-linux— the cross-compile job foraarch64-linux-gnufrom x86 runners. We don't shiparm-linux-gnueabihfbinaries, and native aarch64 coverage is now provided by the migrated jobs.ci/test/00_setup_env_aarch64.sh— the cross-compile setup scriptRemaining on x86 (
runner-amd64)nowallet(build + test) — sole native x86_64 coveragemac,win64— cross-compile targetslint— architecture-independentHOST detection centralization
Moved architecture-based HOST detection (
aarch64-linux-gnu/x86_64-pc-linux-gnu) from individual setup scripts intoci/test/00_setup_env.sh. This:aarch64-linux-gnuinstead ofconfig.guess'saarch64-unknown-linux-gnu)00_setup_env_native_qt5.sh,00_setup_env_native_multiprocess.sh,00_setup_env_native_tsan.shPREVIOUS_RELEASES_DIRpath computation which depended on HOST being set before the target script ranPrevious releases download fix
Added exact-match patterns for
aarch64-linux-gnuandx86_64-linux-gnuintest/get_previous_releases.py, since the existing fnmatch globaarch64-*-linux*requires a vendor field (e.g.aarch64-unknown-linux-gnu) and won't match the vendorless triplet.Validation
runner-arm64in build.yml00_setup_env.shcovers aarch64/x86_64 with dpkg fallbackaarch64-linux-gnuin SHA256_SUMSdepends-aarch64-linuxorsrc-aarch64-linuxremain in build.yml