ci: require secrets.PYCLOUDLIB_TOML, add lxd_vm and ec2 scheduled jobs#6715
ci: require secrets.PYCLOUDLIB_TOML, add lxd_vm and ec2 scheduled jobs#6715blackboxsw wants to merge 11 commits intocanonical:mainfrom
Conversation
4f73b0f to
ed8690d
Compare
| awk '/cloud-init version: /{printf DEB_VERSION=$NF; exit}' pytest-${{ inputs.platform }}-${{ inputs.release }}-${{ inputs.image_type }}.log | ||
| awk '/image-serial: /{printf IMAGE_SERIAL=$NF; exit}' pytest-${{ inputs.platform }}-${{ inputs.release }}-${{ inputs.image_type }}.log | ||
| shell: bash |
There was a problem hiding this comment.
What is are the awk commands doing here?
And why bash?
There was a problem hiding this comment.
Ahh I was going to use these to help create a top-level report output that would announce what version of cloud-init was installed during the test. I'll drop this from this PR until I have a working approach. I didn't like how opaque our GH workflow runs are when compared to jenkins jobs which announce the version of cloud-init being tested.
There was a problem hiding this comment.
I was originally stuffing env vars them into GITHUB_ENV per these docs and the ctrf.io step can extract and use the environment variables for report headers or summary.
b16406d to
be806da
Compare
| uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2 | ||
| - name: Setup LXD | ||
| if: ${{ env.REQUIRED_SECRET != '' and contains(fromJSON('["lxd_vm", "lxd_container"]'), env.CLOUD_INIT_PLATFORM ) }} |
There was a problem hiding this comment.
Avoid the setup-lxd step if we are not on CLOUD_INIT_OS_PLATFORM lxd_container of lxd_vm.
| test '${{ secrets.PYCLOUDLIB_TOML }}' != '' || echo "ERROR: Missing required repo secrets.PYCLOUDLIB_TOML non-empty value." | ||
| test '${{ secrets.PYCLOUDLIB_TOML }}' == '' && exit 1 |
There was a problem hiding this comment.
This checks for the same thing twice, I think an if / else would be cleaner
There was a problem hiding this comment.
Done. We only really care to exit 1 in the face of no secrets.PYCLOUDLIB_TOML or empty string. So, I just put the operation in a single if clause.
| - name: Checkout | ||
| if: ${{ env.REQUIRED_SECRET != '' }} |
There was a problem hiding this comment.
Since env.REQUIRED_SECRET is assigned from secrets.PYCLOUDLIB_TOML, and the first step has an exit 1 when secrets.PYCLOUDLIB_TOML is empty, is this not redundant?
Same comment elsewhere.
There was a problem hiding this comment.
Dropped all unneessary conditionals checks. An exit 1 above will prevent running these steps anyway.
0c75a04 to
3937aae
Compare
|
@blackboxsw let me know when this is ready for re-review |
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| # Run Mon & Thurs under reduced frequency for high-cost test runs |
There was a problem hiding this comment.
"under reduced frequency" seems out of place
There was a problem hiding this comment.
Removed phrasing in all ec2 workflows.
| @@ -0,0 +1,13 @@ | |||
| name: "Daily: Integration Resolute lxd_vm" | |||
There was a problem hiding this comment.
The naming scheme isn't going to work with future expansion of new series, additional platforms, etc.
I would prefer that we do it in a way that isn't going to force us to do the same thing again the next time we add more platforms - something a little bit more sustainable would be better.
Also - numbers simultaneously increasing and decreasing is chaotic.
There was a problem hiding this comment.
I have dropped all unrelated renames and moved all integration platform runs to the end of the ordered list. Each platform is now grouped and reserves 10 workflow test slots for each platform.
Platform group prefix will be the following for each platform:
- 100: lxd_container
- 110: lxd_vm
- 120: ec2
The oldest tested series (22.04) will be the first index at prefix 100-. Each additional supported series will increment from that initial platform index:
- XX0: 22.04
- XX1: 24.04
- XX2: 25.10
- XX3: 26.04
|
@holmanb I have updated this branch and linked a successful runs on Ec2. |
|
Thanks for the additional organization, @blackboxsw. I left some more feedback. |
There was a problem hiding this comment.
Pull request overview
This PR expands scheduled integration test coverage by adding new scheduled workflows for lxd_vm (daily) and ec2 (twice weekly) and consolidates scheduled workflows to call a shared reusable workflow that now expects repository secrets to configure pycloudlib at runtime.
Changes:
- Add new scheduled integration workflows for
lxd_vm(22.04/24.04/25.10/26.04) andec2(22.04/24.04/25.10/26.04). - Update existing
lxd_containerscheduled workflows to use100-dispatch-common.ymland standardize workflow names. - Update
100-dispatch-common.ymlto consume repo secrets (PYCLOUDLIB_TOML, SSH keys) and add basic secret presence assertions.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/100-dispatch-common.yml | Central reusable workflow updated to handle secrets and support lxd_vm and ec2 platforms. |
| .github/workflows/110-daily-integration-22.04-lxd_container.yml | Standardize name; switch to 100-dispatch-common.yml. |
| .github/workflows/111-daily-integration-24.04-lxd_container.yml | Standardize name; switch to 100-dispatch-common.yml. |
| .github/workflows/112-daily-integration-25.10-lxd_container.yml | Standardize name; switch to 100-dispatch-common.yml. |
| .github/workflows/113-daily-integration-26.04-lxd_container.yml | Standardize name; switch to 100-dispatch-common.yml. |
| .github/workflows/120-daily-integration-22.04-lxd_vm.yml | New daily scheduled lxd_vm integration run. |
| .github/workflows/121-daily-integration-24.04-lxd_vm.yml | New daily scheduled lxd_vm integration run. |
| .github/workflows/122-daily-integration-25.10-lxd_vm.yml | New daily scheduled lxd_vm integration run. |
| .github/workflows/123-daily-integration-26.04-lxd_vm.yml | New daily scheduled lxd_vm integration run. |
| .github/workflows/130-daily-integration-22.04-ec2.yml | New twice-weekly scheduled ec2 integration run. |
| .github/workflows/131-daily-integration-24.04-ec2.yml | New twice-weekly scheduled ec2 integration run. |
| .github/workflows/132-daily-integration-25.10-ec2.yml | New twice-weekly scheduled ec2 integration run. |
| .github/workflows/133-daily-integration-26.04-ec2.yml | New twice-weekly scheduled ec2 integration run. |
Comments suppressed due to low confidence (7)
.github/workflows/100-dispatch-common.yml:66
- The
image_typeandinstall_sourceinputs are optional, but this workflow unconditionally exportsCLOUD_INIT_OS_IMAGE_TYPEandCLOUD_INIT_CLOUD_INIT_SOURCEfrom them. When these inputs are omitted (as in the scheduled callers added in this PR), they become empty strings and override the non-empty defaults intests/integration_tests/integration_settings.py, likely breaking image selection / install source resolution. Consider providing defaults inworkflow_call/workflow_dispatchinput definitions or using expression fallbacks inenv(e.g., defaultgenericandppa:cloud-init-dev/daily/IN_PLACE, whichever is intended).
.github/workflows/100-dispatch-common.yml:76 - The required secrets are documented as
PYCLOUDLIB_TOML,SSH_PUBLIC_KEY, andSSH_PRIVATE_KEY, but onlyPYCLOUDLIB_TOMLis asserted. If either SSH key secret is missing/empty, the workflow will silently write empty key files and fail later in less obvious ways. Suggest adding similar non-empty assertions forSSH_PRIVATE_KEYandSSH_PUBLIC_KEYhere (or in the same assertion step).
.github/workflows/100-dispatch-common.yml:75 - This assertion uses workflow expression interpolation (
${{ env.REQUIRED_SECRET }}) inside the shell script, which injects the secret value directly into the generated script text. It’s safer to reference the environment variable at runtime (e.g.,$REQUIRED_SECRET) to avoid quoting/newline edge cases and reduce chances of accidental exposure in debug output.
.github/workflows/100-dispatch-common.yml:99 - The PR description and linked “success run” indicate
secrets.PYCLOUDLIB_TOMLcan be a plain TOML snippet like[ec2], but this step base64-decodes the secret. If the secret isn’t base64-encoded,base64 -dwill fail (or write garbage) and break the workflow. Either storePYCLOUDLIB_TOMLas base64 and document that requirement, or write the secret to the file directly using a quoting-safe approach.
.github/workflows/100-dispatch-common.yml:106 - Piping the tox run into
teecan mask test failures because, withoutpipefail, the step’s exit status will typically betee’s exit status (often 0) rather thantox’s. This could cause the workflow to pass even when integration tests fail. Consider enablingset -o pipefailfor this step (or avoid the pipe) so tox failures correctly fail the job.
.github/workflows/100-dispatch-common.yml:112 - The CTRF reporter title includes
${{ env.DEB_VERSION }}, but this workflow doesn’t setDEB_VERSIONanywhere. As written, the title will contain an empty version (e.g.,:v). Either setDEB_VERSIONearlier (e.g., fromdpkg-parsechangelog/packaging metadata) or drop it from the title to avoid misleading report labels.
.github/workflows/100-dispatch-common.yml:76 - PR description mentions conditional logic to skip integration test steps when
PYCLOUDLIB_TOMLis absent to avoid wasting runner cycles, but this workflow currently exits 1 immediately when the secret is empty. If skipping (with a neutral conclusion) is the intended behavior for scheduled/dispatch runs, this implementation doesn’t match that description.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| name: "Bi-weekly 22.04: ec2" | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| # Run Mon & Thurs for high-cost test runs | ||
| - cron: '2 22 * * 1,4' |
There was a problem hiding this comment.
The workflow name says “Bi-weekly”, but the cron (1,4) schedules it twice per week (Mon & Thu). “Bi-weekly” is ambiguous (every two weeks vs twice a week). Consider renaming to “Twice weekly”/“Semiweekly” (or similar) to avoid confusion in the Actions UI.
| name: "Bi-weekly 24.04: ec2" | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| # Run Mon & Thurs for high-cost test runs | ||
| - cron: '2 22 * * 1,4' |
There was a problem hiding this comment.
The workflow name says “Bi-weekly”, but the cron (1,4) schedules it twice per week (Mon & Thu). “Bi-weekly” is ambiguous (every two weeks vs twice a week). Consider renaming to “Twice weekly”/“Semiweekly” (or similar) to avoid confusion in the Actions UI.
| name: "Bi-weekly 25.10: ec2" | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| # Run Mon & Thurs for high-cost test runs | ||
| - cron: '2 22 * * 1,4' |
There was a problem hiding this comment.
The workflow name says “Bi-weekly”, but the cron (1,4) schedules it twice per week (Mon & Thu). “Bi-weekly” is ambiguous (every two weeks vs twice a week). Consider renaming to “Twice weekly”/“Semiweekly” (or similar) to avoid confusion in the Actions UI.
| name: "Bi-weekly 26.04: ec2" | ||
|
|
||
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| # Run Mon & Thurs for high-cost test runs | ||
| - cron: '2 22 * * 1,4' |
There was a problem hiding this comment.
The workflow name says “Bi-weekly”, but the cron (1,4) schedules it twice per week (Mon & Thu). “Bi-weekly” is ambiguous (every two weeks vs twice a week). Consider renaming to “Twice weekly”/“Semiweekly” (or similar) to avoid confusion in the Actions UI.
holmanb
left a comment
There was a problem hiding this comment.
Thanks for the updates @blackboxsw.
| echo "[lxd]" > /home/$USER/.config/pycloudlib.toml | ||
| mkdir -p ~/.ssh | ||
| sh -c 'printf "%s\n" "$SSH_PRIVATE_KEY" > ~/.ssh/cloudinit_id_rsa' | ||
| chmod 600 ~/.ssh/cloudinit_id_rsa |
There was a problem hiding this comment.
This order feels odd.
sh -c ... *_rsa.pub and sh -c ... *_rsa are related because they are similar calls for a pair of keys
sh -c *_rsa and chmod ... are related due to an ordering dependency
so if we instead order
sh -c ... *_rsa.pub
sh -c ... *_rsa
chmod ...Then we can keep code adjacent that is more closely related - and it is also easier to read.
| slowest-report: true | ||
| upload-artifact: true | ||
| github-report: true | ||
| artifact-name: ctrf-report-${{ inputs.platform }}-${{inputs.release}} |
There was a problem hiding this comment.
While making changes nearby can we please make the formats consistent?
ctrf-report-${{ inputs.platform }}-${{ inputs.release }}
| PYCLOUDLIB_CONFIG="${{ runner.temp }}/pycloudlib.toml" | ||
| export PYCLOUDLIB_CONFIG |
There was a problem hiding this comment.
We set other environment variables in the dedicated section above - this is not consistent.
| PYCLOUDLIB_CONFIG="${{ runner.temp }}/pycloudlib.toml" | ||
| export PYCLOUDLIB_CONFIG | ||
|
|
||
| # Dump secrets using a subshell to avoid leaks due to xtrace. |
There was a problem hiding this comment.
Can we move this comment to the first time that this pattern is used?
Also, this comment is technically wrong. It uses a subprocess not a subshell - lets correct it as well. And since x-trace isn't used by default this isn't required, though I'm in favor of keeping it and clarifying that it is to avoid accidental leaks while debugging:
# Dump secrets using a subprocess to avoid accidental leaks when using set -x.
| uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0 | ||
| with: | ||
| name: failure-${{ github.job }} | ||
| path: ${{ github.workspace }}/cloud_init_test_logs/ |
There was a problem hiding this comment.
Don't the logs include secrets?
| insights-report: true | ||
| flaky-rate-report: true | ||
| slowest-report: true | ||
| upload-artifact: true |
There was a problem hiding this comment.
Similarly - if uploading, are we certain that there are no secrets?
| with: | ||
| report-path: '${{ github.workspace }}/reports/junit-report-${{ inputs.platform}}-${{ inputs.release }}.xml' | ||
| title: '${{ inputs.platform }}-${{ inputs.release }}-${{ inputs.image_type }}:${{ inputs.install_source }}:v${{ env.DEB_VERSION }}' | ||
| integrations-config: | |
There was a problem hiding this comment.
This appears to be used for integrating with other tools - not currently used. Can we drop this for now?
| - name: Publish Test Report with Insights | ||
| if: always() | ||
| if: ${{ always() }} | ||
| uses: ctrf-io/github-test-reporter@024bc4b64d997ca9da86833c6b9548c55c620e40 # v1.0.26 |
There was a problem hiding this comment.
I haven't fully vetted this action yet. Were others considered?
| uses: ctrf-io/github-test-reporter@024bc4b64d997ca9da86833c6b9548c55c620e40 # v1.0.26 | ||
| with: | ||
| report-path: '${{ github.workspace }}/reports/junit-report-${{ inputs.platform}}-${{ inputs.release }}.xml' | ||
| title: '${{ inputs.platform }}-${{ inputs.release }}-${{ inputs.image_type }}:${{ inputs.install_source }}:v${{ env.DEB_VERSION }}' |
There was a problem hiding this comment.
Will this be redundant? I would expect the job to encode this information already.
| @@ -72,20 +86,30 @@ | |||
| rm -rf ${{ github.workspace }}/cloud_init_test_logs/ | |||
| - name: Setup pycloudlib | |||
There was a problem hiding this comment.
nit: This does two distinct things:
- Configure ssh
- Set pycloudlib configuration
I would probably make this two distinct steps.
Improve our public scheduled integration test coverage to cover lxd_vm and ec2 platforms.
Running integration tests will require 3 repository secrets
PYCLOUDLIB_TOML,SSH_PUBLIC_KEYandSSH_PRIVATE_KEYin order to configure pycloudlib to run integration tests.This PR renames and orders Github workflow files to aid visibility when manually reviewing and running actions.
Add assertion on a non-empty Github repo-level secret named
PYCLOUDLIB_TOMLwhich willbe written to a temporary file which is set in the environment variable
PYCLOUDLIB_CONFIG. This config is used sa runtime integration test configuration for pycloudlib.Add an assertion step in 20-dispatch-common.yml to error when
secrets.PYCLOUDLIB_TOML is absent and conditional logic which skips
each integration test step to avoid wasting runner cycles on integration tests.
Proposed Commit Message
See individual commits
Additional Context
Test Steps
Failed run example empty or absent secret PYCLOUDLIB_TOML https://github.com/blackboxsw/cloud-init/actions/runs/21699149948/job/62575866822
Success RUN with PYCLOUDLIB_TOML secret set to "[ec2]"
https://github.com/blackboxsw/cloud-init/actions/runs/24084983529/job/70255277685
Merge type