[codex] Apply generated cpflow GitHub Actions flow#732
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (28)
WalkthroughThis PR refactors Control Plane CI/CD automation from manual setup/build/deploy actions to a unified cpflow CLI–based system. It introduces parameterized composite actions for environment setup, Docker image building with SSH support, and application deletion; adds new workflows for PR-based review apps, staging deployment, and manual production promotion with health checks and rollback; and updates documentation covering slash commands, repository secrets/variables, and deployment processes. Changescpflow GitHub Actions Refactor
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Upstream token not passed to copy-image command
- Added the required -t upstream token argument to the cpflow copy-image-from-upstream command while keeping the token sourced from secrets via the environment.
Or push these changes by commenting:
@cursor push 3606e16506
Preview (3606e16506)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -203,14 +203,13 @@
- name: Copy image from staging
env:
- # Pass the upstream token via env rather than `-t` so it doesn't appear in /proc/<pid>/cmdline.
CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }}
PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }}
CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }}
shell: bash
run: |
set -euo pipefail
- cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}"
+ cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" -t "${CPLN_UPSTREAM_TOKEN}" --org "${CPLN_ORG_PRODUCTION}"
- name: Deploy image to production
env:You can send follow-ups to the cloud agent here.
Code Review: cpflow GitHub Actions MigrationThis is a solid migration from hand-written to generated Security — meaningful fixes
Bugs fixed
Potential issues
Nits
Overall this is well-structured with meaningful security and reliability improvements. The one item I'd prioritize is the |
| if cpflow exists -a "${APP_NAME}" --org "${CPLN_ORG}"; then | ||
| echo "exists=true" >> "$GITHUB_OUTPUT" | ||
| else | ||
| echo "exists=false" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
This if/else around cpflow exists conflates two different failure modes: "app does not exist" (expected non-zero) and "API/auth error" (unexpected non-zero). Both land in the else branch and set exists=false.
Compare delete-app.sh in this same PR, which carefully pattern-matches the output ("Double check your org", "Unknown API token format", "ERROR", etc.) to distinguish a real failure from a genuine not-found and surface it as a hard error.
If there's a transient auth issue when a PR is pushed, this step sets exists=false and the pull_request path silently writes a step-summary saying the app doesn't exist yet — masking the real problem until the deploy is manually retried.
Suggested approach: mirror delete-app.sh's pattern — capture the output, check exit code, inspect the text for known error tokens, and exit non-zero on a real failure rather than treating it as "not found".
There was a problem hiding this comment.
Fixed in b9095c6 by capturing cpflow exists output and failing hard on auth/API error tokens instead of treating every non-zero exit as app-not-found.
|
|
||
| if [[ -f "${HOME}/.ssh/cpflow_build_key" ]]; then | ||
| eval "$(ssh-agent -s)" | ||
| trap 'ssh-agent -k >/dev/null; rm -f "${HOME}/.ssh/cpflow_build_key"' EXIT |
There was a problem hiding this comment.
The EXIT trap kills the SSH agent and removes cpflow_build_key, but the ~/.ssh/known_hosts file written in the previous step is not cleaned up here. On ephemeral GitHub-hosted runners this is harmless (the VM is discarded), but if this action is ever adopted on self-hosted runners with a persistent ~/.ssh, the custom known_hosts entries would accumulate across runs.
Consider adding rm -f "${HOME}/.ssh/known_hosts" to the trap, or using a temp file under $RUNNER_TEMP instead of ~/.ssh for the known_hosts.
| default: "3.3.1" | ||
| cpflow_version: | ||
| description: cpflow gem version | ||
| required: false |
There was a problem hiding this comment.
The CLI version is pinned here as a default, which is good for reproducibility. Worth noting: since sudo npm install -g fetches from the npm registry at runtime, a compromised release of @controlplane/cli@3.3.1 would silently affect all runs. Consider adding a SHA/integrity check (e.g. via npm install --integrity) or at minimum locking to a digest-pinned download, especially since this step runs in workflows that access staging/production tokens.
|
|
||
| current_image="${selected_image:-${first_image}}" | ||
| current_version="${selected_version:-${first_version}}" | ||
|
|
There was a problem hiding this comment.
The staging env-var parity check only compares names, not values. It will catch a missing DATABASE_URL in production, but it won't catch a case where production has the right key names but wrong or stale values (e.g. a staging secret was rotated but the production counterpart wasn't updated). This is expected behaviour for a name-only check — just worth documenting explicitly so it's clear what the check does and doesn't guarantee.
There was a problem hiding this comment.
Documented in b9095c6: the production promotion workflow checks that production has the same environment variable names as staging, but it does not compare secret values.
|
Code Review: Replace hand-written CI/CD Actions with generated cpflow- flow* Well-considered overhaul that hardens security, safety, and operational reliability. No blocking issues found. Security improvements Token no longer passed via --token flag — old setup-environment exposed it in /proc/cmdline and ps; now uses CPLN_TOKEN env var only. SSH key scoped to a dedicated step so ACTIONS_STEP_DEBUG cannot leak it from the build step. Delete command now gated on author_association (old workflow had no permission check). Fork PR protection blocks syncs before any secret is consumed. Old contains() command matching replaced with exact match. Safety improvements cancel-in-progress changed from true to false everywhere — old staging deploy risked partial GVC deployments mid-rollout. Prefix-based delete guard replaces the old substring check for production/staging names. Production rollback logic added with per-workload state capture and restore — old workflow had no rollback at all. Health checks now configurable (HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, HEALTH_CHECK_ACCEPTED_STATUSES). Issue 1: Fork-PR /deploy-review-app comment silently fails with no PR feedback When a collaborator comments /deploy-review-app on a fork PR, the job condition passes but Validate review app deployment source exits 1 with only a runner log message. No PR comment explains the rejection. Suggest posting an issues.createComment before exit 1, similar to how the unconfigured-secrets path writes to GITHUB_STEP_SUMMARY. Issue 2: cpflow exists error detection is string-matching fragile delete-app.sh uses case matching over stderr to distinguish app-not-found from auth/network errors. The inline TODO acknowledges this. If cpflow changes an error message, the pattern will not match and a real failure will be silently treated as app not found, causing delete to return success vacuously. The TODO to switch to a structured exit code is the right long-term fix. Issue 3: gem install / npm install without integrity pinning cpflow-setup-environment/action.yml installs @controlplane/cli@3.3.1 and cpflow 4.2.0 without hash verification. A re-tagged or compromised release would install silently. For production-adjacent CI, consider verifying SHA256 after install or using lockfiles with integrity entries. Issue 4: Nightly cleanup has no failure notification cpflow-cleanup-stale-review-apps.yml runs on cron with no failure notification. Silent failures accumulate stale apps over days. Consider a step gated on if: failure() that posts a GitHub issue or annotation. Minor observations APP_NAME / PR_NUMBER double-assignment is intentional (workflow-level env needed early for concurrency.group, step overwrites with resolved values) but a short comment would clarify intent. cpflow-review-app-help.yml using pull_request_target is correct and well-documented. deploy job in cpflow-deploy-staging.yml re-runs cpflow-setup-environment unnecessarily (only needs cpflow deploy-image) — minor redundancy, acceptable for simplicity. |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The case pattern matching over stderr is the right approach given cpflow exists doesn't expose a structured signal, and the inline TODO is the right way to track this. One additional defensive measure worth considering: log the full exists_output to the runner even when the "not found" path is taken, so there's an audit trail if unexpected output silently passed through the filter. Right now the output is printed only if non-empty, but it's easy to miss in logs.
Also worth noting: the pattern list doesn't include "timeout", "connection refused", or "ENOTFOUND" — all plausible network failure messages. If cpflow's Ruby HTTP client times out, it may surface Errno::ECONNREFUSED or Net::ReadTimeout, which would match "Net::", so that's covered. But curl-style timeouts would not. Low risk given this is a Ruby gem, but worth keeping in mind when bumping the cpflow version.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow cleanup-stale-apps -a "${{ vars.REVIEW_APP_PREFIX }}" --org "${{ vars.CPLN_ORG_STAGING }}" --yes |
There was a problem hiding this comment.
No failure notification exists for this cron job. If cpflow cleanup-stale-apps fails (auth error, API hiccup, etc.), stale review apps accumulate silently across days.
Consider appending a step:
- name: Notify on failure
if: failure()
run: |
echo "::error::Stale review app cleanup failed. Check the workflow logs."Or, if the team monitors GitHub issues, use gh issue create to open a tracking issue on failure so it's not lost in workflow history.
| } >> "$GITHUB_STEP_SUMMARY" | ||
| exit 0 | ||
| fi | ||
|
|
There was a problem hiding this comment.
When a collaborator comments /deploy-review-app on a fork PR, execution reaches this exit 1 with only a runner log message — no feedback is posted to the PR thread. The collaborator sees a red workflow run with no explanation in the PR.
Consider posting a comment before exiting. The GH_TOKEN is available from the Resolve PR ref and commit step but not inherited here; you'd need to add GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to this step's env block. Something like:
gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \
--method POST \
--field body="Fork PRs cannot use /deploy-review-app because this workflow uses repository secrets. Ask a maintainer to deploy from a branch in the base repository." \
--silent || true
There was a problem hiding this comment.
Fixed in 1e46bfe: fork PR /deploy-review-app rejections now post a PR comment before exiting, while still keeping the deployment blocked because the workflow uses repository secrets.
| run: | | ||
| set -euo pipefail | ||
|
|
||
| sudo npm install -g @controlplane/cli@${{ inputs.cpln_cli_version }} |
There was a problem hiding this comment.
Version-pinned but not hash-verified. A malicious actor who gains write access to the @controlplane/cli npm package (or re-tags a release) could substitute a compromised binary. Same applies to gem install cpflow below.
For production-adjacent pipelines, consider adding a post-install checksum check, or at minimum document the expected SHA256 for each pinned version in a comment so drift is detectable during future version bumps.
| run: | | ||
| # Best-effort rollback: try every workload, aggregate failures, exit non-zero at the end | ||
| # if any failed. A single cpln hiccup shouldn't leave other workloads mid-promotion. | ||
| set -uo pipefail |
There was a problem hiding this comment.
The intentional omission of -e here (using set -uo pipefail instead of set -euo pipefail) is the correct design — it lets rollback continue across all workloads even if one cpln call fails, with rollback_failures tracking the aggregate. Worth a brief comment explaining the omission so a future reader doesn't add -e thinking it was forgotten:
| set -uo pipefail | |
| # Intentionally no -e: we want every workload to attempt rollback even if earlier ones fail. | |
| set -uo pipefail |
There was a problem hiding this comment.
Fixed in 1e46bfe with a comment explaining that set -e is intentionally omitted so rollback can attempt every workload and aggregate failures.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Token not persisted for subsequent workflow steps
- The composite action now writes CPLN_TOKEN to GITHUB_ENV so subsequent cpflow and cpln workflow steps inherit authentication without passing the token on the command line.
Or push these changes by commenting:
@cursor push f1eee564c7
Preview (f1eee564c7)
diff --git a/.github/actions/cpflow-setup-environment/action.yml b/.github/actions/cpflow-setup-environment/action.yml
--- a/.github/actions/cpflow-setup-environment/action.yml
+++ b/.github/actions/cpflow-setup-environment/action.yml
@@ -63,6 +63,12 @@
exit 1
fi
+ {
+ echo "CPLN_TOKEN<<__CPLN_TOKEN__"
+ echo "$CPLN_TOKEN"
+ echo "__CPLN_TOKEN__"
+ } >> "$GITHUB_ENV"
+
create_output=""
if ! create_output="$(cpln profile create default --org "$ORG" 2>&1)"; then
if ! echo "$create_output" | grep -qi "already exists"; thenYou can send follow-ups to the cloud agent here.
Code Review — PR #732: Generated cpflow GitHub Actions FlowOverviewThis is a well-structured replacement of hand-written CI/CD workflows with the generated Significant Improvements ✅
IssuesBug:
|
| fi | ||
| deploy_args+=(--org "${CPLN_ORG}" --verbose) | ||
|
|
||
| cpflow deploy-image "${deploy_args[@]}" |
There was a problem hiding this comment.
Bug: deploy runs even when build fails.
When a job has an explicit if:, GitHub Actions drops the implicit success() guard. Because this condition only checks validate-branch's output, the deploy job will start even if the build job failed, wasting runner minutes and attempting to deploy a missing image.
| cpflow deploy-image "${deploy_args[@]}" | |
| if: needs.validate-branch.outputs.is_deployable == 'true' && needs.build.result == 'success' |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The string-matching heuristic on cpflow exists error output (lines 43–54) is the main reliability risk in this file. The consequence of an unmatched error pattern is silently exiting 0 ("app not found") when the API actually returned an error, meaning the delete is skipped without any visible failure.
An alternative that inverts the logic: treat any non-zero exit with non-empty output as a failure unless it matches a known "not found" pattern:
# instead of matching known error strings, match known "not found" patterns
case "$exists_output" in
*"not found"*|*"does not exist"*)
echo "⚠️ Application does not exist: $APP_NAME"
exit 0
;;
*)
echo "❌ ERROR: failed to determine whether application exists" >&2
printf '%s\n' "$exists_output" >&2
exit 1
;;
esacThis is safer by default — unknown output is treated as an error rather than a no-op. The tradeoff is that you'd need to know the exact "not found" wording from cpflow exists, which has the same coupling problem in the other direction. The TODO comment already captures this; just flagging it for awareness.
| echo "ready=false" >> "$GITHUB_OUTPUT" | ||
| { | ||
| echo "Control Plane review app automation is not configured yet." | ||
| echo |
There was a problem hiding this comment.
This top-level APP_NAME is computed from the raw event payload and is later overwritten in $GITHUB_ENV by the Resolve PR ref and commit step (echo "APP_NAME=..." >> "$GITHUB_ENV"). The step-level write wins, so this declaration is dead configuration and can be confusing. Consider removing it and relying solely on the step-computed value, which is already the authoritative one (it handles all three event types correctly).
Code Review: cpflow Generated GitHub Actions FlowThis is a meaningful security and reliability upgrade over the handwritten workflows. Several issues from the old setup are fixed here; I've also flagged a few new items worth addressing. ✅ Notable Improvements
🔴 Issues1. The delete workflow triggers on 2. The same fragile string-matching logic ( 🟡 Warnings3. In 4. Production health check accepts 301/302 by default
5. Nightly cleanup has no failure notification
ℹ️ Informational6.
7. These will pick up any non-breaking updates automatically, which is generally fine. However, |
| name: Delete Review App | ||
|
|
||
| on: | ||
| pull_request_target: |
There was a problem hiding this comment.
pull_request_target deserves an explanation comment here.
pull_request_target runs in the context of the base repo (write permissions, access to secrets) and is required for fork PR cleanup — pull_request can't post comments on fork PRs because the token is read-only. This is safe here because no PR code is ever checked out (the actions/checkout@v4 below fetches base-branch code only).
cpflow-review-app-help.yml has a good comment explaining this exact pattern. Without a parallel comment here, a future maintainer may "fix" this to pull_request thinking it's safer, silently breaking fork PR cleanup.
| pull_request_target: | |
| # pull_request_target is intentional: it has write permission to comment on PRs from | |
| # forks, where `pull_request` would be read-only. This is safe because no PR code is | |
| # checked out — the job only calls cpflow with base-repo credentials. | |
| # Do not switch this to `pull_request` or add a checkout of the PR ref without re-evaluating. | |
| pull_request_target: |
| # Make the token available to later workflow steps without putting it on argv. | ||
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | ||
| { | ||
| echo "CPLN_TOKEN<<${token_delimiter}" | ||
| printf '%s\n' "$CPLN_TOKEN" | ||
| echo "${token_delimiter}" | ||
| } >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Writing CPLN_TOKEN to GITHUB_ENV propagates it to every subsequent step in the calling workflow — not just the steps that explicitly use it. This is a necessary tradeoff (cpflow picks it up from the environment automatically), but it's worth documenting so future maintainers don't inadvertently add steps that log environment state.
| # Make the token available to later workflow steps without putting it on argv. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" | |
| # Persist CPLN_TOKEN to GITHUB_ENV so all subsequent steps can authenticate | |
| # with cpflow/cpln without putting the token on the command line. | |
| # Trade-off: the token is present in every subsequent step's environment. | |
| # Do not add steps that dump env vars (e.g., `env | sort`) after this point. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" |
| # TODO: replace this string-matching with a structured signal once `cpflow exists` exposes one | ||
| # (e.g. a distinct exit code for "not found" vs. API/auth errors, or `cpflow exists --json`). | ||
| # Until then, keep this list in sync if `cpflow exists` starts emitting new error patterns — | ||
| # any unmatched error string would otherwise be silently treated as "app not found". | ||
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The same cpflow exists error-pattern matching appears verbatim in the Check if review app exists step in cpflow-deploy-review-app.yml (line ~195). The TODO comment here is accurate but only covers this file — the workflow copy is uncovered.
If cpflow adds a new error pattern (e.g. "rate limit" or "timeout"), both locations need updating in sync. Consider extracting this into a shared script (e.g. .github/scripts/cpflow-app-exists.sh) that both the action and the workflow call, so the pattern list is maintained in one place.
| # repo variable to tighten this for apps that expose a dedicated health endpoint | ||
| # (e.g. "200" for a plain /health, or "200 401 403" for apps that auth-gate / without | ||
| # redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} |
There was a problem hiding this comment.
Accepting 301/302 as "healthy" means a redirect to a login page, maintenance page, or even a misconfigured CDN passes the health check. The comment correctly explains why (curl without -L, Rails auth-gating /), but a redirect can pass even when the destination is broken.
Consider adding a step summary warning when a non-200 status is accepted, so operators reviewing the promotion log have visibility:
if [[ "${http_status}" == "${accepted}" ]]; then
if [[ "${http_status}" != "200" ]]; then
echo "::warning::Health check passed with HTTP ${http_status} (not 200) — verify the redirect destination is healthy."
fi
echo "healthy=true" >> "$GITHUB_OUTPUT"
exit 0
fiAlso worth documenting in .controlplane/shakacode-team.md that setting HEALTH_CHECK_ACCEPTED_STATUSES=200 and adding a /health endpoint is the recommended production posture.
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| - cron: "0 0 * * *" |
There was a problem hiding this comment.
Scheduled workflows that fail silently are easy to miss — GitHub does send an email for failed scheduled runs but only to the repo owner by default, and only after 7 days of consecutive failures (as of current GitHub behavior).
Consider adding a failure notification step, or at minimum documenting that the team should verify GitHub's "Notify me of failed workflow runs" notification setting is enabled. Stale review apps that aren't cleaned up accrue Control Plane costs.
| - cron: "0 0 * * *" | |
| - cron: "0 3 * * *" # 3 AM UTC — avoids exact midnight when many repos run scheduled jobs simultaneously |
(The cron timing is a minor suggestion — staggering off midnight reduces GitHub's scheduler contention, making the run more likely to start on time.)
|
This is a well-structured migration from hand-written workflows to generated cpflow-* GitHub Actions. The security thinking is thorough — fork gating, double-checkout pattern, persist-credentials: false, secrets via env vars, and a prefix-safety check on deletion are all done correctly. Security
Code Quality
Minor
Overall: The PR is a significant improvement — better fork safety, structured rollback, env-var parity checks before production promotion, and prefix-guarded deletion are all good additions. The issues above are mostly low-to-medium severity, but the persist-credentials inconsistency and RC version bootstrap are easy wins to clean up before merge. |
| script: | | ||
| const commentId = Number("${{ steps.create-comment.outputs.comment-id }}"); | ||
| const deploymentId = "${{ steps.init-deployment.outputs.result }}"; | ||
| const appUrl = "${{ steps.workload.outputs.workload_url }}"; |
There was a problem hiding this comment.
The workload_url is sourced from cpln workload get ... | jq -r '.status.endpoint // empty' — an external API value. If the endpoint string contains " (or a backtick in template-literal context), it will break this JS string literal and could be exploited if the Control Plane API response is ever tampered with.
All other dynamic values here (WORKFLOW_URL, CONSOLE_URL) are already routed through core.exportVariable and read as process.env.*. Apply the same pattern:
| const appUrl = "${{ steps.workload.outputs.workload_url }}"; | |
| const appUrl = process.env.APP_URL || ""; |
And add the export alongside the existing core.exportVariable calls in the "Set deployment links" step above:
core.exportVariable("APP_URL", workload_url); // in the "Retrieve app URL" shell step, or pass via a prior github-script step| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
Two concerns with this bootstrap block:
-
RC version in production CI:
5.0.0.rc.0is a pre-release. If the actions are already committed to the base branch (the normal case), this step is skipped — but for a fresh repo setup or if the actions are accidentally deleted, this RC gem installs and generates new action files mid-run. Using a GA release (or no version pin withgem install cpflow) would be safer. -
Inline patching of generated files: The
ruby -0pione-liner patchescpflow-setup-environment/action.ymlafter generation. If the upstream template text changes in a future cpflow release, this substitution silently becomes a no-op, and the literal${{ vars.CPLN_CLI_VERSION }}Markdown in the description won't be fixed. Consider either:- Committing the generated actions to the base branch so this bootstrap never runs in practice, or
- Adding an explicit check after the substitution (
grep -q "repository variable value" ... || exit 1) to fail loudly if the patch didn't apply.
| - name: Checkout repository | ||
| if: steps.check-branch.outputs.is_deployable == 'true' | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
This checkout is missing persist-credentials: false, which is present on every other actions/checkout call in this PR. This leaves the GITHUB_TOKEN credential helper stored in .git/config for the duration of the job, even though nothing in this job pushes to git.
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
| fi | ||
|
|
||
| if [[ "${staging_image_ref}" == /org/*/image/* ]]; then | ||
| staging_image="${staging_image_ref##*/image/}" |
There was a problem hiding this comment.
The ## (greedy) operator strips the longest prefix matching */image/. For the expected Control Plane path format /org/myorg/image/myapp this is fine, but an image ref that contains a second /image/ segment (e.g. /org/myorg/image/myapp/image/v2) would be stripped to v2 instead of myapp/image/v2.
Using the non-greedy # strips only the first /image/ occurrence, which is more predictable:
| staging_image="${staging_image_ref##*/image/}" | |
| staging_image="${staging_image_ref#*/image/}" |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Health check variables ignore documented repository variable overrides
- Updated the workflow env defaults to read the matching repository variables for health check and rollback readiness settings.
Or push these changes by commenting:
@cursor push ed231dd54e
Preview (ed231dd54e)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -16,8 +16,8 @@
# Worst-case wall time per attempt is HEALTH_CHECK_INTERVAL plus the curl --max-time below
# (10s), so the defaults give a ~10 minute window (24 × (15 + 10) = 600s) — enough for
# most Rails cold boots (asset precompile + db:migrate + workload readiness).
- HEALTH_CHECK_RETRIES: 24
- HEALTH_CHECK_INTERVAL: 15
+ HEALTH_CHECK_RETRIES: ${{ vars.HEALTH_CHECK_RETRIES || '24' }}
+ HEALTH_CHECK_INTERVAL: ${{ vars.HEALTH_CHECK_INTERVAL || '15' }}
# Space-separated list of HTTP statuses considered healthy. The default accepts 301/302
# because `curl` is invoked without `-L`, so a root `/` that redirects to a login page
# (common for Rails apps that auth-gate `/`) would otherwise be reported as unhealthy
@@ -31,8 +31,8 @@
# expose a dedicated health endpoint (e.g. "200" for a plain /health, or "200 401 403"
# for apps that auth-gate / without redirecting).
HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }}
- ROLLBACK_READINESS_RETRIES: 24
- ROLLBACK_READINESS_INTERVAL: 15
+ ROLLBACK_READINESS_RETRIES: ${{ vars.ROLLBACK_READINESS_RETRIES || '24' }}
+ ROLLBACK_READINESS_INTERVAL: ${{ vars.ROLLBACK_READINESS_INTERVAL || '15' }}
PRIMARY_WORKLOAD: ${{ vars.PRIMARY_WORKLOAD }}
concurrency:You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit ff6a820. Configure here.
| # for apps that auth-gate / without redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} | ||
| ROLLBACK_READINESS_RETRIES: 24 | ||
| ROLLBACK_READINESS_INTERVAL: 15 |
There was a problem hiding this comment.
Health check variables ignore documented repository variable overrides
Medium Severity
HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, ROLLBACK_READINESS_RETRIES, and ROLLBACK_READINESS_INTERVAL are hardcoded in the workflow env block, so setting them as repository variables has no effect. This contradicts the workflow's own comment ("Override these by … setting the matching repository variable") and the documentation in cpflow-help.md and shakacode-team.md, which list all four as optional repository variables. HEALTH_CHECK_ACCEPTED_STATUSES on the same block correctly uses ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }}; these four need the same pattern.
Additional Locations (2)
Reviewed by Cursor Bugbot for commit ff6a820. Configure here.
Code Review: Replace hand-written GitHub Actions with generated cpflow-* flowOverviewThis PR replaces five hand-written Control Plane GitHub Actions workflows/actions with generated Security Improvements (significant wins)
Issues and Concerns1. Release candidate in production default (medium risk)In A release candidate ships as the default for a production-critical deployment tool. If 2. Fragile bootstrap-and-patch step (maintainability concern)In 3.
|
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
This pins a release candidate as the default for a production-critical deployment tool. If 5.0.0.rc.0 contains a bug, every repo that inherits this default is affected. Please update to the stable release before merge (or document a concrete plan to do so before this workflow runs on master).
| - name: Bootstrap generated cpflow actions | ||
| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
This bootstrap block has two concerns:
-
It should never run for this repo:
cpflow-validate-config/action.ymlis committed here, sohashFiles(...)will always return a non-empty string. The block is dead code for this repo and only matters for repos that haven't yet committed the generated files. -
The
ruby -0pi -epatch is very fragile: It regex-substitutes specific hardcoded strings in the generatedcpflow-setup-environment/action.yml. If the upstream template changes that wording, the substitution silently no-ops and the file drifts undetected.
Recommend either removing the bootstrap entirely (since the generated files are committed) or replacing it with a CI drift-detection step that fails loudly if the committed generated files differ from what cpflow generate-github-actions would produce today.
| - name: Checkout repository | ||
| if: steps.check-branch.outputs.is_deployable == 'true' | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
The build and deploy jobs both set persist-credentials: false on their checkout steps, but this validate-branch checkout does not. For consistency and to keep GITHUB_TOKEN out of .git/config, add persist-credentials: false here too.
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| - name: Checkout repository | |
| if: steps.check-branch.outputs.is_deployable == 'true' | |
| uses: actions/checkout@v4 | |
| with: | |
| persist-credentials: false |
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| github.event.comment.body == '/deploy-review-app' && | ||
| contains(fromJson('["OWNER","MEMBER","COLLABORATOR"]'), github.event.comment.author_association)) |
There was a problem hiding this comment.
The allowlist covers OWNER, MEMBER, and COLLABORATOR, but not CONTRIBUTOR. GitHub assigns CONTRIBUTOR to anyone whose PR has previously been merged into the repo. Depending on the project's intent, CONTRIBUTOR may be a reasonable addition so past contributors can deploy review apps without needing explicit collaborator access. Worth confirming whether this is intentional.
Code ReviewOverviewThis PR replaces hand-written Control Plane GitHub Actions with a generated cpflow-* action/workflow set. The new code is a significant improvement across security, correctness, and operational capability. Worth merging once the concerns below are addressed. Security: Notable Improvements
Issues to Address1. Release candidate pinned in production CI (blocking)
2. Bootstrap regex is tightly coupled to upstream wording (low risk, but fragile) The bootstrap step in 3. The Minor Observations
|
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
Blocking: 5.0.0.rc.0 is a release candidate. This file is the default for all workflows in this repo, including production promotion. Pin a stable version before merging (e.g., 5.0.0 once released), or add an explicit comment confirming this RC is intentionally used in production and why it is acceptable.
| default_cpflow_version="5.0.0.rc.0" | |
| default_cpflow_version="5.0.0" |
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml |
There was a problem hiding this comment.
This regex is tightly coupled to the exact wording that cpflow generate-github-actions emits in the input description of cpflow-setup-environment/action.yml. If cpflow changes that sentence in a future release, the gsub! call silently no-ops and the grep guard below fails the bootstrap at runtime.
Two suggestions:
- Add a comment here referencing the exact cpflow version/commit whose output this regex is written against, so future maintainers know when to update it.
- Since this bootstrap only fires when
.github/actions/cpflow-validate-config/action.ymldoesn't exist (i.e., initial repo setup), consider whether patching the generated file in-place is worth the coupling — alternatively, generate a small wrapper action that avoids the need for post-generation text surgery.
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| github.event.comment.body == '/deploy-review-app' && | ||
| contains(fromJson('["OWNER","MEMBER","COLLABORATOR"]'), github.event.comment.author_association)) |
There was a problem hiding this comment.
The allow-list covers OWNER, MEMBER, and COLLABORATOR but not CONTRIBUTOR (GitHub's association for users who have had a commit merged but are not formal org members). This means external contributors who have shipped code to the repo cannot self-serve review apps via slash commands — they must ask a member to trigger it for them.
If this is intentional (reasonable from a security standpoint — CONTRIBUTOR association does not imply ongoing trust), add a comment here and a note in cpflow-help.md so contributors know to ask a maintainer instead of assuming the command is broken.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document |
There was a problem hiding this comment.
Same RC version concern as in cpflow-setup-environment/action.yml — this bootstrap step pins 5.0.0.rc.0 independently. If the setup action's default is updated to a stable release, update this line to match so both paths use the same version.
Greptile SummaryThis PR replaces the hand-written Control Plane GitHub Actions workflows with a generated
Confidence Score: 3/5Safe to merge for single-workload apps; multi-workload production apps where staging omits background job workers will find promotion blocked at the staging image capture step. The production promotion workflow derives the workload name list from the production app config and iterates those same names against the staging GVC. For repos where staging intentionally runs fewer workloads,
Important Files Changed
Sequence DiagramsequenceDiagram
participant Dev as Developer
participant GH as GitHub Actions
participant CP_S as Control Plane Staging
participant CP_P as Control Plane Production
Dev->>GH: Push to master
GH->>GH: cpflow-deploy-staging validate-branch
GH->>CP_S: build-image + deploy-image
Dev->>GH: /deploy-review-app comment
GH->>GH: cpflow-deploy-review-app resolve PR ref
GH->>CP_S: setup-app + build-image + deploy-image
GH-->>Dev: PR comment with review app URL
Dev->>GH: PR closed or /delete-review-app
GH->>CP_S: cpflow delete review app
GH->>GH: nightly cleanup-stale-review-apps
GH->>CP_S: cpflow cleanup-stale-apps
Dev->>GH: "workflow_dispatch confirm_promotion=promote"
GH->>CP_S: cpln workload get capture staging image
GH->>CP_P: copy-image-from-upstream
GH->>CP_P: deploy-image
GH->>CP_P: wait-for-health poll
alt health check passes
GH-->>Dev: Promotion success + GitHub release
else health check fails
GH->>CP_P: cpln workload update rollback
GH-->>Dev: Promotion failed
end
Reviews (1): Last reviewed commit: "Address cpflow workflow review fixes" | Re-trigger Greptile |
| - name: Capture deployed staging image | ||
| id: staging-image | ||
| env: | ||
| CPLN_TOKEN_STAGING: ${{ secrets.CPLN_TOKEN_STAGING }} | ||
| STAGING_APP_NAME: ${{ vars.STAGING_APP_NAME }} | ||
| CPLN_ORG_STAGING: ${{ vars.CPLN_ORG_STAGING }} | ||
| WORKLOAD_NAMES: ${{ steps.workloads.outputs.names }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
|
|
||
| selected_workload="${PRIMARY_WORKLOAD:-}" | ||
| selected_image="" | ||
| first_image="" | ||
|
|
||
| while IFS= read -r workload_name; do | ||
| [[ -n "${workload_name}" ]] || continue | ||
|
|
||
| workload_json="$(CPLN_TOKEN="${CPLN_TOKEN_STAGING}" cpln workload get "${workload_name}" --gvc "${STAGING_APP_NAME}" --org "${CPLN_ORG_STAGING}" -o json)" | ||
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image // empty')" | ||
|
|
||
| if [[ -z "${workload_image}" ]]; then | ||
| echo "::error::Could not find an image on staging workload '${workload_name}'." >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| if [[ -z "${first_image}" ]]; then | ||
| first_image="${workload_image}" | ||
| fi | ||
|
|
||
| if [[ -n "${selected_workload}" && "${workload_name}" == "${selected_workload}" ]]; then | ||
| selected_image="${workload_image}" | ||
| fi | ||
| done < <(tr ',' '\n' <<< "${WORKLOAD_NAMES}") | ||
|
|
||
| staging_image_ref="${selected_image:-${first_image}}" | ||
| if [[ -z "${staging_image_ref}" ]]; then | ||
| echo "::error::Could not determine the deployed staging image." >&2 | ||
| exit 1 | ||
| fi | ||
|
|
||
| if [[ "${staging_image_ref}" == /org/*/image/* ]]; then | ||
| staging_image="${staging_image_ref##*/image/}" | ||
| elif [[ "${staging_image_ref}" == *.registry.cpln.io/* ]]; then | ||
| staging_image="${staging_image_ref#*.registry.cpln.io/}" | ||
| else | ||
| staging_image="${staging_image_ref}" | ||
| fi | ||
|
|
||
| echo "image=${staging_image}" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
Staging image lookup uses production workload names
WORKLOAD_NAMES is resolved from the production app's app_workloads config (the Resolve production app workloads step), but those same names are then used to query workloads in the staging GVC. If production has a workload (e.g. sidekiq, worker) that was never stood up in staging, cpln workload get "${workload_name}" --gvc "${STAGING_APP_NAME}" exits non-zero, set -euo pipefail kills the script, and the promotion is blocked even though the staging rails app is healthy and ready to promote. The failure message comes from cpln itself with no indication that the problem is a staging/production workload-set mismatch. Consider resolving workload names separately for staging vs. production, or fall back to PRIMARY_WORKLOAD alone when app_workloads lists workloads that don't exist in staging.
| - name: Finalize deployment status | ||
| if: always() && steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true' && (steps.check-app.outputs.exists == 'true' || steps.setup-review-app.outcome == 'success') | ||
| uses: actions/github-script@v7 | ||
| with: | ||
| script: | | ||
| const commentId = Number("${{ steps.create-comment.outputs.comment-id }}"); | ||
| const deploymentId = "${{ steps.init-deployment.outputs.result }}"; | ||
| const appUrl = "${{ steps.workload.outputs.workload_url }}"; | ||
| const success = "${{ job.status }}" === "success"; | ||
|
|
||
| if (deploymentId) { | ||
| await github.rest.repos.createDeploymentStatus({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| deployment_id: Number(deploymentId), | ||
| state: success ? "success" : "failure", | ||
| environment: `review/${process.env.APP_NAME}`, | ||
| environment_url: success && appUrl ? appUrl : undefined, | ||
| log_url: process.env.WORKFLOW_URL, | ||
| description: success ? "Review app ready" : "Review app deployment failed" | ||
| }); | ||
| } | ||
|
|
||
| const successBody = [ | ||
| "## Review app ready", | ||
| "", | ||
| appUrl ? `[Open review app](${appUrl})` : "Review app deployed, but no endpoint URL was detected.", | ||
| "", | ||
| `[Open Control Plane console](${process.env.CONSOLE_URL})`, | ||
| `[View workflow logs](${process.env.WORKFLOW_URL})` | ||
| ].join("\n"); | ||
|
|
||
| const failureBody = [ | ||
| `❌ Review app deployment failed for PR #${process.env.PR_NUMBER}`, | ||
| "", | ||
| `[Open Control Plane console](${process.env.CONSOLE_URL})`, | ||
| `[View workflow logs](${process.env.WORKFLOW_URL})` | ||
| ].join("\n"); | ||
|
|
||
| if (!Number.isFinite(commentId) || commentId <= 0) { | ||
| core.warning("Skipping PR comment update because no comment id was created."); | ||
| return; | ||
| } | ||
|
|
||
| await github.rest.issues.updateComment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| comment_id: commentId, | ||
| body: success ? successBody : failureBody | ||
| }); |
There was a problem hiding this comment.
Step outputs interpolated directly into JS strings
Lines 433–434 inline ${{ steps.init-deployment.outputs.result }} and ${{ steps.workload.outputs.workload_url }} directly into the JavaScript string literal rather than routing them through env: as is done consistently for all other values in this file. The workload_url comes from cpln workload get … -o json | jq -r '.status.endpoint' — not user-controlled — but the pattern is still inconsistent and would become a real injection risk if the data source ever changes.
| on: | ||
| push: | ||
| # GitHub does not allow repository vars in branch filters. Default to the common | ||
| # deploy branches unless `cpflow generate-github-actions --staging-branch BRANCH` | ||
| # was used. If STAGING_APP_BRANCH is later changed in repository variables, keep | ||
| # this list in sync so pushes to that branch actually trigger the workflow. | ||
| branches: ["master"] |
There was a problem hiding this comment.
Push trigger branch is hardcoded; STAGING_APP_BRANCH change requires manual file edit
The on.push.branches list is ["master"] while the runtime guard reads vars.STAGING_APP_BRANCH. When a team changes the variable to a different branch, pushes to that branch will never trigger this workflow and the misconfiguration is silent. Consider adding a prominent comment at the top of this file (in addition to the inline note) that the branches: filter must stay in sync with the variable value.
| - name: Set up Ruby for cpflow bootstrap | ||
| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| uses: ruby/setup-ruby@v1 | ||
| with: | ||
| ruby-version: "3.4" | ||
|
|
||
| - name: Bootstrap generated cpflow actions | ||
| if: ${{ hashFiles('.github/actions/cpflow-validate-config/action.yml') == '' }} | ||
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document | ||
| ruby -S cpflow generate-github-actions --staging-branch master | ||
| # shellcheck disable=SC2016 | ||
| ruby -0pi -e '$_.gsub!(/so callers can pass `\$\{\{ vars\.CPLN_CLI_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally."); $_.gsub!(/so callers can pass `\$\{\{ vars\.CPFLOW_VERSION \}\}` unconditionally\./, "so callers can pass the repository variable value unconditionally.")' .github/actions/cpflow-setup-environment/action.yml | ||
| if grep -n '\$''{{ vars\.\(CPLN_CLI_VERSION\|CPFLOW_VERSION\) }}' .github/actions/cpflow-setup-environment/action.yml; then | ||
| echo "::error::Bootstrapped cpflow setup action still contains GitHub metadata expressions in input descriptions." | ||
| exit 1 | ||
| fi |
There was a problem hiding this comment.
Bootstrap step hardcodes cpflow version independently of
cpflow-setup-environment
The bootstrap path hard-codes cpflow -v "5.0.0.rc.0". When cpflow graduates from RC, repos that hit the bootstrap path will regenerate actions from the older RC gem rather than the latest release—and since the bootstrap only fires when the actions directory is missing, it would silently succeed with stale generated code. This version needs the same update discipline as the default in cpflow-setup-environment/action.yml.
|
✅ Review app for PR #732 is deleted |
Code ReviewThis PR replaces hand-written Control Plane GitHub Actions with the generated cpflow-* workflows. The new code is more structured, more secure, and better documented. Security Improvements: Issue 1: Pre-release cpflow version in production deployments Issue 2: Health-check redirect acceptance Issue 3: Env-parity check only inspects GVC-level variable names Issue 4: Bootstrap step version independent of CPFLOW_VERSION variable Minor: Staging 3-job chain adds runner startup overhead. Concurrency cancel-in-progress false is correct. Rollback is image-only (DB migrations not reversed). Stale cleanup cancel-in-progress false is correct. Summary: Excellent improvement. Actionable: (a) confirm 5.0.0.rc.0 intent or move to stable, (b) add /health endpoint and tighten HEALTH_CHECK_ACCEPTED_STATUSES. |
| # Override per-repo by setting `CPLN_CLI_VERSION` / `CPFLOW_VERSION` repo variables; | ||
| # an empty input falls back to the action's pinned default below. | ||
| default_cpln_cli_version="3.3.1" | ||
| default_cpflow_version="5.0.0.rc.0" |
There was a problem hiding this comment.
This pins a pre-release (rc.0) as the default for all workflows that use this action — review apps, staging, and production promotion. A bug in the RC could block or mis-deploy every environment simultaneously. When a stable 5.0.0 is cut, bump this default; until then consider adding a comment like # intentional RC: stable not yet released so future readers know this is a deliberate choice and not an oversight.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| gem install cpflow -v "5.0.0.rc.0" --no-document |
There was a problem hiding this comment.
This bootstrap gem install hard-codes 5.0.0.rc.0 independently of the CPFLOW_VERSION repository variable used in cpflow-setup-environment. If the repo variable is later updated to a different version, this path will still install the hard-coded RC. The bootstrap only runs when the action files are absent, so the risk is low in practice — but consider deriving the version from the variable for consistency, e.g., by reading vars.CPFLOW_VERSION as a workflow input here.
| `200 301 302` accepts redirects because curl is invoked without `-L`, so a | ||
| root path that auth-redirects looks like a redirect, not a failure. | ||
| required: false | ||
| default: "200 301 302" |
There was a problem hiding this comment.
The 200 301 302 default is correctly explained in the description, but accepting redirects means a maintenance-mode page or an auth-gate that redirects to /login would pass the health check even when the underlying app is broken or unavailable. This is explicitly called out in the promote workflow's comments as a trade-off. As a follow-up, consider adding a /health endpoint to this Rails app (e.g., via a simple route returning 200 OK) and setting HEALTH_CHECK_ACCEPTED_STATUSES=200 in repository variables to make the production promotion gate unambiguous.
| # deploy branches unless `cpflow generate-github-actions --staging-branch BRANCH` | ||
| # was used. If STAGING_APP_BRANCH is later changed in repository variables, keep | ||
| # this list in sync so pushes to that branch actually trigger the workflow. | ||
| branches: ["master"] |
There was a problem hiding this comment.
GitHub does not allow repository variables in push trigger branch filters, so this filter is correctly hard-coded to master. The STAGING_APP_BRANCH variable is then re-checked at runtime in validate-branch. This means if the staging branch is ever changed, two places need updating: this filter and the STAGING_APP_BRANCH repository variable. The existing comment in the workflow already calls this out, but it is worth keeping in mind as a maintenance footgun.
| staging_vars="$(CPLN_TOKEN="${CPLN_TOKEN_STAGING}" cpln gvc get "${STAGING_APP_NAME}" --org "${CPLN_ORG_STAGING}" -o json | jq -r '.spec.env // [] | .[].name' | sort)" | ||
| production_vars="$(CPLN_TOKEN="${CPLN_TOKEN_PRODUCTION}" cpln gvc get "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" -o json | jq -r '.spec.env // [] | .[].name' | sort)" |
There was a problem hiding this comment.
This parity check compares .spec.env[].name on the GVC objects, which covers only GVC-level environment variable names. It will not detect:
- Workload-level env vars (set per-container, not on the GVC)
- Control Plane secret references (
cpln://secret/...) vs. plain values - Variables that exist in both environments but carry different values
This is documented in readme.md, but teams should be aware that a workload configured with a missing or differently-named env var in production could pass this check and only fail at runtime post-promotion. A workload-level parity check (iterating cpln workload get per workload) would close the gap but is more complex to implement.



Summary
cpflow-*action and workflow set fromcpflow generate-github-actions.cpflow-*action/workflow names.Verification
BUNDLE_GEMFILE=/tmp/cpflow-pr-278/Gemfile bin/conductor-exec bundle exec ruby /tmp/cpflow-pr-278/bin/cpflow github-flow-readiness— no blocking readiness issues detected.bin/conductor-exec ruby -e 'require "yaml"; paths = Dir[".github/**/*.yml"] + Dir[".github/**/*.yaml"] + Dir[".controlplane/**/*.yml"]; paths.sort.each { |path| YAML.load_file(path, aliases: true); puts path }'bin/conductor-exec git diff --cached --checkbin/conductor-exec bundle exec rubocopNot Run / Blocked
bin/conductor-exec bundle exec rspeccould not complete locally because PostgreSQL was not running at/tmp/.s.PGSQL.5432; after the first Rails load failure, Coveralls setup also reported repeated coverage initialization errors.Note
Medium Risk
Replaces all Control Plane deployment automation (review apps, staging deploys, production promotion/rollback, cleanup) with a newly generated workflow set, so misconfiguration could block or mis-deploy environments. Changes also touch secret-handling paths (tokens/SSH key for Docker builds), requiring careful review of permissions and trust boundaries.
Overview
Switches Control Plane CI/CD from the hand-written GitHub Actions to the generated
cpflow-*composite actions and workflows, covering opt-in review apps, branch-gated staging deploys, manual staging→production promotion, and nightly stale review-app cleanup.The new flow adds shared actions for environment setup/version pinning, config validation, safer Docker builds (optional SSH key + extra args), release-phase detection, health polling, and stricter review-app deletion safeguards (prefix-based). Docs in
.controlplane/readme.mdand.controlplane/shakacode-team.mdare updated to match the new workflows, required repo variables/secrets, and operational guidance (slash-command exact matching, staging branch filter caveats).Reviewed by Cursor Bugbot for commit 79c6b97. Bugbot is set up for automated code reviews on this repo. Configure here.
Summary by CodeRabbit
Documentation
New Features
Refactor