Restore fork infra dropped by the #2391 main pickup (terraform, cockpit, sops, e2e workflow)#72
Restore fork infra dropped by the #2391 main pickup (terraform, cockpit, sops, e2e workflow)#72ConstanzeTU wants to merge 3 commits into
Conversation
The fork's terraform/ tree (auth0 + cloud_deps + cockpit credentials + scripts, 16 files) was deleted on main by commit 53d09cc, the local landing of upstream pixie-io#2391 'Create vizier and cloud cert-manager compatible secrets'. That upstream commit only touches cert handling; the terraform deletion was an unrelated regression bundled into the fork pickup. Content here is byte-identical to the pre-deletion tree at 610975b (parent of main tip). This belongs on main, not in PR 68.
Same regression as the terraform tree: commit 53d09cc (the fork landing of upstream pixie-io#2391) deleted these fork-only files from main, and they were re-added on PR 68's branch where they don't belong. Moved here for the main-restore PR. - cockpit/**: restored at the post-rename path (private/cockpit/ -> cockpit/) so it passes the upstream filename-linter on main; the workflow file is unchanged from main. - .sops.yaml, .github/workflows/e2e_log4shell_soc.yaml: byte-identical to PR 68's copies.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (11)
💤 Files with no reviewable changes (9)
📝 WalkthroughWalkthroughAdds a Terraform-managed Auth0 tenant and cloud_deps module (certs, secrets, kustomize batches), SOPS-encrypted credential storage via Azure Key Vault, Kubernetes cockpit manifests (ConfigMaps, Ingress, Envoy, auth-server, plugin-db-updater Job, Kustomization), and a GitHub Actions workflow for end-to-end log4shell SOC testing with pprof profiling. ChangesCockpit Cloud Environment Setup
E2E log4shell SOC GitHub Actions workflow
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 15
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/e2e_log4shell_soc.yaml:
- Around line 36-37: The checkout in the workflow currently leaves persisted
credentials available to later steps, which can expose the read token to
external scripts. Update the existing actions/checkout usage in the
e2e_log4shell_soc job to disable persisted credentials, keeping the checkout
step for the harness scripts but ensuring the token is not stored in local git
config. Refer to the checkout step using actions/checkout in this workflow and
apply the secure checkout setting there.
- Around line 14-24: The workflow_dispatch e2e job currently has no concurrency
control, so duplicate runs can collide on shared k3s state and overwrite
/tmp/evidence. Add a concurrency block to this workflow using a stable group key
for the host-scoped job (for example based on github.workflow plus the selected
inputs such as dx_image and soc_ref), and enable cancel-in-progress or otherwise
prevent overlapping runs in the workflow that defines workflow_dispatch,
dx_image, and soc_ref.
- Around line 39-43: The Install k3s step is using the unpinned get.k3s.io
installer, which makes the privileged e2e environment non-reproducible. Update
the k3s install logic in the workflow step named Install k3s to set a validated
INSTALL_K3S_VERSION before invoking the installer, keeping the rest of the
readiness check and kubectl verification unchanged.
- Around line 102-120: The “Profile dx in real life (pprof + metrics)” step is
not failure-safe when prior setup fails, so it can’t write artifacts or clearly
indicate it was skipped. In the workflow job, update this step to create
/tmp/evidence before any writes and to explicitly detect missing prerequisites
like the dx-daemon pod or a failed port-forward. Use the existing shell block in
the profiling step to emit a clear skip message and exit successfully when those
prerequisites are absent, while keeping the profiling and metric collection
logic intact when they are present.
- Around line 45-70: The deploy step is interpolating untrusted manual inputs
directly into shell commands, which can allow command injection. In the SOC
deployment job, stop embedding inputs.soc_ref and inputs.dx_image inside the run
script and instead pass them through validated environment variables or
precomputed safe outputs before use. Add strict allowlist-style validation for
the ref/image values in the step that performs git clone and kubectl set image,
and only execute the commands after the values have been verified safe.
In `@cockpit/auth_deployment.yaml`:
- Around line 14-19: Make the PL_AUTH0_CLIENT_ID secret lookup mandatory as
well, since it comes from the same cloud-auth0-secrets object as
PL_AUTH0_CLIENT_SECRET. Update the secretKeyRef for PL_AUTH0_CLIENT_ID in the
auth deployment manifest to remove the optional lookup, matching the behavior
already used for the client secret so missing or broken secrets fail at
deployment time.
In `@cockpit/domain_config.yaml`:
- Line 7: PL_DOMAIN_NAME is hard-coded to a specific public domain, creating a
second source of truth that can drift from the configured cloud_domain. Update
the domain config so PL_DOMAIN_NAME is derived from the same domain input used
by the cert/Auth0 layer rather than a literal, and keep the value wired through
the existing configuration path that populates this ConfigMap.
In `@cockpit/plugin_db_updater_job.yaml`:
- Around line 31-32: The updater Job still references the mutable
cloud-plugin-load_db:latest image without a kustomize images transform, so fix
the image reference in the plugin_db_updater_job manifest or add an images
rewrite in kustomization to pin it to an immutable tag or digest. Use the
updater container entry in plugin_db_updater_job and the image handling in
kustomization.yaml to ensure the deployed Job always pulls a stable,
reproducible image.
In `@cockpit/proxy_envoy.yaml`:
- Around line 43-45: The CORS origin allowlist in proxy_envoy.yaml is too narrow
and blocks the cockpit domain. Update the Envoy CORS configuration in the
relevant cors/allow_origin_string_match block to allow the actual cloud domain
used by the deployment, matching the cloud_domain default from variables.tf (via
the cockpit proxy config) instead of only test.austrianopencloudcommunity.org.
Ensure the suffix matches the real browser origin expected by the cockpit
frontend.
- Around line 9-96: The ConfigMap entry for envoy.yaml is using a folded scalar,
which will collapse the YAML structure and break Envoy’s config parsing. Change
the envoy.yaml value to a literal block so the full nested structure is
preserved exactly as written. Keep the fix focused on the envoy.yaml key in
proxy_envoy.yaml and ensure the rendered config remains valid YAML for Envoy to
load.
In `@terraform/kubernetes/auth0/auth0_main.tf`:
- Around line 11-13: Disable the Auth0 provider debug setting in the provider
block so committed Terraform config does not emit request/response details;
update the auth0 provider configuration in auth0_main.tf to keep debug off by
default and only enable it temporarily for local troubleshooting. Use the
provider "auth0" block as the target for this change.
In `@terraform/kubernetes/auth0/generated.tf`:
- Around line 467-476: The Auth0 setup is over-privileged because the
interactive login client and the Management API grant share the same client
identity. Update the Terraform model around the app represented by the
regular_web client and the auth0_client_grant resource so the login application
only keeps callback/authorization_code usage, and move the Management API scopes
to a separate least-privileged M2M client. Make sure the grant resource uses the
new machine-to-machine client_id and that the login client no longer has
tenant-admin scopes attached.
In `@terraform/kubernetes/cloud_deps/cert_manager.tf`:
- Around line 1-4: The lookup in locals.cert_config is failing too early with an
invalid-index error, so the intended validation never gets a chance to run.
Change the cert_details access to use a nullable/safe lookup first in
cert_manager.tf, then have the existing precondition validate that returned
value and surface the custom message. Keep the fix centered around
locals.cert_config and the related precondition block so the missing
cert_details entry is handled by validation instead of indexing.
In `@terraform/kubernetes/cloud_deps/provider.tf`:
- Around line 14-22: The kubeconfig settings in the kubernetes and kustomization
provider blocks are hard-coded to a specific home path and context, making this
module environment-specific. Update the provider configuration to take both the
kubeconfig path and context from inputs instead of embedding
"~/.kube/cockpit-config" and "default". Use existing Terraform input variables
or add new ones, and if you keep a home-directory default, resolve it with
pathexpand() in the provider setup.
In `@terraform/kubernetes/scripts/create_random_bytes.sh`:
- Around line 19-36: The random-bytes generation in create_random_bytes.sh can
hang or emit an empty JSON payload when chars_to_keep is empty or when an
upstream command in the /dev/urandom pipeline fails. Update the script to
validate inputs before the pipeline starts, reject empty or invalid values for
chars_to_keep and num_chars in usage(), and make the pipeline fail fast by
ensuring the script propagates failures from bytes generation instead of still
reaching jq. Use the existing usage() and bytes assignment as the main
touchpoints.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 26709ba6-70c5-4b82-98c9-a0300c8fc6be
⛔ Files ignored due to path filters (1)
terraform/kubernetes/cloud_deps/terraform.tfstateis excluded by!**/*.tfstate
📒 Files selected for processing (27)
.github/workflows/e2e_log4shell_soc.yaml.sops.yamlcockpit/artifact_config.yamlcockpit/auth_deployment.yamlcockpit/cloud_ingress.yamlcockpit/domain_config.yamlcockpit/kustomization.yamlcockpit/oauth_config.yamlcockpit/plugin_db_updater_job.yamlcockpit/proxy_envoy.yamlcockpit/script_bundles_config.yamlcockpit/servers_transport.yamlterraform/credentials/cockpit/auth0_config.yamlterraform/kubernetes/auth0/.terraform.lock.hclterraform/kubernetes/auth0/auth0_import.tfterraform/kubernetes/auth0/auth0_main.tfterraform/kubernetes/auth0/generated.tfterraform/kubernetes/auth0/terraform.tfvarsterraform/kubernetes/auth0/variables.tfterraform/kubernetes/cloud_deps/.terraform.lock.hclterraform/kubernetes/cloud_deps/base_cloud_deps.tfterraform/kubernetes/cloud_deps/cert_manager.tfterraform/kubernetes/cloud_deps/core_resource_deps.tfterraform/kubernetes/cloud_deps/env_cloud_deps.tfterraform/kubernetes/cloud_deps/provider.tfterraform/kubernetes/cloud_deps/variables.tfterraform/kubernetes/scripts/create_random_bytes.sh
| on: | ||
| workflow_dispatch: | ||
| inputs: | ||
| dx_image: | ||
| description: dx-daemon image to test (default = .image-tags pin) | ||
| required: false | ||
| default: "" | ||
| soc_ref: | ||
| description: k8sstormcenter/soc branch | ||
| required: false | ||
| default: "218-clickhouse-schema" |
There was a problem hiding this comment.
🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win
Add concurrency for this host-scoped e2e.
This job mutates k3s state and writes fixed /tmp/evidence; duplicate dispatches can collide or overwrite evidence.
Proposed fix
soc_ref:
description: k8sstormcenter/soc branch
required: false
default: "218-clickhouse-schema"
+concurrency:
+ group: e2e-log4shell-soc
+ cancel-in-progress: false
permissions:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| on: | |
| workflow_dispatch: | |
| inputs: | |
| dx_image: | |
| description: dx-daemon image to test (default = .image-tags pin) | |
| required: false | |
| default: "" | |
| soc_ref: | |
| description: k8sstormcenter/soc branch | |
| required: false | |
| default: "218-clickhouse-schema" | |
| on: | |
| workflow_dispatch: | |
| inputs: | |
| dx_image: | |
| description: dx-daemon image to test (default = .image-tags pin) | |
| required: false | |
| default: "" | |
| soc_ref: | |
| description: k8sstormcenter/soc branch | |
| required: false | |
| default: "218-clickhouse-schema" | |
| concurrency: | |
| group: e2e-log4shell-soc | |
| cancel-in-progress: false |
🧰 Tools
🪛 zizmor (1.26.1)
[warning] 14-24: insufficient job-level concurrency limits (concurrency-limits): workflow is missing concurrency setting
(concurrency-limits)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/e2e_log4shell_soc.yaml around lines 14 - 24, The
workflow_dispatch e2e job currently has no concurrency control, so duplicate
runs can collide on shared k3s state and overwrite /tmp/evidence. Add a
concurrency block to this workflow using a stable group key for the host-scoped
job (for example based on github.workflow plus the selected inputs such as
dx_image and soc_ref), and enable cancel-in-progress or otherwise prevent
overlapping runs in the workflow that defines workflow_dispatch, dx_image, and
soc_ref.
Source: Linters/SAST tools
| - name: Checkout pixie (harness scripts) | ||
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
Disable persisted checkout credentials.
The job later runs external repo scripts; leaving the checkout token in local git config unnecessarily exposes the read token to job code.
Proposed fix
- name: Checkout pixie (harness scripts)
uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
+ with:
+ persist-credentials: false📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Checkout pixie (harness scripts) | |
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 | |
| - name: Checkout pixie (harness scripts) | |
| uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 | |
| with: | |
| persist-credentials: false |
🧰 Tools
🪛 zizmor (1.26.1)
[warning] 36-37: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false
(artipacked)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/e2e_log4shell_soc.yaml around lines 36 - 37, The checkout
in the workflow currently leaves persisted credentials available to later steps,
which can expose the read token to external scripts. Update the existing
actions/checkout usage in the e2e_log4shell_soc job to disable persisted
credentials, keeping the checkout step for the harness scripts but ensuring the
token is not stored in local git config. Refer to the checkout step using
actions/checkout in this workflow and apply the secure checkout setting there.
Source: Linters/SAST tools
| - name: Install k3s | ||
| run: | | ||
| curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 | ||
| for i in $(seq 1 60); do kubectl get nodes --no-headers 2>/dev/null | grep -q ' Ready' && break; sleep 5; done | ||
| kubectl get nodes |
There was a problem hiding this comment.
🔒 Security & Privacy | 🔵 Trivial | ⚡ Quick win
Pin the k3s version before running the installer.
get.k3s.io installs a moving target; set a validated INSTALL_K3S_VERSION so this privileged e2e is reproducible.
Proposed fix pattern
- name: Install k3s
run: |
+ export INSTALL_K3S_VERSION="${INSTALL_K3S_VERSION:?pin the validated k3s version}"
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/e2e_log4shell_soc.yaml around lines 39 - 43, The Install
k3s step is using the unpinned get.k3s.io installer, which makes the privileged
e2e environment non-reproducible. Update the k3s install logic in the workflow
step named Install k3s to set a validated INSTALL_K3S_VERSION before invoking
the installer, keeping the rest of the readiness check and kubectl verification
unchanged.
| - name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain) | ||
| env: | ||
| PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org | ||
| PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }} | ||
| PX_API_KEY: ${{ secrets.PX_API_KEY }} | ||
| TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }} | ||
| DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }} # private entlein/dx image pull | ||
| CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }} | ||
| CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }} | ||
| CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }} | ||
| run: | | ||
| set -euo pipefail | ||
| sudo apt-get update -qq && sudo apt-get install -y python3-yaml | ||
| git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc | ||
| cd soc | ||
| make pixie # vizier + AE | ||
| make kubescape || true # node-agent (netStreaming) | ||
| bash tree/clickhouse-lab/install.sh # forensic_db | ||
| make log4j # vulnerable backend + attacker + dx + SBoBs (managed-by=User) | ||
| if [ -n "${{ inputs.dx_image }}" ]; then | ||
| kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true | ||
| fi | ||
| # optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH | ||
| # are defaults in main, set here too in case the kit's manifest predates them) | ||
| kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect | ||
| kubectl -n honey rollout status ds/dx-daemon --timeout=120s |
There was a problem hiding this comment.
🔒 Security & Privacy | 🔴 Critical | ⚡ Quick win
Move dispatch inputs out of shell templates and validate them.
${{ inputs.soc_ref }} and ${{ inputs.dx_image }} are expanded before shell parsing. A crafted manual input can break out of quotes and run commands with the secrets in this step.
Proposed fix
env:
+ SOC_REF: ${{ inputs.soc_ref }}
+ DX_IMAGE: ${{ inputs.dx_image }}
PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org
PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }}
@@
set -euo pipefail
sudo apt-get update -qq && sudo apt-get install -y python3-yaml
- git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc
+ if ! printf '%s' "$SOC_REF" | grep -Eq '^[A-Za-z0-9._/-]+$'; then
+ echo "Invalid soc_ref"; exit 1
+ fi
+ git clone --depth 1 --branch "$SOC_REF" --single-branch https://github.com/k8sstormcenter/soc soc
@@
- if [ -n "${{ inputs.dx_image }}" ]; then
- kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true
+ if [ -n "$DX_IMAGE" ]; then
+ if ! printf '%s' "$DX_IMAGE" | grep -Eq '^[A-Za-z0-9._/:@+-]+$'; then
+ echo "Invalid dx_image"; exit 1
+ fi
+ kubectl -n honey set image ds/dx-daemon "dx-daemon=$DX_IMAGE"
fi📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain) | |
| env: | |
| PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org | |
| PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }} | |
| PX_API_KEY: ${{ secrets.PX_API_KEY }} | |
| TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }} | |
| DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }} # private entlein/dx image pull | |
| CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }} | |
| CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }} | |
| CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }} | |
| run: | | |
| set -euo pipefail | |
| sudo apt-get update -qq && sudo apt-get install -y python3-yaml | |
| git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc | |
| cd soc | |
| make pixie # vizier + AE | |
| make kubescape || true # node-agent (netStreaming) | |
| bash tree/clickhouse-lab/install.sh # forensic_db | |
| make log4j # vulnerable backend + attacker + dx + SBoBs (managed-by=User) | |
| if [ -n "${{ inputs.dx_image }}" ]; then | |
| kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true | |
| fi | |
| # optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH | |
| # are defaults in main, set here too in case the kit's manifest predates them) | |
| kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect | |
| kubectl -n honey rollout status ds/dx-daemon --timeout=120s | |
| - name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain) | |
| env: | |
| SOC_REF: ${{ inputs.soc_ref }} | |
| DX_IMAGE: ${{ inputs.dx_image }} | |
| PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org | |
| PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }} | |
| PX_API_KEY: ${{ secrets.PX_API_KEY }} | |
| TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }} | |
| DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }} # private entlein/dx image pull | |
| CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }} | |
| CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }} | |
| CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }} | |
| run: | | |
| set -euo pipefail | |
| sudo apt-get update -qq && sudo apt-get install -y python3-yaml | |
| if ! printf '%s' "$SOC_REF" | grep -Eq '^[A-Za-z0-9._/-]+$'; then | |
| echo "Invalid soc_ref"; exit 1 | |
| fi | |
| git clone --depth 1 --branch "$SOC_REF" --single-branch https://github.com/k8sstormcenter/soc soc | |
| cd soc | |
| make pixie # vizier + AE | |
| make kubescape || true # node-agent (netStreaming) | |
| bash tree/clickhouse-lab/install.sh # forensic_db | |
| make log4j # vulnerable backend + attacker + dx + SBoBs (managed-by=User) | |
| if [ -n "$DX_IMAGE" ]; then | |
| if ! printf '%s' "$DX_IMAGE" | grep -Eq '^[A-Za-z0-9._/:@+-]+$'; then | |
| echo "Invalid dx_image"; exit 1 | |
| fi | |
| kubectl -n honey set image ds/dx-daemon "dx-daemon=$DX_IMAGE" | |
| fi | |
| # optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH | |
| # are defaults in main, set here too in case the kit's manifest predates them) | |
| kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect | |
| kubectl -n honey rollout status ds/dx-daemon --timeout=120s |
🧰 Tools
🪛 zizmor (1.26.1)
[error] 58-58: code injection via template expansion (template-injection): may expand into attacker-controllable code
(template-injection)
[error] 64-64: code injection via template expansion (template-injection): may expand into attacker-controllable code
(template-injection)
[error] 65-65: code injection via template expansion (template-injection): may expand into attacker-controllable code
(template-injection)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/e2e_log4shell_soc.yaml around lines 45 - 70, The deploy
step is interpolating untrusted manual inputs directly into shell commands,
which can allow command injection. In the SOC deployment job, stop embedding
inputs.soc_ref and inputs.dx_image inside the run script and instead pass them
through validated environment variables or precomputed safe outputs before use.
Add strict allowlist-style validation for the ref/image values in the step that
performs git clone and kubectl set image, and only execute the commands after
the values have been verified safe.
Source: Linters/SAST tools
| - name: Profile dx in real life (pprof + metrics) | ||
| if: always() | ||
| run: | | ||
| set -uo pipefail | ||
| POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}') | ||
| kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5 | ||
| # 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060 | ||
| ( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) & | ||
| curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \ | ||
| "http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true | ||
| curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true | ||
| curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true | ||
| go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true | ||
| kill $PF 2>/dev/null || true | ||
| echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt | ||
| echo "=== verdict latency ===" | ||
| grep -E \ | ||
| "dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \ | ||
| /tmp/evidence/dx_metrics.txt || true |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win
Make the always-on profiling step failure-safe.
If earlier steps fail before /tmp/evidence or the dx pod exists, this step silently loses profile artifacts. Create the directory and record an explicit skip.
Proposed fix
run: |
set -uo pipefail
- POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}')
+ mkdir -p /tmp/evidence
+ POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)
+ if [ -z "$POD" ]; then
+ echo "dx-daemon pod not found; profiling skipped" | tee /tmp/evidence/profile_skipped.txt
+ exit 0
+ fi
kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5
+ trap 'kill "$PF" 2>/dev/null || true' EXIT📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| - name: Profile dx in real life (pprof + metrics) | |
| if: always() | |
| run: | | |
| set -uo pipefail | |
| POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}') | |
| kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5 | |
| # 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060 | |
| ( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) & | |
| curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \ | |
| "http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true | |
| curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true | |
| curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true | |
| go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true | |
| kill $PF 2>/dev/null || true | |
| echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt | |
| echo "=== verdict latency ===" | |
| grep -E \ | |
| "dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \ | |
| /tmp/evidence/dx_metrics.txt || true | |
| - name: Profile dx in real life (pprof + metrics) | |
| if: always() | |
| run: | | |
| set -uo pipefail | |
| mkdir -p /tmp/evidence | |
| POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true) | |
| if [ -z "$POD" ]; then | |
| echo "dx-daemon pod not found; profiling skipped" | tee /tmp/evidence/profile_skipped.txt | |
| exit 0 | |
| fi | |
| kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5 | |
| trap 'kill "$PF" 2>/dev/null || true' EXIT | |
| # 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060 | |
| ( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) & | |
| curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \ | |
| "http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true | |
| curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true | |
| curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true | |
| go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true | |
| kill $PF 2>/dev/null || true | |
| echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt | |
| echo "=== verdict latency ===" | |
| grep -E \ | |
| "dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \ | |
| /tmp/evidence/dx_metrics.txt || true |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/workflows/e2e_log4shell_soc.yaml around lines 102 - 120, The
“Profile dx in real life (pprof + metrics)” step is not failure-safe when prior
setup fails, so it can’t write artifacts or clearly indicate it was skipped. In
the workflow job, update this step to create /tmp/evidence before any writes and
to explicitly detect missing prerequisites like the dx-daemon pod or a failed
port-forward. Use the existing shell block in the profiling step to emit a clear
skip message and exit successfully when those prerequisites are absent, while
keeping the profiling and metric collection logic intact when they are present.
| provider "auth0" { | ||
| debug = true | ||
| } |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '\n## Files mentioning auth0 provider or debug\n'
git ls-files | rg '(^|/)auth0.*\.tf$|terraform/kubernetes/auth0/|provider "auth0"|debug\s*=' -n || true
printf '\n## auth0_main.tf with line numbers\n'
cat -n terraform/kubernetes/auth0/auth0_main.tf
printf '\n## nearby terraform files in auth0 module\n'
find terraform/kubernetes/auth0 -maxdepth 1 -type f | sort | xargs -I{} sh -c 'echo "--- {}"; cat -n "{}" | sed -n "1,220p"'Repository: k8sstormcenter/pixie
Length of output: 18992
🌐 Web query:
Auth0 Terraform provider debug = true logs sensitive data provider documentation
💡 Result:
In the Auth0 Terraform provider, setting debug = true in the provider configuration enables HTTP request and response logging when the TF_LOG environment variable is also set to DEBUG [1][2][3]. Security implications of enabling this debug mode: 1. Sensitive Data Exposure: Enabling debug logging can cause sensitive information, such as authentication tokens, client secrets, or other PII, to be written to standard output or log files in plain text [1][2][4]. Because these logs capture the full HTTP traffic between the provider and the Auth0 Management API, they may bypass standard Terraform CLI redaction mechanisms [4][5]. 2. Best Practices: You should avoid enabling debug mode in non-development environments [6]. If you must use debug logging, ensure that the output is directed to a secure, restricted-access location and that logs are purged promptly [5]. 3. Alternative Data Protection: For managing sensitive data within Auth0 resources, the provider offers specific mechanisms like write-only attributes (e.g., for passwords or API keys) and PII configuration options for resources like log streams to help handle sensitive data more securely than raw logging [7][8][9]. To debug issues without globally exposing sensitive data via the provider's debug flag, consider using more granular Terraform logging options (e.g., TF_LOG_PROVIDER_AUTH0=DEBUG) or platform-specific monitoring tools that do not persist sensitive API request bodies in cleartext [5]. Always treat Terraform log output as potentially containing sensitive information when debug verbosity is enabled [4].
Citations:
- 1: https://registry.terraform.io/providers/auth0/auth0/1.42.0/docs
- 2: https://registry.terraform.io/providers/auth0/auth0/latest/docs
- 3: https://library.tf/providers/auth0/auth0/latest
- 4: https://docs.hashicorp.com/terraform/plugin/log/filtering
- 5: https://developer.hashicorp.com/terraform/plugin/log/managing
- 6: https://docs.hashicorp.com/terraform/plugin/debugging
- 7: https://registry.terraform.io/providers/auth0/auth0/latest/docs/resources/log_stream
- 8: auth0/terraform-provider-auth0@9563820
- 9: Support for Ephemeral Values in Terraform Auth0 Provider auth0/terraform-provider-auth0#1218
Disable Auth0 provider debug logging. debug = true can emit Auth0 request/response details, including secrets, into Terraform logs when debug logging is enabled. Keep it off in committed config and enable it only in local troubleshooting.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@terraform/kubernetes/auth0/auth0_main.tf` around lines 11 - 13, Disable the
Auth0 provider debug setting in the provider block so committed Terraform config
does not emit request/response details; update the auth0 provider configuration
in auth0_main.tf to keep debug off by default and only enable it temporarily for
local troubleshooting. Use the provider "auth0" block as the target for this
change.
| # __generated__ by Terraform from "cgr_lgvjTwTNm53mojKq" | ||
| resource "auth0_client_grant" "akm0sumjg95tcrlp390pxzjozkkyndt_https_k8sstormcenter_eu_auth0_com_api_v2" { | ||
| allow_all_scopes = false | ||
| allow_any_organization = false | ||
| audience = "https://k8sstormcenter.eu.auth0.com/api/v2/" | ||
| authorization_details_types = [] | ||
| client_id = "5akM0SumJG95tcRlP390pXzJoZkkYNDT" | ||
| organization_usage = null | ||
| scopes = ["read:users", "update:users", "delete:users", "create:users", "read:users_app_metadata", "update:users_app_metadata", "delete:users_app_metadata", "create:users_app_metadata", "read:user_custom_blocks", "create:user_custom_blocks", "delete:user_custom_blocks", "create:user_tickets", "read:logs_users", "read:user_idp_tokens", "create:user_attribute_profiles", "read:user_attribute_profiles", "update:user_attribute_profiles", "delete:user_attribute_profiles", "read:user_effective_permissions", "read:user_effective_roles", "read:user_role_source_groups", "read:user_permission_source_roles"] | ||
| subject_type = "client" |
There was a problem hiding this comment.
🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file='terraform/kubernetes/auth0/generated.tf'
echo '=== Around lines 450-490 ==='
sed -n '450,490p' "$file" | cat -n
echo
echo '=== Around lines 1688-1765 ==='
sed -n '1688,1765p' "$file" | cat -n
echo
echo '=== Search for the client id 5akM0SumJG95tcRlP390pXzJoZkkYNDT ==='
rg -n '5akM0SumJG95tcRlP390pXzJoZkkYNDT|app_type =|refresh_token|authorization_code|client_grant' "$file"Repository: k8sstormcenter/pixie
Length of output: 7746
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '580,620p' terraform/kubernetes/auth0/generated.tf | cat -nRepository: k8sstormcenter/pixie
Length of output: 1761
Split the interactive login client from the Management API grant. The same client_id is used for callbacks/authorization_code on the app at 1698-1758 and is also granted Auth0 Management API scopes here, so a leaked login secret would expose tenant-admin access too. Move the Management API scopes to a separate least-privileged M2M client and keep the login app modeled as regular_web.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@terraform/kubernetes/auth0/generated.tf` around lines 467 - 476, The Auth0
setup is over-privileged because the interactive login client and the Management
API grant share the same client identity. Update the Terraform model around the
app represented by the regular_web client and the auth0_client_grant resource so
the login application only keeps callback/authorization_code usage, and move the
Management API scopes to a separate least-privileged M2M client. Make sure the
grant resource uses the new machine-to-machine client_id and that the login
client no longer has tenant-admin scopes attached.
| locals { | ||
| cert_subdomain = var.cloud_domain | ||
| cert_config = var.cert_details[var.cloud_domain] | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
The custom validation never catches a missing cert_details entry.
local.cert_config = var.cert_details[var.cloud_domain] can throw Terraform's invalid-index error before the precondition runs, so operators won't get the intended message. Make the lookup nullable first, then validate that result.
Suggested fix
locals {
cert_subdomain = var.cloud_domain
- cert_config = var.cert_details[var.cloud_domain]
+ cert_config = lookup(var.cert_details, var.cloud_domain, null)
}
@@
lifecycle {
precondition {
- condition = contains(keys(var.cert_details), var.cloud_domain)
+ condition = local.cert_config != null
error_message = "cloud_domain '${var.cloud_domain}' must have a corresponding entry in cert_details."
}
}Also applies to: 48-53
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@terraform/kubernetes/cloud_deps/cert_manager.tf` around lines 1 - 4, The
lookup in locals.cert_config is failing too early with an invalid-index error,
so the intended validation never gets a chance to run. Change the cert_details
access to use a nullable/safe lookup first in cert_manager.tf, then have the
existing precondition validate that returned value and surface the custom
message. Keep the fix centered around locals.cert_config and the related
precondition block so the missing cert_details entry is handled by validation
instead of indexing.
| provider "kubernetes" { | ||
| config_path = "~/.kube/cockpit-config" | ||
| config_context = "default" | ||
| } | ||
|
|
||
| provider "kustomization" { | ||
| context = "default" | ||
| kubeconfig_path = "~/.kube/cockpit-config" | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf '--- file list ---\n'
git ls-files terraform/kubernetes/cloud_deps
printf '\n--- provider.tf ---\n'
cat -n terraform/kubernetes/cloud_deps/provider.tf
printf '\n--- related terraform files ---\n'
fd -a -t f 'variables.tf|*.tf' terraform/kubernetes/cloud_deps 2>/dev/null | sed 's#^\./##' | sort
printf '\n--- search for kubeconfig/context inputs/usages ---\n'
rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' \
'kubeconfig_path|kubeconfig_context|config_context|config_path|context\s*=' \
terraform/kubernetes/cloud_depsRepository: k8sstormcenter/pixie
Length of output: 249
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf -- '--- file list ---\n'
git ls-files terraform/kubernetes/cloud_deps
printf -- '\n--- provider.tf ---\n'
cat -n terraform/kubernetes/cloud_deps/provider.tf
printf -- '\n--- related terraform files ---\n'
fd -a -t f 'variables.tf|*.tf' terraform/kubernetes/cloud_deps 2>/dev/null | sed 's#^\./##' | sort
printf -- '\n--- search for kubeconfig/context inputs/usages ---\n'
rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' \
'kubeconfig_path|kubeconfig_context|config_context|config_path|context\s*=' \
terraform/kubernetes/cloud_depsRepository: k8sstormcenter/pixie
Length of output: 1194
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf -- '--- variables.tf ---\n'
cat -n terraform/kubernetes/cloud_deps/variables.tf
printf -- '\n--- other terraform files in module ---\n'
for f in terraform/kubernetes/cloud_deps/*.tf; do
[ "$(basename "$f")" = "provider.tf" ] && continue
printf -- '\n### %s ###\n' "$f"
cat -n "$f"
done
printf -- '\n--- search for cloud_deps references ---\n'
rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' 'cloud_deps|cockpit-config|kubeconfig|kustomization' .Repository: k8sstormcenter/pixie
Length of output: 48401
🌐 Web query:
Terraform Kubernetes provider config_path tilde expansion kubeconfig path docs
💡 Result:
The Terraform Kubernetes provider documentation explicitly uses tilde () expansion for the /.kube/config") } Key considerations for config_path argument in its examples [1][2][3]. While this syntax is commonly used in Terraform provider configuration blocks, it is important to note that the provider expects an absolute path or a path relative to the working directory [4]. If you encounter issues where tilde expansion is not working as expected—particularly in CI/CD environments or non-standard file systems—you can use the built-in Terraform function pathexpand() to resolve the tilde to an absolute path before passing it to the provider [4]: provider "kubernetes" { config_path = pathexpand("config_path and config_paths: * The provider allows you to specify the kubeconfig location via the config_path attribute or the KUBE_CONFIG_PATH environment variable [1][3]. * The provider supports multiple paths using the config_paths attribute or the KUBE_CONFIG_PATHS environment variable [1][3]. * Unlike kubectl, the Terraform Kubernetes provider does not automatically read the standard KUBECONFIG environment variable by default [5][6]. You must explicitly configure the path in your Terraform block or provide it via the provider-specific environment variables [1][3].
Citations:
- 1: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs
- 2: https://registry.terraform.io/providers/hashicorp/kubernetes/2.37.1/docs/guides/getting-started.html
- 3: https://registry.terraform.io/providers/hashicorp/kubernetes/3.0.1/docs
- 4: https://stackoverflow.com/questions/75484801/terraform-kubernetes-provider-fails-on-github-action-with-following-config-pat
- 5: Fall back to default kubernetes client config to support KUBECONFIG hashicorp/terraform-provider-kubernetes#1976
- 6: Source config_path provider argument from $KUBECONFIG env var hashicorp/terraform-provider-kubernetes#1175
🌐 Web query:
kbst kustomization provider kubeconfig_path context docs tilde expansion
💡 Result:
The kbst/kustomization Terraform provider does not automatically expand the tilde () character in the /.kube/config") } Alternatively, if you are setting this via an environment variable (e.g., kubeconfig_path attribute [1][2]. While some Terraform providers or core features may accept tilde-prefixed paths, it is a known behavior in the Terraform ecosystem that many providers do not perform shell-style tilde expansion [3]. To use a path in your home directory with the kubeconfig_path attribute, you should use the Terraform pathexpand function [4][3]. Recommended approach: Use the pathexpand function within a local variable or directly in the provider configuration to ensure the path is correctly resolved to an absolute path before it is passed to the provider [3]. Example usage: provider "kustomization" { kubeconfig_path = pathexpand("KUBECONFIG_PATH), the shell environment may handle the expansion before Terraform receives the string, depending on your shell configuration and how the variable is exported [5][1]. However, explicitly using pathexpand in your HCL code is the most reliable way to ensure portability across different systems [3].
Citations:
- 1: https://registry.terraform.io/providers/kbst/kustomization/latest/docs
- 2: https://registry.terraform.io/providers/kbst/kustomization/0.9.5/docs
- 3: https://registry.terraform.io/providers/tehcyx/kind/latest/docs/resources/cluster
- 4: core: Add pathexpand interpolation function hashicorp/terraform#11277
- 5: https://github.com/kbst/terraform-provider-kustomization
Parameterize the kubeconfig settings
terraform/kubernetes/cloud_deps/provider.tf should not hard-code ~/.kube/cockpit-config and default. ~ is not a reliable path expansion here, and the fixed context makes this module environment-specific. Expose both as inputs, using pathexpand() if you want a home-directory default.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@terraform/kubernetes/cloud_deps/provider.tf` around lines 14 - 22, The
kubeconfig settings in the kubernetes and kustomization provider blocks are
hard-coded to a specific home path and context, making this module
environment-specific. Update the provider configuration to take both the
kubeconfig path and context from inputs instead of embedding
"~/.kube/cockpit-config" and "default". Use existing Terraform input variables
or add new ones, and if you keep a home-directory default, resolve it with
pathexpand() in the provider setup.
| set -e | ||
|
|
||
| usage() { | ||
| if [ "$#" -ne 2 ]; then | ||
| echo "Illegal number of parameters" | ||
| echo "Usage: $0 <characters_to_keep> <number_of_characters>" | ||
| exit 1 | ||
| fi | ||
| } | ||
|
|
||
| usage "$@" | ||
|
|
||
| chars_to_keep=$1 | ||
| num_chars=$2 | ||
|
|
||
| bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1) | ||
|
|
||
| jq -n --arg output "$bytes" '{"output":$output}' |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Fail fast before entering the /dev/urandom pipeline.
An empty chars_to_keep makes tr -dc reject every byte, so head -n 1 never returns; and without pipefail, an upstream failure can still emit {"output":""}. That can hang terraform apply or generate empty secrets.
💡 Proposed fix
- set -e
+ set -euo pipefail
...
usage "$@"
chars_to_keep=$1
num_chars=$2
+
+if [ -z "$chars_to_keep" ] || ! [[ "$num_chars" =~ ^[1-9][0-9]*$ ]]; then
+ echo "Usage: $0 <characters_to_keep> <number_of_characters>" >&2
+ exit 1
+fi
-bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1)
+bytes=$(LC_ALL=C tr -dc "${chars_to_keep}" < /dev/urandom | fold -w "$num_chars" | head -n 1)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| set -e | |
| usage() { | |
| if [ "$#" -ne 2 ]; then | |
| echo "Illegal number of parameters" | |
| echo "Usage: $0 <characters_to_keep> <number_of_characters>" | |
| exit 1 | |
| fi | |
| } | |
| usage "$@" | |
| chars_to_keep=$1 | |
| num_chars=$2 | |
| bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1) | |
| jq -n --arg output "$bytes" '{"output":$output}' | |
| set -euo pipefail | |
| usage() { | |
| if [ "$#" -ne 2 ]; then | |
| echo "Illegal number of parameters" | |
| echo "Usage: $0 <characters_to_keep> <number_of_characters>" | |
| exit 1 | |
| fi | |
| } | |
| usage "$@" | |
| chars_to_keep=$1 | |
| num_chars=$2 | |
| if [ -z "$chars_to_keep" ] || ! [[ "$num_chars" =~ ^[1-9][0-9]*$ ]]; then | |
| echo "Usage: $0 <characters_to_keep> <number_of_characters>" >&2 | |
| exit 1 | |
| fi | |
| bytes=$(LC_ALL=C tr -dc "${chars_to_keep}" < /dev/urandom | fold -w "$num_chars" | head -n 1) | |
| jq -n --arg output "$bytes" '{"output":$output}' |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@terraform/kubernetes/scripts/create_random_bytes.sh` around lines 19 - 36,
The random-bytes generation in create_random_bytes.sh can hang or emit an empty
JSON payload when chars_to_keep is empty or when an upstream command in the
/dev/urandom pipeline fails. Update the script to validate inputs before the
pipeline starts, reject empty or invalid values for chars_to_keep and num_chars
in usage(), and make the pipeline fail fast by ensuring the script propagates
failures from bytes generation instead of still reaching jq. Use the existing
usage() and bytes assignment as the main touchpoints.
…name-linter Reverts the private/cockpit/ -> cockpit/ rename so this restore matches the original main tree exactly: all 10 cockpit files (incl. kustomization.yaml with its original ../../k8s/cloud relative paths) are now byte-identical to the pre-deletion state at 610975b. Upstream main carries filename_linter.yaml (added by pixie-io#2391), which fails any PR touching a 'private' path. Adding a single '!private/cockpit/**' negative glob keeps that guard for everything else while letting this known fork-cloud config through. This is the only deviation in the PR from a pure byte-for-byte restore.
entlein
left a comment
There was a problem hiding this comment.
restoring original commits that were thrown out of main
What
Restores fork-specific infrastructure that was deleted from
mainand should never have left it, at its original paths, byte-identical to the pre-deletion tree (610975b27).terraform/**(auth0 + cloud_deps + cockpit credentials + scripts)private/cockpit/**(fork cloud kustomize bundle).sops.yaml.github/workflows/e2e_log4shell_soc.yaml.github/workflows/filename_linter.yamlWhy
maintip53d09cc8dis the fork landing of upstream pixie-io#2391 "Create vizier and cloud cert-manager compatible secrets". Upstream pixie-io#2391 only touches certificate handling, but as it landed on this fork it also deleted all of the above fork-only files — an unrelated regression. They were caught while preparing PR 68, re-added there to keep that branch deployable, then moved here (and stripped from PR 68) for a dedicated restore intomain.The one non-restore change
53d09cc8dalso brought upstream'sfilename_linter.yamlontomain, which fails any PR touching a path containingprivate. To keepprivate/cockpit/**at its original path while staying green, a single negative glob is added:This preserves the "no private/* leaks" guard for everything else. It is the only deviation in this PR from a pure byte-for-byte restore.
Not included
skaffold/skaffold_cloud.yaml— an upstream file with fork edits, not a clean fork-only add; left for a separate decision.