Restore fork infra dropped by the #2391 main pickup (terraform, cockpit, sops, e2e workflow) by ConstanzeTU · Pull Request #72 · k8sstormcenter/pixie

ConstanzeTU · 2026-06-28T08:58:34Z

What

Restores fork-specific infrastructure that was deleted from main and should never have left it, at its original paths, byte-identical to the pre-deletion tree (610975b27).

Area	Files	Vs. original main
`terraform/**` (auth0 + cloud_deps + cockpit credentials + scripts)	16	byte-identical
`private/cockpit/**` (fork cloud kustomize bundle)	10	byte-identical
`.sops.yaml`	1	byte-identical
`.github/workflows/e2e_log4shell_soc.yaml`	1	byte-identical
`.github/workflows/filename_linter.yaml`	1	+1 line carve-out (see below)

Why

main tip 53d09cc8d is the fork landing of upstream pixie-io#2391 "Create vizier and cloud cert-manager compatible secrets". Upstream pixie-io#2391 only touches certificate handling, but as it landed on this fork it also deleted all of the above fork-only files — an unrelated regression. They were caught while preparing PR 68, re-added there to keep that branch deployable, then moved here (and stripped from PR 68) for a dedicated restore into main.

The one non-restore change

53d09cc8d also brought upstream's filename_linter.yaml onto main, which fails any PR touching a path containing private. To keep private/cockpit/** at its original path while staying green, a single negative glob is added:

- '!private/cockpit/**'

This preserves the "no private/* leaks" guard for everything else. It is the only deviation in this PR from a pure byte-for-byte restore.

Not included

skaffold/skaffold_cloud.yaml — an upstream file with fork edits, not a clean fork-only add; left for a separate decision.

The fork's terraform/ tree (auth0 + cloud_deps + cockpit credentials + scripts, 16 files) was deleted on main by commit 53d09cc, the local landing of upstream pixie-io#2391 'Create vizier and cloud cert-manager compatible secrets'. That upstream commit only touches cert handling; the terraform deletion was an unrelated regression bundled into the fork pickup. Content here is byte-identical to the pre-deletion tree at 610975b (parent of main tip). This belongs on main, not in PR 68.

Same regression as the terraform tree: commit 53d09cc (the fork landing of upstream pixie-io#2391) deleted these fork-only files from main, and they were re-added on PR 68's branch where they don't belong. Moved here for the main-restore PR. - cockpit/**: restored at the post-rename path (private/cockpit/ -> cockpit/) so it passes the upstream filename-linter on main; the workflow file is unchanged from main. - .sops.yaml, .github/workflows/e2e_log4shell_soc.yaml: byte-identical to PR 68's copies.

coderabbitai · 2026-06-28T08:58:52Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 27f8ccc7-8f1b-4941-bae7-1c05d1b11530

📥 Commits

Reviewing files that changed from the base of the PR and between 6737704 and 1ca1624.

📒 Files selected for processing (11)

.github/workflows/filename_linter.yaml
private/cockpit/artifact_config.yaml
private/cockpit/auth_deployment.yaml
private/cockpit/cloud_ingress.yaml
private/cockpit/domain_config.yaml
private/cockpit/kustomization.yaml
private/cockpit/oauth_config.yaml
private/cockpit/plugin_db_updater_job.yaml
private/cockpit/proxy_envoy.yaml
private/cockpit/script_bundles_config.yaml
private/cockpit/servers_transport.yaml

💤 Files with no reviewable changes (9)

private/cockpit/domain_config.yaml
private/cockpit/oauth_config.yaml
private/cockpit/auth_deployment.yaml
private/cockpit/artifact_config.yaml
private/cockpit/servers_transport.yaml
private/cockpit/script_bundles_config.yaml
private/cockpit/proxy_envoy.yaml
private/cockpit/plugin_db_updater_job.yaml
private/cockpit/cloud_ingress.yaml

📝 Walkthrough

Walkthrough

Adds a Terraform-managed Auth0 tenant and cloud_deps module (certs, secrets, kustomize batches), SOPS-encrypted credential storage via Azure Key Vault, Kubernetes cockpit manifests (ConfigMaps, Ingress, Envoy, auth-server, plugin-db-updater Job, Kustomization), and a GitHub Actions workflow for end-to-end log4shell SOC testing with pprof profiling.

Changes

Cockpit Cloud Environment Setup

Layer / File(s)	Summary
SOPS config and encrypted Auth0 secret `.sops.yaml`, `terraform/credentials/cockpit/auth0_config.yaml`	Adds SOPS `creation_rules` mapping `terraform/credentials/cockpit` to an Azure Key Vault key, and adds the SOPS-encrypted Kubernetes Secret holding Auth0 client-id and client-secret.
Auth0 Terraform provider, imports, and generated resources `terraform/kubernetes/auth0/auth0_main.tf`, `terraform/kubernetes/auth0/variables.tf`, `terraform/kubernetes/auth0/terraform.tfvars`, `terraform/kubernetes/auth0/.terraform.lock.hcl`, `terraform/kubernetes/auth0/auth0_import.tf`, `terraform/kubernetes/auth0/generated.tf`	Adds provider config (debug mode, version constraint), `cloud_domain` variable/tfvars, lock file, auto-generated import blocks for existing Auth0 entities, and the full `generated.tf` covering connections, tenant, client, resource server scopes (~1000 scopes), branding, and attack protection.
cloud_deps Terraform: secrets, certs, and kustomize batches `terraform/kubernetes/cloud_deps/provider.tf`, `terraform/kubernetes/cloud_deps/.terraform.lock.hcl`, `terraform/kubernetes/cloud_deps/variables.tf`, `terraform/kubernetes/scripts/create_random_bytes.sh`, `terraform/kubernetes/cloud_deps/core_resource_deps.tf`, `terraform/kubernetes/cloud_deps/cert_manager.tf`, `terraform/kubernetes/cloud_deps/base_cloud_deps.tf`, `terraform/kubernetes/cloud_deps/env_cloud_deps.tf`	Introduces the cloud_deps module: kubeconfig-based provider wiring, input variables, a `/dev/urandom`-backed random-bytes helper script, core Kubernetes secrets (JWT/session/DB keys and Auth0 credentials from remote state), cert-manager chain (self-signed CA → internal issuer → server/client certs; Let's Encrypt → proxy cert), and ordered three-priority kustomize batch resources for Elastic operator and env dependencies.
Cockpit Kubernetes manifests and Kustomization `private/cockpit/...`, `.github/workflows/filename_linter.yaml`	Adds ConfigMaps for domain, OAuth, artifacts, script bundles, and Envoy proxy config; auth-server Deployment sourcing Auth0 credentials from secrets; cloud Ingress with TLS and external-dns annotations; Traefik ServersTransport; plugin-db-updater Job with PostgreSQL init container and projected TLS volumes; Kustomization overlay composing all resources. Also exempts `private/cockpit/**` from the filename linter.

E2E log4shell SOC GitHub Actions workflow

Layer / File(s)	Summary
E2E workflow: SOC stack deploy, harness execution, profiling, and artifact upload `.github/workflows/e2e_log4shell_soc.yaml`	Defines the `e2e-log4shell-soc` manual-dispatch workflow: installs k3s on an Oracle runner, deploys the SOC stack (Pixie/kubescape/ClickHouse/log4j chain) with optional `dx-daemon` image override and profiling env vars, waits for pod readiness, runs canonical harness scripts with "rule in" detection gate, port-forwards pprof/metrics during a live log4shell fire to collect CPU/heap profiles, and uploads all evidence as a 14-day artifact.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

k8sstormcenter/pixie#41: Directly overlaps with the cockpit cloud deployment IaC additions, including terraform/kubernetes/cloud_deps/, create_random_bytes.sh, cert-manager resources, and kustomize-based cloud manifests.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: restoring fork infrastructure and related workflow/config files dropped by the `#2391` pickup.
Description check	✅ Passed	The description is directly aligned with the changeset and explains the restored files and the one intentional linter carve-out.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch restore-fork-infra-main

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 15

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/e2e_log4shell_soc.yaml:
- Around line 36-37: The checkout in the workflow currently leaves persisted
credentials available to later steps, which can expose the read token to
external scripts. Update the existing actions/checkout usage in the
e2e_log4shell_soc job to disable persisted credentials, keeping the checkout
step for the harness scripts but ensuring the token is not stored in local git
config. Refer to the checkout step using actions/checkout in this workflow and
apply the secure checkout setting there.
- Around line 14-24: The workflow_dispatch e2e job currently has no concurrency
control, so duplicate runs can collide on shared k3s state and overwrite
/tmp/evidence. Add a concurrency block to this workflow using a stable group key
for the host-scoped job (for example based on github.workflow plus the selected
inputs such as dx_image and soc_ref), and enable cancel-in-progress or otherwise
prevent overlapping runs in the workflow that defines workflow_dispatch,
dx_image, and soc_ref.
- Around line 39-43: The Install k3s step is using the unpinned get.k3s.io
installer, which makes the privileged e2e environment non-reproducible. Update
the k3s install logic in the workflow step named Install k3s to set a validated
INSTALL_K3S_VERSION before invoking the installer, keeping the rest of the
readiness check and kubectl verification unchanged.
- Around line 102-120: The “Profile dx in real life (pprof + metrics)” step is
not failure-safe when prior setup fails, so it can’t write artifacts or clearly
indicate it was skipped. In the workflow job, update this step to create
/tmp/evidence before any writes and to explicitly detect missing prerequisites
like the dx-daemon pod or a failed port-forward. Use the existing shell block in
the profiling step to emit a clear skip message and exit successfully when those
prerequisites are absent, while keeping the profiling and metric collection
logic intact when they are present.
- Around line 45-70: The deploy step is interpolating untrusted manual inputs
directly into shell commands, which can allow command injection. In the SOC
deployment job, stop embedding inputs.soc_ref and inputs.dx_image inside the run
script and instead pass them through validated environment variables or
precomputed safe outputs before use. Add strict allowlist-style validation for
the ref/image values in the step that performs git clone and kubectl set image,
and only execute the commands after the values have been verified safe.

In `@cockpit/auth_deployment.yaml`:
- Around line 14-19: Make the PL_AUTH0_CLIENT_ID secret lookup mandatory as
well, since it comes from the same cloud-auth0-secrets object as
PL_AUTH0_CLIENT_SECRET. Update the secretKeyRef for PL_AUTH0_CLIENT_ID in the
auth deployment manifest to remove the optional lookup, matching the behavior
already used for the client secret so missing or broken secrets fail at
deployment time.

In `@cockpit/domain_config.yaml`:
- Line 7: PL_DOMAIN_NAME is hard-coded to a specific public domain, creating a
second source of truth that can drift from the configured cloud_domain. Update
the domain config so PL_DOMAIN_NAME is derived from the same domain input used
by the cert/Auth0 layer rather than a literal, and keep the value wired through
the existing configuration path that populates this ConfigMap.

In `@cockpit/plugin_db_updater_job.yaml`:
- Around line 31-32: The updater Job still references the mutable
cloud-plugin-load_db:latest image without a kustomize images transform, so fix
the image reference in the plugin_db_updater_job manifest or add an images
rewrite in kustomization to pin it to an immutable tag or digest. Use the
updater container entry in plugin_db_updater_job and the image handling in
kustomization.yaml to ensure the deployed Job always pulls a stable,
reproducible image.

In `@cockpit/proxy_envoy.yaml`:
- Around line 43-45: The CORS origin allowlist in proxy_envoy.yaml is too narrow
and blocks the cockpit domain. Update the Envoy CORS configuration in the
relevant cors/allow_origin_string_match block to allow the actual cloud domain
used by the deployment, matching the cloud_domain default from variables.tf (via
the cockpit proxy config) instead of only test.austrianopencloudcommunity.org.
Ensure the suffix matches the real browser origin expected by the cockpit
frontend.
- Around line 9-96: The ConfigMap entry for envoy.yaml is using a folded scalar,
which will collapse the YAML structure and break Envoy’s config parsing. Change
the envoy.yaml value to a literal block so the full nested structure is
preserved exactly as written. Keep the fix focused on the envoy.yaml key in
proxy_envoy.yaml and ensure the rendered config remains valid YAML for Envoy to
load.

In `@terraform/kubernetes/auth0/auth0_main.tf`:
- Around line 11-13: Disable the Auth0 provider debug setting in the provider
block so committed Terraform config does not emit request/response details;
update the auth0 provider configuration in auth0_main.tf to keep debug off by
default and only enable it temporarily for local troubleshooting. Use the
provider "auth0" block as the target for this change.

In `@terraform/kubernetes/auth0/generated.tf`:
- Around line 467-476: The Auth0 setup is over-privileged because the
interactive login client and the Management API grant share the same client
identity. Update the Terraform model around the app represented by the
regular_web client and the auth0_client_grant resource so the login application
only keeps callback/authorization_code usage, and move the Management API scopes
to a separate least-privileged M2M client. Make sure the grant resource uses the
new machine-to-machine client_id and that the login client no longer has
tenant-admin scopes attached.

In `@terraform/kubernetes/cloud_deps/cert_manager.tf`:
- Around line 1-4: The lookup in locals.cert_config is failing too early with an
invalid-index error, so the intended validation never gets a chance to run.
Change the cert_details access to use a nullable/safe lookup first in
cert_manager.tf, then have the existing precondition validate that returned
value and surface the custom message. Keep the fix centered around
locals.cert_config and the related precondition block so the missing
cert_details entry is handled by validation instead of indexing.

In `@terraform/kubernetes/cloud_deps/provider.tf`:
- Around line 14-22: The kubeconfig settings in the kubernetes and kustomization
provider blocks are hard-coded to a specific home path and context, making this
module environment-specific. Update the provider configuration to take both the
kubeconfig path and context from inputs instead of embedding
"~/.kube/cockpit-config" and "default". Use existing Terraform input variables
or add new ones, and if you keep a home-directory default, resolve it with
pathexpand() in the provider setup.

In `@terraform/kubernetes/scripts/create_random_bytes.sh`:
- Around line 19-36: The random-bytes generation in create_random_bytes.sh can
hang or emit an empty JSON payload when chars_to_keep is empty or when an
upstream command in the /dev/urandom pipeline fails. Update the script to
validate inputs before the pipeline starts, reject empty or invalid values for
chars_to_keep and num_chars in usage(), and make the pipeline fail fast by
ensuring the script propagates failures from bytes generation instead of still
reaching jq. Use the existing usage() and bytes assignment as the main
touchpoints.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 26709ba6-70c5-4b82-98c9-a0300c8fc6be

📥 Commits

Reviewing files that changed from the base of the PR and between 53d09cc and 6737704.

⛔ Files ignored due to path filters (1)

terraform/kubernetes/cloud_deps/terraform.tfstate is excluded by !**/*.tfstate

📒 Files selected for processing (27)

.github/workflows/e2e_log4shell_soc.yaml
.sops.yaml
cockpit/artifact_config.yaml
cockpit/auth_deployment.yaml
cockpit/cloud_ingress.yaml
cockpit/domain_config.yaml
cockpit/kustomization.yaml
cockpit/oauth_config.yaml
cockpit/plugin_db_updater_job.yaml
cockpit/proxy_envoy.yaml
cockpit/script_bundles_config.yaml
cockpit/servers_transport.yaml
terraform/credentials/cockpit/auth0_config.yaml
terraform/kubernetes/auth0/.terraform.lock.hcl
terraform/kubernetes/auth0/auth0_import.tf
terraform/kubernetes/auth0/auth0_main.tf
terraform/kubernetes/auth0/generated.tf
terraform/kubernetes/auth0/terraform.tfvars
terraform/kubernetes/auth0/variables.tf
terraform/kubernetes/cloud_deps/.terraform.lock.hcl
terraform/kubernetes/cloud_deps/base_cloud_deps.tf
terraform/kubernetes/cloud_deps/cert_manager.tf
terraform/kubernetes/cloud_deps/core_resource_deps.tf
terraform/kubernetes/cloud_deps/env_cloud_deps.tf
terraform/kubernetes/cloud_deps/provider.tf
terraform/kubernetes/cloud_deps/variables.tf
terraform/kubernetes/scripts/create_random_bytes.sh

coderabbitai · 2026-06-28T09:16:45Z

+on:
+  workflow_dispatch:
+    inputs:
+      dx_image:
+        description: dx-daemon image to test (default = .image-tags pin)
+        required: false
+        default: ""
+      soc_ref:
+        description: k8sstormcenter/soc branch
+        required: false
+        default: "218-clickhouse-schema"


🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Add concurrency for this host-scoped e2e.

This job mutates k3s state and writes fixed /tmp/evidence; duplicate dispatches can collide or overwrite evidence.

Proposed fix

soc_ref: description: k8sstormcenter/soc branch required: false default: "218-clickhouse-schema" +concurrency: + group: e2e-log4shell-soc + cancel-in-progress: false permissions:

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

on:

workflow_dispatch:

inputs:

dx_image:

description: dx-daemon image to test (default = .image-tags pin)

required: false

default: ""

soc_ref:

description: k8sstormcenter/soc branch

required: false

default: "218-clickhouse-schema"

on:

workflow_dispatch:

inputs:

dx_image:

description: dx-daemon image to test (default = .image-tags pin)

required: false

default: ""

soc_ref:

description: k8sstormcenter/soc branch

required: false

default: "218-clickhouse-schema"

concurrency:

group: e2e-log4shell-soc

cancel-in-progress: false

🧰 Tools

🪛 zizmor (1.26.1)

[warning] 14-24: insufficient job-level concurrency limits (concurrency-limits): workflow is missing concurrency setting

(concurrency-limits)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e_log4shell_soc.yaml around lines 14 - 24, The workflow_dispatch e2e job currently has no concurrency control, so duplicate runs can collide on shared k3s state and overwrite /tmp/evidence. Add a concurrency block to this workflow using a stable group key for the host-scoped job (for example based on github.workflow plus the selected inputs such as dx_image and soc_ref), and enable cancel-in-progress or otherwise prevent overlapping runs in the workflow that defines workflow_dispatch, dx_image, and soc_ref.

Source: Linters/SAST tools

coderabbitai · 2026-06-28T09:16:45Z

+    - name: Checkout pixie (harness scripts)
+      uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683  # v4.2.2


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Disable persisted checkout credentials.

The job later runs external repo scripts; leaving the checkout token in local git config unnecessarily exposes the read token to job code.

Proposed fix

- name: Checkout pixie (harness scripts) uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2 + with: + persist-credentials: false

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Checkout pixie (harness scripts)

uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

- name: Checkout pixie (harness scripts)

uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

with:

persist-credentials: false

🧰 Tools

🪛 zizmor (1.26.1)

[warning] 36-37: credential persistence through GitHub Actions artifacts (artipacked): does not set persist-credentials: false

(artipacked)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e_log4shell_soc.yaml around lines 36 - 37, The checkout in the workflow currently leaves persisted credentials available to later steps, which can expose the read token to external scripts. Update the existing actions/checkout usage in the e2e_log4shell_soc job to disable persisted credentials, keeping the checkout step for the harness scripts but ensuring the token is not stored in local git config. Refer to the checkout step using actions/checkout in this workflow and apply the secure checkout setting there.

Source: Linters/SAST tools

coderabbitai · 2026-06-28T09:16:45Z

+    - name: Install k3s
+      run: |
+        curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644
+        for i in $(seq 1 60); do kubectl get nodes --no-headers 2>/dev/null | grep -q ' Ready' && break; sleep 5; done
+        kubectl get nodes


🔒 Security & Privacy | 🔵 Trivial | ⚡ Quick win

Pin the k3s version before running the installer.

get.k3s.io installs a moving target; set a validated INSTALL_K3S_VERSION so this privileged e2e is reproducible.

Proposed fix pattern

- name: Install k3s run: | + export INSTALL_K3S_VERSION="${INSTALL_K3S_VERSION:?pin the validated k3s version}" curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e_log4shell_soc.yaml around lines 39 - 43, The Install k3s step is using the unpinned get.k3s.io installer, which makes the privileged e2e environment non-reproducible. Update the k3s install logic in the workflow step named Install k3s to set a validated INSTALL_K3S_VERSION before invoking the installer, keeping the rest of the readiness check and kubectl verification unchanged.

coderabbitai · 2026-06-28T09:16:45Z

+    - name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain)
+      env:
+        PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org
+        PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }}
+        PX_API_KEY: ${{ secrets.PX_API_KEY }}
+        TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }}
+        DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }}            # private entlein/dx image pull
+        CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }}
+        CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }}
+        CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }}
+      run: |
+        set -euo pipefail
+        sudo apt-get update -qq && sudo apt-get install -y python3-yaml
+        git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc
+        cd soc
+        make pixie                              # vizier + AE
+        make kubescape || true                  # node-agent (netStreaming)
+        bash tree/clickhouse-lab/install.sh      # forensic_db
+        make log4j                               # vulnerable backend + attacker + dx + SBoBs (managed-by=User)
+        if [ -n "${{ inputs.dx_image }}" ]; then
+          kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true
+        fi
+        # optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH
+        # are defaults in main, set here too in case the kit's manifest predates them)
+        kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect
+        kubectl -n honey rollout status ds/dx-daemon --timeout=120s


🔒 Security & Privacy | 🔴 Critical | ⚡ Quick win

Move dispatch inputs out of shell templates and validate them.

${{ inputs.soc_ref }} and ${{ inputs.dx_image }} are expanded before shell parsing. A crafted manual input can break out of quotes and run commands with the secrets in this step.

Proposed fix

env: + SOC_REF: ${{ inputs.soc_ref }} + DX_IMAGE: ${{ inputs.dx_image }} PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }} @@ set -euo pipefail sudo apt-get update -qq && sudo apt-get install -y python3-yaml - git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc + if ! printf '%s' "$SOC_REF" | grep -Eq '^[A-Za-z0-9._/-]+$'; then + echo "Invalid soc_ref"; exit 1 + fi + git clone --depth 1 --branch "$SOC_REF" --single-branch https://github.com/k8sstormcenter/soc soc @@ - if [ -n "${{ inputs.dx_image }}" ]; then - kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true + if [ -n "$DX_IMAGE" ]; then + if ! printf '%s' "$DX_IMAGE" | grep -Eq '^[A-Za-z0-9._/:@+-]+$'; then + echo "Invalid dx_image"; exit 1 + fi + kubectl -n honey set image ds/dx-daemon "dx-daemon=$DX_IMAGE" fi

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain)

env:

PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org

PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }}

PX_API_KEY: ${{ secrets.PX_API_KEY }}

TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }}

DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }} # private entlein/dx image pull

CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }}

CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }}

CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }}

run: |

set -euo pipefail

sudo apt-get update -qq && sudo apt-get install -y python3-yaml

git clone --depth 1 -b "${{ inputs.soc_ref }}" https://github.com/k8sstormcenter/soc soc

cd soc

make pixie # vizier + AE

make kubescape || true # node-agent (netStreaming)

bash tree/clickhouse-lab/install.sh # forensic_db

make log4j # vulnerable backend + attacker + dx + SBoBs (managed-by=User)

if [ -n "${{ inputs.dx_image }}" ]; then

kubectl -n honey set image ds/dx-daemon dx-daemon="${{ inputs.dx_image }}" || true

fi

# optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH

# are defaults in main, set here too in case the kit's manifest predates them)

kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect

kubectl -n honey rollout status ds/dx-daemon --timeout=120s

- name: Deploy the SOC stack (Pixie + kubescape + ClickHouse + AE + dx + log4j chain)

env:

SOC_REF: ${{ inputs.soc_ref }}

DX_IMAGE: ${{ inputs.dx_image }}

PX_CLOUD_ADDR: pixie.austrianopencloudcommunity.org

PX_DEPLOY_KEY: ${{ secrets.PX_DEPLOY_KEY }}

PX_API_KEY: ${{ secrets.PX_API_KEY }}

TS_AUTHKEY: ${{ secrets.TAILSCALE_AUTH_KEY }}

DX_ENTLEIN_PAT: ${{ secrets.DX_ENTLEIN_PAT }} # private entlein/dx image pull

CLICKHOUSE_ANALYST_PASSWORD: ${{ secrets.CLICKHOUSE_ANALYST_PASSWORD }}

CLICKHOUSE_INGEST_PASSWORD: ${{ secrets.CLICKHOUSE_INGEST_PASSWORD }}

CLICKHOUSE_PIXIE_PASSWORD: ${{ secrets.CLICKHOUSE_PIXIE_PASSWORD }}

run: |

set -euo pipefail

sudo apt-get update -qq && sudo apt-get install -y python3-yaml

if ! printf '%s' "$SOC_REF" | grep -Eq '^[A-Za-z0-9._/-]+$'; then

echo "Invalid soc_ref"; exit 1

fi

git clone --depth 1 --branch "$SOC_REF" --single-branch https://github.com/k8sstormcenter/soc soc

cd soc

make pixie # vizier + AE

make kubescape || true # node-agent (netStreaming)

bash tree/clickhouse-lab/install.sh # forensic_db

make log4j # vulnerable backend + attacker + dx + SBoBs (managed-by=User)

if [ -n "$DX_IMAGE" ]; then

if ! printf '%s' "$DX_IMAGE" | grep -Eq '^[A-Za-z0-9._/:@+-]+$'; then

echo "Invalid dx_image"; exit 1

fi

kubectl -n honey set image ds/dx-daemon "dx-daemon=$DX_IMAGE"

fi

# optimal config + enable pprof for the real-life profile (DX_TELEMETRY_CACHE/DX_BENCH

# are defaults in main, set here too in case the kit's manifest predates them)

kubectl -n honey set env ds/dx-daemon DX_PPROF_ADDR=0.0.0.0:6060 DX_TELEMETRY_CACHE=1 DX_BENCH=pemdirect

kubectl -n honey rollout status ds/dx-daemon --timeout=120s

🧰 Tools

🪛 zizmor (1.26.1)

[error] 58-58: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

[error] 64-64: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

[error] 65-65: code injection via template expansion (template-injection): may expand into attacker-controllable code

(template-injection)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e_log4shell_soc.yaml around lines 45 - 70, The deploy step is interpolating untrusted manual inputs directly into shell commands, which can allow command injection. In the SOC deployment job, stop embedding inputs.soc_ref and inputs.dx_image inside the run script and instead pass them through validated environment variables or precomputed safe outputs before use. Add strict allowlist-style validation for the ref/image values in the step that performs git clone and kubectl set image, and only execute the commands after the values have been verified safe.

Source: Linters/SAST tools

coderabbitai · 2026-06-28T09:16:45Z

+    - name: Profile dx in real life (pprof + metrics)
+      if: always()
+      run: |
+        set -uo pipefail
+        POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}')
+        kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5
+        # 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060
+        ( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) &
+        curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \
+          "http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true
+        curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true
+        curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true
+        go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true
+        kill $PF 2>/dev/null || true
+        echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt
+        echo "=== verdict latency ==="
+        grep -E \
+          "dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \
+          /tmp/evidence/dx_metrics.txt || true


🩺 Stability & Availability | 🟡 Minor | ⚡ Quick win

Make the always-on profiling step failure-safe.

If earlier steps fail before /tmp/evidence or the dx pod exists, this step silently loses profile artifacts. Create the directory and record an explicit skip.

Proposed fix

run: | set -uo pipefail - POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}') + mkdir -p /tmp/evidence + POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true) + if [ -z "$POD" ]; then + echo "dx-daemon pod not found; profiling skipped" | tee /tmp/evidence/profile_skipped.txt + exit 0 + fi kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5 + trap 'kill "$PF" 2>/dev/null || true' EXIT

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

- name: Profile dx in real life (pprof + metrics)

if: always()

run: |

set -uo pipefail

POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}')

kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5

# 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060

( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) &

curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \

"http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true

curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true

curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true

go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true

kill $PF 2>/dev/null || true

echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt

echo "=== verdict latency ==="

grep -E \

"dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \

/tmp/evidence/dx_metrics.txt || true

- name: Profile dx in real life (pprof + metrics)

if: always()

run: |

set -uo pipefail

mkdir -p /tmp/evidence

POD=$(kubectl -n honey get pod -l app=dx-daemon -o jsonpath='{.items[0].metadata.name}' 2>/dev/null || true)

if [ -z "$POD" ]; then

echo "dx-daemon pod not found; profiling skipped" | tee /tmp/evidence/profile_skipped.txt

exit 0

fi

kubectl -n honey port-forward "$POD" 6060:6060 9095:9095 & PF=$!; sleep 5

trap 'kill "$PF" 2>/dev/null || true' EXIT

# 30s CPU profile under a fresh fire + heap, served by DX_PPROF_ADDR=:6060

( bash "$HARNESS/log4shell_fire.sh" >/dev/null 2>&1 || true ) &

curl -s --max-time 40 -o /tmp/evidence/dx_cpu.pprof \

"http://127.0.0.1:6060/debug/pprof/profile?seconds=30" || true

curl -s "http://127.0.0.1:6060/debug/pprof/heap" -o /tmp/evidence/dx_heap.pprof || true

curl -s "http://127.0.0.1:9095/metrics" -o /tmp/evidence/dx_metrics.txt || true

go tool pprof -top -nodecount=25 /tmp/evidence/dx_cpu.pprof > /tmp/evidence/dx_cpu_top.txt 2>&1 || true

kill $PF 2>/dev/null || true

echo "=== dx CPU top ==="; head -30 /tmp/evidence/dx_cpu_top.txt

echo "=== verdict latency ==="

grep -E \

"dx_(time_to_verdict|bench_query_duration)_seconds_(sum|count)" \

/tmp/evidence/dx_metrics.txt || true

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/e2e_log4shell_soc.yaml around lines 102 - 120, The “Profile dx in real life (pprof + metrics)” step is not failure-safe when prior setup fails, so it can’t write artifacts or clearly indicate it was skipped. In the workflow job, update this step to create /tmp/evidence before any writes and to explicitly detect missing prerequisites like the dx-daemon pod or a failed port-forward. Use the existing shell block in the profiling step to emit a clear skip message and exit successfully when those prerequisites are absent, while keeping the profiling and metric collection logic intact when they are present.

coderabbitai · 2026-06-28T09:16:46Z

+provider "auth0" {
+  debug = true
+}


🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail printf '\n## Files mentioning auth0 provider or debug\n' git ls-files | rg '(^|/)auth0.*\.tf$|terraform/kubernetes/auth0/|provider "auth0"|debug\s*=' -n || true printf '\n## auth0_main.tf with line numbers\n' cat -n terraform/kubernetes/auth0/auth0_main.tf printf '\n## nearby terraform files in auth0 module\n' find terraform/kubernetes/auth0 -maxdepth 1 -type f | sort | xargs -I{} sh -c 'echo "--- {}"; cat -n "{}" | sed -n "1,220p"'

Repository: k8sstormcenter/pixie

Length of output: 18992

🌐 Web query:

Auth0 Terraform provider debug = true logs sensitive data provider documentation

💡 Result:

In the Auth0 Terraform provider, setting debug = true in the provider configuration enables HTTP request and response logging when the TF_LOG environment variable is also set to DEBUG [1][2][3]. Security implications of enabling this debug mode: 1. Sensitive Data Exposure: Enabling debug logging can cause sensitive information, such as authentication tokens, client secrets, or other PII, to be written to standard output or log files in plain text [1][2][4]. Because these logs capture the full HTTP traffic between the provider and the Auth0 Management API, they may bypass standard Terraform CLI redaction mechanisms [4][5]. 2. Best Practices: You should avoid enabling debug mode in non-development environments [6]. If you must use debug logging, ensure that the output is directed to a secure, restricted-access location and that logs are purged promptly [5]. 3. Alternative Data Protection: For managing sensitive data within Auth0 resources, the provider offers specific mechanisms like write-only attributes (e.g., for passwords or API keys) and PII configuration options for resources like log streams to help handle sensitive data more securely than raw logging [7][8][9]. To debug issues without globally exposing sensitive data via the provider's debug flag, consider using more granular Terraform logging options (e.g., TF_LOG_PROVIDER_AUTH0=DEBUG) or platform-specific monitoring tools that do not persist sensitive API request bodies in cleartext [5]. Always treat Terraform log output as potentially containing sensitive information when debug verbosity is enabled [4].

Citations:

1: https://registry.terraform.io/providers/auth0/auth0/1.42.0/docs

2: https://registry.terraform.io/providers/auth0/auth0/latest/docs

3: https://library.tf/providers/auth0/auth0/latest

4: https://docs.hashicorp.com/terraform/plugin/log/filtering

5: https://developer.hashicorp.com/terraform/plugin/log/managing

6: https://docs.hashicorp.com/terraform/plugin/debugging

7: https://registry.terraform.io/providers/auth0/auth0/latest/docs/resources/log_stream

8: auth0/terraform-provider-auth0@9563820

9: Support for Ephemeral Values in Terraform Auth0 Provider auth0/terraform-provider-auth0#1218

Disable Auth0 provider debug logging. debug = true can emit Auth0 request/response details, including secrets, into Terraform logs when debug logging is enabled. Keep it off in committed config and enable it only in local troubleshooting.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/kubernetes/auth0/auth0_main.tf` around lines 11 - 13, Disable the Auth0 provider debug setting in the provider block so committed Terraform config does not emit request/response details; update the auth0 provider configuration in auth0_main.tf to keep debug off by default and only enable it temporarily for local troubleshooting. Use the provider "auth0" block as the target for this change.

coderabbitai · 2026-06-28T09:16:46Z

+# __generated__ by Terraform from "cgr_lgvjTwTNm53mojKq"
+resource "auth0_client_grant" "akm0sumjg95tcrlp390pxzjozkkyndt_https_k8sstormcenter_eu_auth0_com_api_v2" {
+  allow_all_scopes            = false
+  allow_any_organization      = false
+  audience                    = "https://k8sstormcenter.eu.auth0.com/api/v2/"
+  authorization_details_types = []
+  client_id                   = "5akM0SumJG95tcRlP390pXzJoZkkYNDT"
+  organization_usage          = null
+  scopes                      = ["read:users", "update:users", "delete:users", "create:users", "read:users_app_metadata", "update:users_app_metadata", "delete:users_app_metadata", "create:users_app_metadata", "read:user_custom_blocks", "create:user_custom_blocks", "delete:user_custom_blocks", "create:user_tickets", "read:logs_users", "read:user_idp_tokens", "create:user_attribute_profiles", "read:user_attribute_profiles", "update:user_attribute_profiles", "delete:user_attribute_profiles", "read:user_effective_permissions", "read:user_effective_roles", "read:user_role_source_groups", "read:user_permission_source_roles"]
+  subject_type                = "client"


🔒 Security & Privacy | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail file='terraform/kubernetes/auth0/generated.tf' echo '=== Around lines 450-490 ===' sed -n '450,490p' "$file" | cat -n echo echo '=== Around lines 1688-1765 ===' sed -n '1688,1765p' "$file" | cat -n echo echo '=== Search for the client id 5akM0SumJG95tcRlP390pXzJoZkkYNDT ===' rg -n '5akM0SumJG95tcRlP390pXzJoZkkYNDT|app_type =|refresh_token|authorization_code|client_grant' "$file"

Repository: k8sstormcenter/pixie

Length of output: 7746

🏁 Script executed:

#!/bin/bash set -euo pipefail sed -n '580,620p' terraform/kubernetes/auth0/generated.tf | cat -n

Repository: k8sstormcenter/pixie

Length of output: 1761

Split the interactive login client from the Management API grant. The same client_id is used for callbacks/authorization_code on the app at 1698-1758 and is also granted Auth0 Management API scopes here, so a leaked login secret would expose tenant-admin access too. Move the Management API scopes to a separate least-privileged M2M client and keep the login app modeled as regular_web.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/kubernetes/auth0/generated.tf` around lines 467 - 476, The Auth0 setup is over-privileged because the interactive login client and the Management API grant share the same client identity. Update the Terraform model around the app represented by the regular_web client and the auth0_client_grant resource so the login application only keeps callback/authorization_code usage, and move the Management API scopes to a separate least-privileged M2M client. Make sure the grant resource uses the new machine-to-machine client_id and that the login client no longer has tenant-admin scopes attached.

coderabbitai · 2026-06-28T09:16:46Z

+locals {
+  cert_subdomain = var.cloud_domain
+  cert_config    = var.cert_details[var.cloud_domain]
+}


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

The custom validation never catches a missing cert_details entry.

local.cert_config = var.cert_details[var.cloud_domain] can throw Terraform's invalid-index error before the precondition runs, so operators won't get the intended message. Make the lookup nullable first, then validate that result.

Suggested fix

locals { cert_subdomain = var.cloud_domain - cert_config = var.cert_details[var.cloud_domain] + cert_config = lookup(var.cert_details, var.cloud_domain, null) } @@ lifecycle { precondition { - condition = contains(keys(var.cert_details), var.cloud_domain) + condition = local.cert_config != null error_message = "cloud_domain '${var.cloud_domain}' must have a corresponding entry in cert_details." } }

Also applies to: 48-53

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/kubernetes/cloud_deps/cert_manager.tf` around lines 1 - 4, The lookup in locals.cert_config is failing too early with an invalid-index error, so the intended validation never gets a chance to run. Change the cert_details access to use a nullable/safe lookup first in cert_manager.tf, then have the existing precondition validate that returned value and surface the custom message. Keep the fix centered around locals.cert_config and the related precondition block so the missing cert_details entry is handled by validation instead of indexing.

coderabbitai · 2026-06-28T09:16:46Z

+provider "kubernetes" {
+  config_path    = "~/.kube/cockpit-config"
+  config_context = "default"
+}
+
+provider "kustomization" {
+  context = "default"
+  kubeconfig_path = "~/.kube/cockpit-config"
+}


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash set -euo pipefail printf '--- file list ---\n' git ls-files terraform/kubernetes/cloud_deps printf '\n--- provider.tf ---\n' cat -n terraform/kubernetes/cloud_deps/provider.tf printf '\n--- related terraform files ---\n' fd -a -t f 'variables.tf|*.tf' terraform/kubernetes/cloud_deps 2>/dev/null | sed 's#^\./##' | sort printf '\n--- search for kubeconfig/context inputs/usages ---\n' rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' \ 'kubeconfig_path|kubeconfig_context|config_context|config_path|context\s*=' \ terraform/kubernetes/cloud_deps

Repository: k8sstormcenter/pixie

Length of output: 249

🏁 Script executed:

#!/bin/bash set -euo pipefail printf -- '--- file list ---\n' git ls-files terraform/kubernetes/cloud_deps printf -- '\n--- provider.tf ---\n' cat -n terraform/kubernetes/cloud_deps/provider.tf printf -- '\n--- related terraform files ---\n' fd -a -t f 'variables.tf|*.tf' terraform/kubernetes/cloud_deps 2>/dev/null | sed 's#^\./##' | sort printf -- '\n--- search for kubeconfig/context inputs/usages ---\n' rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' \ 'kubeconfig_path|kubeconfig_context|config_context|config_path|context\s*=' \ terraform/kubernetes/cloud_deps

Repository: k8sstormcenter/pixie

Length of output: 1194

🏁 Script executed:

#!/bin/bash set -euo pipefail printf -- '--- variables.tf ---\n' cat -n terraform/kubernetes/cloud_deps/variables.tf printf -- '\n--- other terraform files in module ---\n' for f in terraform/kubernetes/cloud_deps/*.tf; do [ "$(basename "$f")" = "provider.tf" ] && continue printf -- '\n### %s ###\n' "$f" cat -n "$f" done printf -- '\n--- search for cloud_deps references ---\n' rg -n --hidden --glob '!**/.terraform/**' --glob '!**/.git/**' 'cloud_deps|cockpit-config|kubeconfig|kustomization' .

Repository: k8sstormcenter/pixie

Length of output: 48401

🌐 Web query:

Terraform Kubernetes provider config_path tilde expansion kubeconfig path docs

💡 Result:

The Terraform Kubernetes provider documentation explicitly uses tilde () expansion for the config_path argument in its examples [1][2][3]. While this syntax is commonly used in Terraform provider configuration blocks, it is important to note that the provider expects an absolute path or a path relative to the working directory [4]. If you encounter issues where tilde expansion is not working as expected—particularly in CI/CD environments or non-standard file systems—you can use the built-in Terraform function pathexpand() to resolve the tilde to an absolute path before passing it to the provider [4]: provider "kubernetes" { config_path = pathexpand("/.kube/config") } Key considerations for config_path and config_paths: * The provider allows you to specify the kubeconfig location via the config_path attribute or the KUBE_CONFIG_PATH environment variable [1][3]. * The provider supports multiple paths using the config_paths attribute or the KUBE_CONFIG_PATHS environment variable [1][3]. * Unlike kubectl, the Terraform Kubernetes provider does not automatically read the standard KUBECONFIG environment variable by default [5][6]. You must explicitly configure the path in your Terraform block or provide it via the provider-specific environment variables [1][3].

Citations:

1: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs

2: https://registry.terraform.io/providers/hashicorp/kubernetes/2.37.1/docs/guides/getting-started.html

3: https://registry.terraform.io/providers/hashicorp/kubernetes/3.0.1/docs

4: https://stackoverflow.com/questions/75484801/terraform-kubernetes-provider-fails-on-github-action-with-following-config-pat

5: Fall back to default kubernetes client config to support KUBECONFIG hashicorp/terraform-provider-kubernetes#1976

6: Source config_path provider argument from $KUBECONFIG env var hashicorp/terraform-provider-kubernetes#1175

🌐 Web query:

kbst kustomization provider kubeconfig_path context docs tilde expansion

💡 Result:

The kbst/kustomization Terraform provider does not automatically expand the tilde () character in the kubeconfig_path attribute [1][2]. While some Terraform providers or core features may accept tilde-prefixed paths, it is a known behavior in the Terraform ecosystem that many providers do not perform shell-style tilde expansion [3]. To use a path in your home directory with the kubeconfig_path attribute, you should use the Terraform pathexpand function [4][3]. Recommended approach: Use the pathexpand function within a local variable or directly in the provider configuration to ensure the path is correctly resolved to an absolute path before it is passed to the provider [3]. Example usage: provider "kustomization" { kubeconfig_path = pathexpand("/.kube/config") } Alternatively, if you are setting this via an environment variable (e.g., KUBECONFIG_PATH), the shell environment may handle the expansion before Terraform receives the string, depending on your shell configuration and how the variable is exported [5][1]. However, explicitly using pathexpand in your HCL code is the most reliable way to ensure portability across different systems [3].

Citations:

1: https://registry.terraform.io/providers/kbst/kustomization/latest/docs

2: https://registry.terraform.io/providers/kbst/kustomization/0.9.5/docs

3: https://registry.terraform.io/providers/tehcyx/kind/latest/docs/resources/cluster

4: core: Add pathexpand interpolation function hashicorp/terraform#11277

5: https://github.com/kbst/terraform-provider-kustomization

Parameterize the kubeconfig settings
terraform/kubernetes/cloud_deps/provider.tf should not hard-code ~/.kube/cockpit-config and default. ~ is not a reliable path expansion here, and the fixed context makes this module environment-specific. Expose both as inputs, using pathexpand() if you want a home-directory default.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/kubernetes/cloud_deps/provider.tf` around lines 14 - 22, The kubeconfig settings in the kubernetes and kustomization provider blocks are hard-coded to a specific home path and context, making this module environment-specific. Update the provider configuration to take both the kubeconfig path and context from inputs instead of embedding "~/.kube/cockpit-config" and "default". Use existing Terraform input variables or add new ones, and if you keep a home-directory default, resolve it with pathexpand() in the provider setup.

coderabbitai · 2026-06-28T09:16:46Z

+set -e
+
+usage() {
+    if [ "$#" -ne 2 ]; then
+        echo "Illegal number of parameters"
+        echo "Usage: $0 <characters_to_keep> <number_of_characters>"
+        exit 1
+    fi
+}
+
+usage "$@"
+
+chars_to_keep=$1
+num_chars=$2
+
+bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1)
+
+jq -n --arg output "$bytes" '{"output":$output}'


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Fail fast before entering the /dev/urandom pipeline.

An empty chars_to_keep makes tr -dc reject every byte, so head -n 1 never returns; and without pipefail, an upstream failure can still emit {"output":""}. That can hang terraform apply or generate empty secrets.

💡 Proposed fix

- set -e + set -euo pipefail ... usage "$@" chars_to_keep=$1 num_chars=$2 + +if [ -z "$chars_to_keep" ] || ! [[ "$num_chars" =~ ^[1-9][0-9]*$ ]]; then + echo "Usage: $0 <characters_to_keep> <number_of_characters>" >&2 + exit 1 +fi -bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1) +bytes=$(LC_ALL=C tr -dc "${chars_to_keep}" < /dev/urandom | fold -w "$num_chars" | head -n 1)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

set -e

usage() {

if [ "$#" -ne 2 ]; then

echo "Illegal number of parameters"

echo "Usage: $0 <characters_to_keep> <number_of_characters>"

exit 1

fi

}

usage "$@"

chars_to_keep=$1

num_chars=$2

bytes=$(< /dev/urandom tr -dc "${chars_to_keep}" | fold -w "$num_chars" | head -n 1)

jq -n --arg output "$bytes" '{"output":$output}'

set -euo pipefail

usage() {

if [ "$#" -ne 2 ]; then

echo "Illegal number of parameters"

echo "Usage: $0 <characters_to_keep> <number_of_characters>"

exit 1

fi

}

usage "$@"

chars_to_keep=$1

num_chars=$2

if [ -z "$chars_to_keep" ] || ! [[ "$num_chars" =~ ^[1-9][0-9]*$ ]]; then

echo "Usage: $0 <characters_to_keep> <number_of_characters>" >&2

exit 1

fi

bytes=$(LC_ALL=C tr -dc "${chars_to_keep}" < /dev/urandom | fold -w "$num_chars" | head -n 1)

jq -n --arg output "$bytes" '{"output":$output}'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@terraform/kubernetes/scripts/create_random_bytes.sh` around lines 19 - 36, The random-bytes generation in create_random_bytes.sh can hang or emit an empty JSON payload when chars_to_keep is empty or when an upstream command in the /dev/urandom pipeline fails. Update the script to validate inputs before the pipeline starts, reject empty or invalid values for chars_to_keep and num_chars in usage(), and make the pipeline fail fast by ensuring the script propagates failures from bytes generation instead of still reaching jq. Use the existing usage() and bytes assignment as the main touchpoints.

…name-linter Reverts the private/cockpit/ -> cockpit/ rename so this restore matches the original main tree exactly: all 10 cockpit files (incl. kustomization.yaml with its original ../../k8s/cloud relative paths) are now byte-identical to the pre-deletion state at 610975b. Upstream main carries filename_linter.yaml (added by pixie-io#2391), which fails any PR touching a 'private' path. Adding a single '!private/cockpit/**' negative glob keeps that guard for everything else while letting this known fork-cloud config through. This is the only deviation in the PR from a pure byte-for-byte restore.

entlein

restoring original commits that were thrown out of main

pixie-agent added 2 commits June 28, 2026 08:54

coderabbitai Bot requested changes Jun 28, 2026

View reviewed changes

coderabbitai Bot approved these changes Jun 29, 2026

View reviewed changes

entlein approved these changes Jun 29, 2026

View reviewed changes

		- name: Checkout pixie (harness scripts)
		uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

Uh oh!

Conversation

ConstanzeTU commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Why

The one non-restore change

Not included

Uh oh!

coderabbitai Bot commented Jun 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 28, 2026

Choose a reason for hiding this comment

Uh oh!

entlein left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ConstanzeTU commented Jun 28, 2026 •

edited

Loading

coderabbitai Bot commented Jun 28, 2026 •

edited

Loading