|
1 | | -# CLAUDE.md |
2 | | - |
3 | | -This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
4 | | - |
5 | | -## Build Commands |
6 | | - |
7 | | -**NixOS (tom):** |
8 | | -```bash |
9 | | -sudo nixos-rebuild switch --flake .#tom # apply system changes |
10 | | -nix build .#nixosConfigurations.tom.config.system.build.toplevel # build without applying |
11 | | -``` |
12 | | - |
13 | | -**macOS (darwin):** |
14 | | -```bash |
15 | | -sudo darwin-rebuild switch --flake .#"$(hostname)" # apply system changes |
16 | | -nix build .#darwinConfigurations.ezmbp24.local.system # build puma |
17 | | -nix build .#darwinConfigurations.eztim25.local.system # build tim |
18 | | -nix build .#darwinConfigurations.edenzim-ltmbn8v.internal.salesforce.com.system # build work |
19 | | -``` |
20 | | - |
21 | | -**Cloud infrastructure:** |
22 | | -```bash |
23 | | -nix develop # enter shell with opentofu |
24 | | -./cloud.sh plan # preview changes |
25 | | -./cloud.sh apply # deploy |
26 | | -``` |
27 | | - |
28 | | -**Secrets:** |
29 | | -```bash |
30 | | -sops machines/tom/secrets/vault.yaml # edit encrypted secrets (requires age key) |
31 | | -``` |
32 | | - |
33 | | -**Services (tom):** |
34 | | -```bash |
35 | | -journalctl -u <service> -n 50 # recent logs |
36 | | -journalctl -u <service> -f # follow logs |
37 | | -systemctl status <service> # service status |
38 | | -systemctl restart <service> # restart service |
39 | | -``` |
40 | | - |
41 | | -## Architecture |
42 | | - |
43 | | -This is a Nix flake managing four machines with Home Manager, sops-nix for secrets, and impermanence for ephemeral root on NixOS. |
44 | | - |
45 | | -### Machines |
46 | | - |
47 | | -| Machine | Hostname | OS | Notes | |
48 | | -|---------|----------|----|-------| |
49 | | -| tom | tom | NixOS (x86_64-linux) | Desktop/server, NVIDIA GPU, impermanence, Plasma 6 | |
50 | | -| puma | ezmbp24.local | macOS (aarch64-darwin) | Personal laptop | |
51 | | -| tim | eztim25.local | macOS (aarch64-darwin) | Home server, GitHub runners | |
52 | | -| work | edenzim-ltmbn8v.internal.salesforce.com | macOS (aarch64-darwin) | Work laptop | |
53 | | - |
54 | | -### Configuration Layers |
55 | | - |
56 | | -- `flake.nix` - Defines all inputs, machine outputs, and wires modules together. Each machine gets `configuration.nix` (system), `home.nix` (machine-specific user config), and `programs/home.nix` (shared user config). |
57 | | -- `programs/home.nix` - Shared Home Manager config imported by every machine. Contains languages, LSPs, formatters, and program module imports (neovim, zsh, tmux, git, etc.). |
58 | | -- `machines/{name}/home.nix` - Machine-specific Home Manager overrides. Imports from `programs/` for shared configs, from `machines/{name}/programs/` for local overrides. |
59 | | -- `machines/{name}/configuration.nix` - System-level NixOS/darwin config. Imports from `machines/{name}/services/`, `security/`, `hardware/`, etc. |
60 | | - |
61 | | -### Impermanence (tom only) |
62 | | - |
63 | | -tom uses btrfs subvolumes (`root`, `nix`, `persistent`) with an ephemeral root recreated on each boot via `machines/tom/start.sh` in initrd. Old root snapshots are kept for 30 days. Directories/files that must survive reboots are declared in `environment.persistence."/persistent"` in `machines/tom/configuration.nix`. |
64 | | - |
65 | | -**DynamicUser conflict:** NixOS services with `DynamicUser=true` store state in `/var/lib/private/{service}` (symlinked from `/var/lib/{service}`). Impermanence bind mounts on `/var/lib/{service}` conflict with this, causing "Device or resource busy". Fix by either persisting `/var/lib/private/{service}` or switching to a static user with `DynamicUser = lib.mkForce false`. |
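As a rough illustration of the two fixes for the DynamicUser conflict, the fragment below shows each option for a hypothetical service named `myservice`. The service name and its user/group are placeholders, not names from this repository, and only one of the two options would be used in practice.

```nix
{ lib, ... }:
{
  # Option A: persist the real state directory used by DynamicUser services.
  # "myservice" is a placeholder, not a service from this repo.
  environment.persistence."/persistent".directories = [
    "/var/lib/private/myservice"
  ];

  # Option B: opt out of DynamicUser and run as a static user instead.
  # The user and group are assumed to be declared under users.users / users.groups.
  systemd.services.myservice.serviceConfig = {
    DynamicUser = lib.mkForce false;
    User = "myservice";
    Group = "myservice";
  };
}
```

With option B the state lives directly in `/var/lib/myservice`, so that path would be the one to persist.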
66 | | - |
67 | | -### Secrets (sops-nix) |
68 | | - |
69 | | -All secrets are age-encrypted YAML/JSON files under `machines/{name}/secrets/`. The age key location differs per OS: `/var/lib/sops-nix/key.txt` on NixOS (persisted through impermanence), `~/Library/Application Support/sops/age/keys.txt` on macOS. Secret declarations and path mappings live in each machine's `configuration.nix` under the `sops` attribute set. |
70 | | - |
71 | | -Every secret must declare an explicit `owner` and `group`. Secrets are owned by the service user that reads them, not by root. Each service runs as a dedicated per-project user (e.g., runner tokens owned by the runner user, minecraft secrets owned by `minecraft`). Only use root for secrets that genuinely require it (tailscale, SSH host keys, wireguard, password). |
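A minimal sketch of a secret declaration that follows the ownership rule, assuming it sits in `machines/tom/configuration.nix`. The secret name `minecraft/rcon-password` is hypothetical, and the `sopsFile` path is an assumption based on the vault file named under Build Commands.

```nix
{
  # Hypothetical secret name; real declarations live under the `sops` attribute set.
  sops.secrets."minecraft/rcon-password" = {
    sopsFile = ./secrets/vault.yaml; # assumed to resolve to machines/tom/secrets/vault.yaml
    owner = "minecraft";             # the service user that reads the secret
    group = "minecraft";
  };
}
```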
72 | | - |
73 | | -### CI |
74 | | - |
75 | | -`versioning.yml` runs daily on self-hosted runners (tim then tom). It updates `flake.lock`, builds all darwin and NixOS configurations, and auto-merges to main on success. |
76 | | - |
77 | | -### Cloud Proxy |
78 | | - |
79 | | -The EC2 instance (`cloud/`) runs nginx as a reverse proxy with automatic ACME certificates. Public traffic arrives at the instance, terminates TLS, and forwards through a WireGuard tunnel to tom (`10.100.0.2`). Each proxied domain needs: |
80 | | - |
81 | | -1. `cloud/proxy.tf` — Route53 A record pointing to the EC2 instance |
82 | | -2. `cloud/configuration.nix` — ACME cert entry and nginx virtualHost with proxyPass to tom's port, plus the port in `allowedTCPPorts` |
83 | | -3. `machines/tom/configuration.nix` — TCP port in firewall `allowedTCPPorts` |
84 | | - |
85 | | -For new top-level domains (not subdomains of existing zones), also add the zone ID to `cloud/tofu.auto.tfvars.json` under `proxy_hosted_zones`. |
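The nginx side of an HTTP-proxied domain might look roughly like the fragment below, using `todos.guide` forwarded to tom's port 8082. This is a sketch only: it assumes the certificate is requested through the virtualHost's `enableACME` option, which may differ from how `cloud/configuration.nix` actually declares its ACME entries, and it leaves out the Route53 record and firewall ports.

```nix
{
  # Sketch for cloud/configuration.nix: terminate TLS, then forward to tom over WireGuard.
  services.nginx.virtualHosts."todos.guide" = {
    enableACME = true; # assumption: cert requested via the virtualHost option
    forceSSL = true;
    locations."/".proxyPass = "http://10.100.0.2:8082";
  };
}
```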
86 | | - |
87 | | -| Domain | Tom Port | Service | |
88 | | -|--------|----------|---------| |
89 | | -| api.o526.net | 8083 | endpoints | |
90 | | -| dev.o526.net | 3000 | blog preview | |
91 | | -| git.o526.net | 23231 (SSH stream) | soft-serve | |
92 | | -| o526.net | 4321 | blog | |
93 | | -| quintus.sh | 5000 | quintus | |
94 | | -| todos.guide | 8082 | todos | |
95 | | -| tom.o526.net | 25565 (TCP stream) | minecraft | |
96 | | - |
97 | | -**TCP stream proxying:** For non-HTTP services (like git SSH or Minecraft), use nginx `streamConfig` with TCP forwarding instead of ACME/virtualHost. The cloud proxy forwards the port directly to tom over WireGuard. This requires a security group ingress rule and Route53 record in `cloud/proxy.tf`, but no ACME cert or virtualHost. |
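For comparison, a TCP stream forward could be sketched as follows, using the Minecraft port as the example. The listen port on the cloud instance is assumed to match tom's port; the security group rule and Route53 record in `cloud/proxy.tf` are not shown.

```nix
{
  # Sketch for cloud/configuration.nix: raw TCP forwarding, no ACME cert or virtualHost.
  services.nginx.streamConfig = ''
    server {
      listen 25565;                # assumed public port on the cloud instance
      proxy_pass 10.100.0.2:25565; # tom over the WireGuard tunnel
    }
  '';
  networking.firewall.allowedTCPPorts = [ 25565 ];
}
```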
98 | | - |
99 | | -### GitHub Runners (tom) |
100 | | - |
101 | | -Runner configuration lives in `machines/tom/services/github-runners/default.nix`. Each runner needs a corresponding secret declaration in `machines/tom/configuration.nix`. |
102 | | - |
103 | | -**Multiple runners per repo:** Supported. Each runner needs a unique nix attribute name, unique GitHub `name`, and its own token. They can share user/group. The slacks repo uses 9 runners named after celestial bodies (mercury, venus, earth, mars, jupiter, saturn, uranus, neptune, pluto). |
104 | | - |
105 | | -**Secret path conventions:** |
106 | | -- Single runner: `github/runners/{repo}` (e.g., `github/runners/blog`) |
107 | | -- Multiple runners: `github/runners/{repo}/{runner}` (e.g., `github/runners/slacks/mercury`) |
108 | | - |
109 | | -**extraPackages:** Adds tools to the runner environment. Use this for CLI tools needed by workflows (e.g., `pkgs.gh` for GitHub CLI). The "self-hosted" label is added automatically. |
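Putting the runner conventions together, a single entry might look like the sketch below. It assumes the runners are declared with the stock NixOS `services.github-runners` module; the attribute name `slacks-mercury`, the `slacks` user and group, and the repository URL are placeholders.

```nix
{ config, pkgs, ... }:
{
  # One of several runners for the same repo: unique attribute name, unique `name`, own token.
  services.github-runners.slacks-mercury = {
    enable = true;
    name = "mercury";
    url = "https://github.com/OWNER/slacks"; # placeholder owner
    tokenFile = config.sops.secrets."github/runners/slacks/mercury".path;
    user = "slacks";  # hypothetical user shared by all slacks runners
    group = "slacks";
    extraPackages = [ pkgs.gh ]; # CLI tools needed by workflows
  };
}
```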
110 | | - |
111 | | -**Polkit:** If a runner's workflow restarts a systemd service, add a polkit rule in `machines/tom/security/polkit/50-runners.rules` granting the runner's user permission to manage that unit. Runners use `NoNewPrivileges` which blocks sudo, so polkit is the only option. |
112 | | - |
113 | | -### Systemd Services (tom) |
114 | | - |
115 | | -Custom systemd services live in `machines/tom/systemd/services/`. Each service requires a corresponding user and group declared in `machines/tom/configuration.nix` under `users.users` and `users.groups`. Services with secrets also need sops declarations in the same file. Follow the pattern of existing services (snaek, tails, todos) when adding new ones. |
116 | | - |
117 | | -**Non-root services using `nix run`** need a writable cache directory. Set `CacheDirectory = "{service}"` in `serviceConfig`, and set both `HOME` and `XDG_CACHE_HOME` to `/var/cache/{service}` in `environment`. Without this, `nix run` fails because the user has no home directory or cache path. |
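A minimal sketch of the cache directory setup for a hypothetical `myservice` unit (the name is a placeholder, and the rest of the unit, such as `ExecStart`, is omitted):

```nix
{
  systemd.services.myservice = {
    environment = {
      # Point nix at the directory systemd creates for CacheDirectory below.
      HOME = "/var/cache/myservice";
      XDG_CACHE_HOME = "/var/cache/myservice";
    };
    serviceConfig = {
      User = "myservice";
      Group = "myservice";
      CacheDirectory = "myservice"; # systemd creates /var/cache/myservice owned by the service user
    };
  };
}
```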
118 | | - |
119 | | -**Privileged ports** (below 1024) require `AmbientCapabilities = [ "CAP_NET_BIND_SERVICE" ]` in `serviceConfig` when the service runs as a non-root user. |
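For a non-root service that binds a port below 1024, the capability can be granted as in this sketch (`myservice` is again a placeholder name):

```nix
{
  systemd.services.myservice.serviceConfig = {
    User = "myservice";
    # Allow binding ports below 1024 without running as root.
    AmbientCapabilities = [ "CAP_NET_BIND_SERVICE" ];
  };
}
```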
120 | | - |
121 | | -### Polkit |
122 | | - |
123 | | -Use polkit (not sudo) when services need to restart other services. GitHub runners set `NoNewPrivileges` internally via prctl, which blocks sudo regardless of systemd overrides. Polkit rules in `machines/tom/security/polkit/` grant specific users permission to manage specific units without privilege escalation. |
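A polkit rule of the kind described here might look like the sketch below. It is written inline through the NixOS `security.polkit.extraConfig` option purely for illustration; this repository keeps its rules in files under `machines/tom/security/polkit/`, and the user name `todos-runner` and the unit `todos.service` are hypothetical.

```nix
{
  security.polkit.extraConfig = ''
    // Let one specific user restart one specific unit, and nothing else.
    polkit.addRule(function(action, subject) {
      if (action.id == "org.freedesktop.systemd1.manage-units" &&
          action.lookup("unit") == "todos.service" &&
          action.lookup("verb") == "restart" &&
          subject.user == "todos-runner") {
        return polkit.Result.YES;
      }
    });
  '';
}
```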
124 | | - |
125 | | -### NixOS Privilege Escalation |
126 | | - |
127 | | -- Sudo binary lives at `/run/wrappers/bin/sudo` (setuid wrapper, not in nix store) |
128 | | -- `security.sudo.execWheelOnly = true` restricts sudo execution to wheel group at filesystem level |
129 | | -- `pkgs.sudo` in extraPackages is not setuid, so it cannot escalate; use the full path to the wrapper instead |
130 | | -- GitHub runners sandbox with `NoNewPrivileges`, `PrivateUsers`, `RestrictSUIDSGID` - these block sudo even with `serviceOverrides` |
131 | | - |
132 | | -### Soft Serve (tom) |
133 | | - |
134 | | -SSH-only git server at `machines/tom/services/soft-serve/`. Runs as static user `git` (not DynamicUser) on port 23231. Cloud proxy forwards port 22 → tom:23231 via nginx TCP stream. Key quirks: |
135 | | -- `initial_admin_keys` only applies on first DB init — wipe `/var/lib/soft-serve` to re-initialize |
136 | | -- NixOS module sets `UMask = "0027"` stripping execute from hooks — override with `lib.mkForce "0022"` |
137 | | -- Use `--sync-hooks` flag to fix stale nix store paths after rebuilds |
138 | | -- Restic backups to S3 bucket `tom.git` |
139 | | - |
140 | | -## Working Conventions |
141 | | - |
142 | | -- Use `man configuration.nix` for NixOS option documentation instead of searching online. |
143 | | -- Never read files from the nix store. |
144 | | -- Use the Plan subagent for designing implementation approaches instead of the Explore subagent. |
145 | | -- Cloud infrastructure, tom, and tom's GitHub runners are interconnected. Use existing patterns across these as reference when making changes. |
146 | | -- Maintain alphabetical ordering throughout config files: imports, firewall ports, security groups, Route53 records, nginx streams, users/groups. |
| 1 | +@../AGENTS.md |