Goal
Make ghst run on macOS and Windows, not just Linux/PipeWire.
Current state
Only one place in the code is hard-coded to Linux: startCapture / findDefaultSink / ensureMonitorVolumeFull in src/main/index.ts (lines ~66-110). It spawns pw-record and shells out to pactl. The platform guard at line 68 throws on non-Linux.
Everything else is already cross-platform:
- setContentProtection(true) is already gated on darwin / win32 (src/main/index.ts:161).
- Hotkeys use CommandOrControl+....
- keyStore.ts uses Electron safeStorage: Keychain (macOS) / DPAPI (Windows) work out of the box.
- The VAD / Whisper / copilot pipeline in src/core/* and the worker renderer is pure web APIs.
Missing:
- mac and win targets in package.json#build.
- Cross-platform audio capture strategy.
- CI matrix / signing / notarization.
Proposed approach: electron-audio-loopback in the worker renderer
The README and CLAUDE.md already reference this as the planned cross-platform path. It exposes getDisplayMedia({ audio: true }) with system-audio loopback on macOS 13+ (via ScreenCaptureKit) and Windows 10+ (WASAPI loopback). The captured MediaStream plugs directly into the existing AudioWorklet → VAD → Whisper pipeline that already lives in the worker renderer.
Pros: pure JS, no native module to build/sign/notarize.
Cons: macOS requires Screen Recording permission; a system picker may appear on first capture per session.
Native-module alternatives (ScreenCaptureKit addon on macOS, WASAPI addon on Windows) are possible but add a large build/sign burden. Skip unless the loopback UX is unacceptable.
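A minimal sketch of the renderer-side handling, assuming the loopback stream returned by getDisplayMedia may carry a video track alongside the system audio. The helper name keepLoopbackAudioOnly and the TrackLike / StreamLike types are hypothetical; they mirror a small subset of the DOM MediaStream API so the logic can be exercised outside a browser.

```typescript
// Structural subsets of the DOM MediaStreamTrack / MediaStream types, declared
// locally (assumption) so the sketch type-checks and runs without a browser.
interface TrackLike {
  kind: string;
  stop(): void;
}

interface StreamLike {
  getVideoTracks(): TrackLike[];
  getAudioTracks(): TrackLike[];
  removeTrack(track: TrackLike): void;
}

// Hypothetical helper: stop and drop every video track, then return the first
// audio track so only system audio reaches the VAD pipeline.
function keepLoopbackAudioOnly(stream: StreamLike): TrackLike {
  for (const track of stream.getVideoTracks()) {
    track.stop();
    stream.removeTrack(track);
  }
  const [audio] = stream.getAudioTracks();
  if (!audio) {
    throw new Error("Loopback stream has no audio track");
  }
  return audio;
}
```

In the worker renderer this would presumably be called on the stream obtained from navigator.mediaDevices.getDisplayMedia, with the returned track handed to the existing AudioWorklet input.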
Plan
- Refactor startCapture into a CaptureStrategy interface under src/main/capture/, with linux-pipewire.ts (existing logic) and a new strategy that signals the worker renderer to start a getDisplayMedia capture. Pick strategy by process.platform at app.whenReady().
- On non-Linux, the worker renderer owns the audio source: it pipes the loopback MediaStreamTrack into an AudioContext at 16 kHz and feeds the existing VAD pipeline. Main just sends cmd:capture-start / cmd:capture-stop. Linux keeps the existing pw-record path in main.
- Extend package.json#build with:
  - mac: dmg + zip, arm64 and x64, hardened runtime, entitlements for audio input + screen recording, notarization config.
  - win: nsis, x64.
  - extendInfoPlist: NSMicrophoneUsageDescription and NSScreenCaptureUsageDescription.
- Extend .github/workflows/release.yml matrix with macos-14 (arm64), macos-13 (x64), windows-latest. Add secrets for Apple ID notarization (and optionally Windows code-signing cert).
- README: drop the Linux-only badge, add per-OS install sections, document the macOS Screen Recording permission step.
- Tests: unit test for the strategy selector. Capture strategies themselves are integration-only.
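The strategy selector from the plan could look roughly like the sketch below. Only CaptureStrategy, cmd:capture-start and cmd:capture-stop come from this issue; the function names, the SendToWorker type, and the shape of the deps argument are assumptions chosen so the selector can be unit-tested without Electron.

```typescript
// Sketch only: apart from CaptureStrategy and the two cmd:* channel names,
// every identifier here is hypothetical and not existing ghst code.
interface CaptureStrategy {
  readonly name: string;
  start(): Promise<void>;
  stop(): Promise<void>;
}

// Stand-in for whatever main uses to message the worker renderer
// (for example webContents.send); injected to keep the selector testable.
type SendToWorker = (channel: string) => void;

// Non-Linux strategy: main only signals the worker, which owns the audio source.
function createRendererLoopbackStrategy(send: SendToWorker): CaptureStrategy {
  return {
    name: "renderer-loopback",
    async start() {
      send("cmd:capture-start");
    },
    async stop() {
      send("cmd:capture-stop");
    },
  };
}

// Picks a strategy from a process.platform value. The Linux factory is passed
// in so the existing pw-record logic stays in its own module.
function selectCaptureStrategy(
  platform: string,
  deps: { linux: () => CaptureStrategy; send: SendToWorker },
): CaptureStrategy {
  switch (platform) {
    case "linux":
      return deps.linux();
    case "darwin":
    case "win32":
      return createRendererLoopbackStrategy(deps.send);
    default:
      throw new Error(`Unsupported platform for audio capture: ${platform}`);
  }
}
```

Passing the platform string and dependencies in, rather than reading process.platform inside the function, is what lets the unit test cover all three operating systems from a single CI runner.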
Open questions
- Minimum macOS version: recommend 13+ (clean ScreenCaptureKit loopback). Drop 12?
- Apple Developer ID for notarization ($99/yr) — required to ship without Gatekeeper warnings. Acceptable?
- Windows code-signing cert ($200-400/yr EV) — optional but avoids SmartScreen warnings.
Out of scope for this issue
- Linux/Wayland ghost-mode (still no per-window screencast exclude API).
- Migrating away from Groq Whisper.