ghst

Live system-audio transcription overlay. ghst taps whatever your speakers are playing (Meet, Zoom, a browser, Spotify, anything mixed to the default sink), runs Silero VAD so that only real speech reaches Whisper (which cuts down on hallucinated captions), streams chunks to Groq whisper-large-v3-turbo for low-latency captions, and can suggest a copilot reply at the end of each turn, all in a transparent, always-on-top window.

A free, open-source alternative to paid meeting copilots like Cluely and Parakeet. You pay only for the Groq API calls (Groq's free tier is generous enough for casual use).

🚧 WIP: v0.1, Linux-only. macOS and Windows are stubbed for the future; the app exits cleanly on those platforms.

Install

Pre-built Linux artifacts (x86_64) are published on the Releases page.

AppImage (any distro)

wget https://github.com/bnovik0v/ghst/releases/latest/download/ghst-x86_64.AppImage
chmod +x ghst-x86_64.AppImage
./ghst-x86_64.AppImage

.deb (Ubuntu / Debian / Mint)

wget https://github.com/bnovik0v/ghst/releases/latest/download/ghst_amd64.deb
sudo apt install ./ghst_amd64.deb
ghst

The .deb declares all runtime deps. For the AppImage, install them yourself:

sudo apt install pipewire-bin pulseaudio-utils libgtk-3-0 libnotify4 libnss3 libxss1 libxtst6 xdg-utils libasound2

Runtime requirements

Linux with PipeWire (Ubuntu 22.10+, Fedora 34+, Arch, recent Mint/Pop!_OS) — PulseAudio-only systems aren't supported.
pipewire-bin (provides pw-record) and pulseaudio-utils (provides pactl).
A free Groq API key — sign up at console.groq.com/keys.
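
To make the pw-record requirement concrete, here is a rough sketch of how a capture command line might be assembled so that PipeWire records what the speakers play rather than a microphone. The helper name `buildPwRecordArgs`, the `CaptureOptions` shape, and the chosen format, rate, and channel count are assumptions for illustration, not ghst's actual settings. Only the `stream.capture.sink=true` property is taken from this README.

```typescript
// Hypothetical helper; option values are assumptions, not ghst's real settings.
interface CaptureOptions {
  sampleRate: number; // e.g. 16000
  channels: number;   // 1 = mono
  output: string;     // a file path, or "-" on pw-record builds that can write to stdout
}

function buildPwRecordArgs(opts: CaptureOptions): string[] {
  return [
    // Ask PipeWire to capture from a sink's monitor instead of a microphone.
    "-P", "{ stream.capture.sink=true }",
    "--format", "f32",
    "--rate", String(opts.sampleRate),
    "--channels", String(opts.channels),
    opts.output,
  ];
}

const args = buildPwRecordArgs({ sampleRate: 16000, channels: 1, output: "capture.wav" });
console.log(["pw-record", ...args].join(" "));
```

In a Node or Electron main process, the resulting array could be passed to `spawn("pw-record", args)` from `node:child_process`. Whether your pw-record version supports streaming to stdout is worth checking with `pw-record --help` before relying on it.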

Setup (first run)

Launch ghst. The overlay appears at the top of your screen.
The Settings dialog opens automatically on first launch — paste your Groq API key and click Save.
Press Ctrl+Shift+Space to start listening to your system audio.
Play some audio (a video call, a podcast, anything routed to your speakers). Captions appear in the overlay, and the copilot suggests a reply after each turn.

Your API key is encrypted via your OS keyring (libsecret / gnome-keyring on Linux) and saved to ~/.config/ghst/config.json. It is never sent anywhere except directly to Groq.

Re-opening Settings

Click the ⚙ settings button on the overlay's hotkey row. From there you can:

Save — replace the stored key with a new one.
Remove key — wipe the saved key from disk (the app will fall back to the GROQ_API_KEY env var if you have one set, otherwise prompt again).
Close — dismiss without changes (also: Esc or click outside the dialog).
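
The fallback order described for Remove key (saved key first, then the GROQ_API_KEY environment variable, then prompting again) can be sketched as a small pure function. The names `resolveApiKey` and `ResolvedKey` are hypothetical, and the precedence of a saved key over the environment variable is inferred from the wording above rather than confirmed from the code.

```typescript
// Hypothetical helper; names and precedence are illustrative assumptions.
type KeySource = "stored" | "env" | "none";

interface ResolvedKey {
  source: KeySource;
  key: string | null; // null means the app should open Settings and prompt
}

function resolveApiKey(
  storedKey: string | null,
  env: Record<string, string | undefined>,
): ResolvedKey {
  const stored = storedKey ? storedKey.trim() : "";
  if (stored) return { source: "stored", key: stored };

  const rawEnv = env["GROQ_API_KEY"];
  const fromEnv = rawEnv ? rawEnv.trim() : "";
  if (fromEnv) return { source: "env", key: fromEnv };

  return { source: "none", key: null };
}
```

Passing the environment in as a parameter (instead of reading `process.env` directly) keeps the function free of Node and Electron dependencies, which suits the unit-testing approach described under Architecture.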

About you (persona)

The Settings dialog has an About you textarea. Anything you type there (name, role, current company, key projects, strong opinions, tone you want the copilot to take) is injected as background context into every copilot reply, so suggestions stay grounded in your specifics rather than generic boilerplate. Capped at 4000 characters. Stored in plaintext in config.json (it's not a secret); leave it blank to opt out. Edits take effect on the next copilot turn — no restart needed.
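
As an illustration of how the persona text might reach the model, the sketch below folds it into the system message of a chat request and applies the 4000-character cap. The function name `buildCopilotMessages`, the message layout, and the "Background about the user" wording are assumptions. ghst's real prompt is not shown in this README.

```typescript
// Illustrative sketch; prompt wording and message layout are assumptions.
const PERSONA_MAX_CHARS = 4000; // cap stated in the paragraph above

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

function buildCopilotMessages(
  basePrompt: string,
  persona: string,
  transcript: string,
): ChatMessage[] {
  const trimmed = persona.trim().slice(0, PERSONA_MAX_CHARS);
  // A blank persona means the user opted out: send the base prompt unchanged.
  const system = trimmed
    ? basePrompt + "\n\nBackground about the user:\n" + trimmed
    : basePrompt;
  return [
    { role: "system", content: system },
    { role: "user", content: transcript },
  ];
}
```

Because the messages are rebuilt on every call, an edited persona would apply from the next copilot turn without a restart, which matches the behaviour described above.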

Saving transcripts

Toggle Save transcripts to disk in Settings to write a plain-text transcript every time you stop listening. Each session lands in a timestamped .txt file under your chosen folder (defaults to ~/Documents/ghst/transcripts/). Use Browse… to pick a different folder, Open folder to reveal it in your file manager, or Reset to default to restore the default path. Transcripts are written locally only — nothing is uploaded.
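
A timestamped file name can be derived from the session start time along these lines. The exact naming scheme ghst uses is not documented here, so the `YYYY-MM-DD_HH-MM-SS.txt` pattern and the name `transcriptFileName` are assumptions chosen only to show the idea.

```typescript
// Hypothetical naming scheme; ghst's actual file names may differ.
function transcriptFileName(startedAt: Date): string {
  const pad = (n: number): string => ("0" + n).slice(-2);
  const date =
    startedAt.getFullYear() + "-" + pad(startedAt.getMonth() + 1) + "-" + pad(startedAt.getDate());
  // Hyphens instead of colons keep the name valid on every common filesystem.
  const time =
    pad(startedAt.getHours()) + "-" + pad(startedAt.getMinutes()) + "-" + pad(startedAt.getSeconds());
  return date + "_" + time + ".txt";
}

console.log(transcriptFileName(new Date()));
```

Names in this form sort chronologically in a file manager, which is convenient when many sessions accumulate in one folder.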

Hotkeys

Combo	Action
`Ctrl+Shift+Space`	Start / stop listening
`Ctrl+Shift+Enter`	Ask copilot now (manual)
`Ctrl+Shift+C`	Clear transcript
`Ctrl+Shift+L`	Show / hide overlay

Building from source

git clone https://github.com/bnovik0v/ghst.git
cd ghst
npm ci
npm run dev          # launches Electron in dev mode
npm test             # unit tests
npm run typecheck
npm run dist:linux   # produces dist/*.AppImage and dist/*.deb

For verbose logging set DEBUG=ghst in the environment, or in DevTools run localStorage.setItem("ghst:debug", "1") and reload.

Architecture

Three Electron processes with strict boundaries — this is why the API key never leaves main and why the heavy audio work survives Chromium throttling.

main — owns the pw-record child process, both BrowserWindows, global shortcuts, the encrypted key store (Electron safeStorage), and IPC routing between worker → overlay.
worker renderer (hidden) — runs Silero VAD over the PCM stream, encodes Float32 → 16-bit WAV, calls Groq Whisper, applies LocalAgreement-2 for live committed/tentative captions, filters hallucinations + backchannels, optionally streams a copilot reply.
overlay renderer (transparent, frameless, always-on-top) — renders rolling captions, copilot cards, the listen toggle, and the Settings dialog.
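
The worker's Float32 to 16-bit WAV step can be sketched as follows. This is a generic mono PCM encoder written for illustration, not a copy of wav.ts. The function name `encodeWav` and the scaling of negative and positive samples are assumptions.

```typescript
// Minimal mono 16-bit PCM WAV encoder; a sketch, not ghst's wav.ts.
function encodeWav(samples: Float32Array, sampleRate: number): ArrayBuffer {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true);                // RIFF chunk size
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);                          // fmt chunk size
  view.setUint16(20, 1, true);                           // audio format: PCM
  view.setUint16(22, 1, true);                           // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);              // block align
  view.setUint16(34, 16, true);                          // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  samples.forEach((sample, i) => {
    // Clamp to [-1, 1] so out-of-range input cannot wrap around in int16.
    const s = Math.max(-1, Math.min(1, sample));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  });
  return buffer;
}
```

The returned buffer could be wrapped in a `Blob` and attached to a multipart upload. All multi-byte fields are little-endian, as the RIFF format requires.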

Pure logic is isolated in src/core/* (no Electron imports) so it's all unit-testable: wav.ts, groq.ts, copilot.ts, stream.ts (LocalAgreement), transcript.ts (ring buffer + hallucination filter).
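
The idea behind LocalAgreement-2 is that a word is committed only once two consecutive hypotheses for the same audio agree on it; everything after the agreed prefix stays tentative. The sketch below works at the word level under simplifying assumptions (each hypothesis re-transcribes the buffer from its start, comparison ignores case and basic punctuation). The names `AgreementState` and `localAgreement2` are hypothetical and do not come from stream.ts.

```typescript
// Word-level LocalAgreement-2 sketch; not ghst's stream.ts.
interface AgreementState {
  committed: string[]; // words that will no longer change
  previous: string[];  // words of the last hypothesis
  tentative: string[]; // words shown but still allowed to change
}

const normalizeWord = (w: string): string => w.toLowerCase().replace(/[.,!?;:"]/g, "");

function localAgreement2(state: AgreementState, hypothesis: string): AgreementState {
  const words = hypothesis.split(/\s+/).filter((w) => w.length > 0);

  // Length of the common prefix between the previous and current hypotheses.
  let agree = 0;
  while (
    agree < words.length &&
    agree < state.previous.length &&
    normalizeWord(words[agree]) === normalizeWord(state.previous[agree])
  ) {
    agree++;
  }

  // Committed text only ever grows; it is never retracted.
  const committedLen = Math.max(state.committed.length, agree);
  const committed = state.committed.concat(words.slice(state.committed.length, committedLen));
  return { committed, previous: words, tentative: words.slice(committedLen) };
}

let s: AgreementState = { committed: [], previous: [], tentative: [] };
s = localAgreement2(s, "hello world this");
s = localAgreement2(s, "Hello world, this is");
console.log(s.committed.join(" "), "|", s.tentative.join(" "));
```

In an overlay, `committed` would be rendered as stable text and `tentative` in a dimmer style, since the next hypothesis may still revise it.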

Ghost mode (invisible to screenshare)

macOS / Windows: would use setContentProtection(true) — the window is excluded from screen capture. (Not yet supported on those platforms.)
Linux (Wayland): not possible. There is no per-window exclude API in xdg-desktop-portal. The overlay is visible to any screenshare. Workarounds: second monitor, second device, or external preview.

Troubleshooting

"Captions never appear / VAD never fires." The default sink's monitor source may be muted or attenuated. ghst force-sets it to 100% on start, but if a session manager re-attenuates it, captions will stop. Verify:

pactl get-default-sink
pactl list sources | grep -A 5 monitor

"pw-record: command not found." Install pipewire-bin. The .deb package declares this dep; AppImage users on bare systems need to install it manually.

"App says system-audio capture is Linux-only." Correct — v0.1 ships Linux-only. macOS/Windows support is on the roadmap.

"It captures my microphone instead of system audio." This means PipeWire didn't honor the stream.capture.sink=true property. Make sure you're on PipeWire (not pure PulseAudio): pactl info | grep "Server Name" should mention PipeWire.

Tests

npm test            # one-shot
npm run test:watch  # watch mode
npx vitest run -t "<pattern>"

Pure modules covered:

wav.ts — RIFF/WAVE header, clamping, int16 encoding
groq.ts — multipart form shape, bearer auth, error surfacing
transcript.ts — hallucination filter, backchannel detection, ring buffer
stream.ts — LocalAgreement word- and token-level prefix agreement
copilot.ts — SSE delta parsing, message assembly, AbortSignal handling
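
For readers unfamiliar with SSE delta parsing, the following sketch shows the general shape of the task for an OpenAI-compatible streaming response, where each `data:` line carries a JSON object with `choices[0].delta.content` and the stream ends with `data: [DONE]`. It is an illustration with a hypothetical function name, `parseSseDeltas`, and it assumes the input contains only complete lines. A real client would also buffer partial lines across network chunks.

```typescript
// Illustrative SSE delta parser; not ghst's copilot.ts.
interface SseParseResult {
  deltas: string[]; // text fragments in arrival order
  done: boolean;    // true once the [DONE] sentinel was seen
}

function parseSseDeltas(chunk: string): SseParseResult {
  const deltas: string[] = [];
  let done = false;
  for (const rawLine of chunk.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const parsed = JSON.parse(payload);
      const content = parsed?.choices?.[0]?.delta?.content;
      if (typeof content === "string") deltas.push(content);
    } catch {
      // Malformed JSON is ignored in this sketch.
    }
  }
  return { deltas, done };
}

const sample =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n' +
  "data: [DONE]\n";
console.log(parseSseDeltas(sample).deltas.join(""));
```

Joining the deltas in order yields the assembled message, which is the part a test for message assembly would assert on.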
