Java (Javalin) demo app for Deepgram Live Transcription.
- Backend: Java (Javalin) (Java) on port 8081
- Frontend: Vite + vanilla JS on port 8080 (git submodule:
live-transcription-html) - API type: WebSocket —
WS /api/live-transcription - Deepgram API: Live Speech-to-Text (
wss://api.deepgram.com/v1/listen) - Auth: JWT session tokens via
/api/session(WebSocket auth usesaccess_token.<jwt>subprotocol)
| File | Purpose |
|---|---|
src/main/java/com/deepgram/starter/App.java |
Main backend — API endpoints and WebSocket proxy |
deepgram.toml |
Metadata, lifecycle commands, tags |
Makefile |
Standardized build/run targets |
sample.env |
Environment variable template |
frontend/main.js |
Frontend logic — UI controls, WebSocket connection, audio streaming |
frontend/index.html |
HTML structure and UI layout |
deploy/Dockerfile |
Production container (Caddy + backend) |
deploy/Caddyfile |
Reverse proxy, rate limiting, static serving |
# Initialize (clone submodules + install deps)
make init
# Set up environment
test -f .env || cp sample.env .env # then set DEEPGRAM_API_KEY
# Start both servers
make start
# Backend: http://localhost:8081
# Frontend: http://localhost:8080Start (recommended):
make startStart separately:
# Terminal 1 — Backend
mvn compile exec:java
# Terminal 2 — Frontend
cd frontend && corepack pnpm run dev -- --port 8080 --no-openStop all:
lsof -ti:8080,8081 | xargs kill -9 2>/dev/nullClean rebuild:
rm -rf target frontend/node_modules frontend/.vite
make init- Backend:
pom.xml— Uses Maven for dependency management. Javalin framework for HTTP/WebSocket. - Frontend:
frontend/package.json— Vite dev server - Submodules:
frontend/(live-transcription-html),contracts/(starter-contracts)
Install: mvn dependency:resolve
Frontend: cd frontend && corepack pnpm install
| Endpoint | Method | Auth | Purpose |
|---|---|---|---|
/api/session |
GET | None | Issue JWT session token |
/api/metadata |
GET | None | Return app metadata (useCase, framework, language) |
/api/live-transcription |
WS | JWT | Streams microphone audio to Deepgram for real-time transcription. |
The WebSocket connection URL passes parameters to Deepgram. Find where the Deepgram WebSocket URL is constructed in the backend and modify defaults:
| Parameter | Default | Options | Effect |
|---|---|---|---|
model |
nova-3 |
nova-3, nova-2, base |
STT model |
language |
en |
Any BCP-47 code | Transcription language |
smart_format |
true |
true/false |
Smart formatting |
encoding |
linear16 |
linear16, opus, flac |
Audio encoding |
sample_rate |
16000 |
8000, 16000, 44100, 48000 |
Audio sample rate |
channels |
1 |
1, 2 |
Mono or stereo |
These can be appended to the Deepgram WebSocket URL as query parameters:
| Feature | Parameter | Example | Effect |
|---|---|---|---|
| Interim results | interim_results |
true |
Show partial transcripts while speaking |
| Endpointing | endpointing |
300 |
Silence duration (ms) before finalization |
| Utterance end | utterance_end_ms |
1000 |
Detect end of utterance |
| VAD events | vad_events |
true |
Voice activity detection events |
| Diarization | diarize |
true |
Speaker identification |
| Punctuation | punctuate |
true |
Auto-punctuation |
| Keywords | keywords |
deepgram:2 |
Boost keyword with weight |
| No delay | no_delay |
true |
Minimize latency (may reduce accuracy) |
Backend: Append params to the Deepgram URL in the WebSocket proxy handler.
Frontend: The frontend sends these as query params when opening the WebSocket. To add a UI control for a new param, edit frontend/main.js — add an input/checkbox and include it in the URLSearchParams when connecting.
If changing from browser microphone (Linear16) to another source:
- Update
encodingandsample_rateparams - The frontend captures audio via
AudioContextat 16kHz and converts Float32 → Int16 PCM - If your audio source uses a different format, modify the frontend audio processing pipeline
The frontend is a git submodule from deepgram-starters/live-transcription-html. To modify:
- Edit files in
frontend/— this is the working copy - Test locally — changes reflect immediately via Vite HMR
- Commit in the submodule:
cd frontend && git add . && git commit -m "feat: description" - Push the frontend repo:
cd frontend && git push origin main - Update the submodule ref:
cd .. && git add frontend && git commit -m "chore(deps): update frontend submodule"
IMPORTANT: Always edit frontend/ inside THIS starter directory. The standalone live-transcription-html/ directory at the monorepo root is a separate checkout.
- Add the HTML element in
frontend/index.html(input, checkbox, dropdown, etc.) - Read the value in
frontend/main.jswhen making the API call or opening the WebSocket - Pass it as a query parameter in the WebSocket URL
- Handle it in the backend
src/main/java/com/deepgram/starter/App.java— read the param and pass it to the Deepgram API
| Variable | Required | Default | Purpose |
|---|---|---|---|
DEEPGRAM_API_KEY |
Yes | — | Deepgram API key |
PORT |
No | 8081 |
Backend server port |
HOST |
No | 0.0.0.0 |
Backend bind address |
SESSION_SECRET |
No | — | JWT signing secret (production) |
All commits must follow conventional commits format. Never include Co-Authored-By lines for Claude.
feat(java-live-transcription): add diarization support
fix(java-live-transcription): resolve WebSocket close handling
refactor(java-live-transcription): simplify session endpoint
chore(deps): update frontend submodule
# Run conformance tests (requires app to be running)
make test
# Manual endpoint check
curl -sf http://localhost:8081/api/metadata | python3 -m json.tool
curl -sf http://localhost:8081/api/session | python3 -m json.tool