Lightweight Go wrapper around the Silero Voice Activity Detector using ONNX Runtime and miniaudio.
This library enables real-time or file-based voice activity detection (VAD) in Go. It wraps the Silero VAD model and provides microphone input/output and WAV file integration with native audio processing.
- 🧠 ONNX-based Silero VAD inference
- 🎙️ Real-time microphone speech detection
- 📁 WAV file-based speech detection
- 🔊 Microphone and WAV playback support
- 🔄 Optional WAV conversion (resample, format, channels)

Requirements:
- Go 1.20+
- ONNX Runtime installed on your system
- Silero VAD model .onnx file (see below)
This library uses yalue/onnxruntime_go.
Please follow its installation instructions to correctly set up ONNX Runtime.

    go get github.com/plandem/silero-go

    mkdir -p models
    wget -O models/silero_vad.onnx https://github.com/snakers4/silero-vad/files/7603706/silero_vad.onnx

    import "github.com/plandem/silero-go/onnx"

    onnx.Init("/path/to/libonnxruntime.so")
    defer onnx.Destroy()

    import (
        "github.com/plandem/silero-go/vad"
        "time"
    )

    model, _ := vad.NewModel(16000, "models/silero_vad.onnx")
    detector, _ := vad.NewDetector(model, vad.Config{
        SpeechThreshold: 0.5,
        MinSilence:      100 * time.Millisecond,
        SpeechPad:       30 * time.Millisecond,
    }, func(start, end vad.SampleOffset) {
        // Called on speech start/end
    })
    defer detector.Destroy()

You can run detection from:
- Microphone input (via miniaudio.Capture)
- WAV files (any reader implementing io.Reader with raw PCM samples)

    detector.ReadFrom(myInputSource)

To ensure compatibility with the Silero model:
- Sample Rate: 16kHz
- Channels: Mono (1)
- Format: 32-bit float PCM
Use a converter if your audio input doesn't match. This library includes minimal utilities for conversion using malgo.
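
To make the format requirements above concrete, here is a minimal sketch of what such a conversion involves, assuming interleaved 16-bit stereo input at 48 kHz. The functions `StereoInt16ToMonoFloat32` and `ResampleLinear` are hypothetical names written for this illustration; they are not part of silero-go or malgo. The resampler uses plain linear interpolation without a low-pass filter, so it may alias on real recordings, and a proper converter is likely the better choice in practice.

```go
package main

import "fmt"

// StereoInt16ToMonoFloat32 (hypothetical helper) converts interleaved
// 16-bit stereo samples (L, R, L, R, ...) to mono float32 samples in [-1, 1].
// A trailing unpaired sample is ignored.
func StereoInt16ToMonoFloat32(in []int16) []float32 {
	out := make([]float32, len(in)/2)
	for i := range out {
		// Sum in int32 so that adding two int16 values cannot overflow.
		sum := int32(in[2*i]) + int32(in[2*i+1])
		// Dividing by 2 averages the channels; dividing by 32768 normalizes.
		out[i] = float32(sum) / (2 * 32768)
	}
	return out
}

// ResampleLinear (hypothetical helper) resamples mono audio from srcRate to
// dstRate using linear interpolation. No anti-aliasing filter is applied.
func ResampleLinear(in []float32, srcRate, dstRate int) []float32 {
	if len(in) == 0 || srcRate <= 0 || dstRate <= 0 {
		return nil
	}
	n := int(int64(len(in)) * int64(dstRate) / int64(srcRate))
	out := make([]float32, n)
	step := float64(srcRate) / float64(dstRate)
	for i := range out {
		pos := float64(i) * step // position in the source signal
		j := int(pos)
		if j >= len(in) {
			j = len(in) - 1
		}
		next := j + 1
		if next >= len(in) {
			next = len(in) - 1
		}
		frac := float32(pos - float64(j))
		out[i] = in[j]*(1-frac) + in[next]*frac
	}
	return out
}

func main() {
	// One second of silent 48 kHz stereo audio (2 samples per frame).
	stereo := make([]int16, 2*48000)
	mono := StereoInt16ToMonoFloat32(stereo)
	resampled := ResampleLinear(mono, 48000, 16000)
	fmt.Println(len(mono), len(resampled))
}
```

The resulting `[]float32` slice is mono, 16 kHz, 32-bit float, which matches the three requirements listed above. How those samples are then handed to the detector depends on the reader you pass to `detector.ReadFrom`.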
MIT License © 2025 Andrei Gaivoronskii