Extract sharp, GPS-tagged frames from drone video for photogrammetry.
Given a drone video and its telemetry (DJI .SRT sidecar by default), the pipeline picks the sharpest non-redundant frames using a Laplacian blur score and a GPS-distance filter, embeds full EXIF GPS + camera metadata in each output JPEG, and runs an overlap quality check ready for Metashape, OpenDroneMap, RealityCapture, or any structure-from-motion tool.
```
.MP4 + .SRT          ──► parse-drone-srt      ──► telemetry.csv (lat, lon, alt, focal_len, fnum)
                                               │
.MP4 + telemetry.csv ──► extract-frames       ──► images/*.jpg (EXIF GPS embedded)
                                               │  dataset_summary.csv
                                               │  dataset_geotags.csv
                                               │
images/*.jpg         ──► check-overlap        ──► QA report (PASS / MARGINAL / FAIL)
                                               │
images/*.jpg         ──► Metashape / ODM / RC ──► point cloud (.las) + DEM + orthomosaic
                                               │
.las + .tif          ──► reproject-crs        ──► outputs in any target CRS (geoid optional)
```
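The Laplacian blur score mentioned in the introduction is commonly computed as the variance of the image Laplacian: sharp frames have strong edges and therefore a high variance. The following is a minimal NumPy sketch of that idea, not the pipeline's actual implementation; the function name `laplacian_variance` and the synthetic test images are assumptions made for illustration.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Variance of the 4-neighbour Laplacian of a 2-D grayscale image."""
    g = gray.astype(np.float64)
    # Discrete Laplacian on interior pixels: up + down + left + right - 4 * centre.
    lap = (
        g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


# Synthetic demo: a noisy "sharp" image and a softened copy of it.
rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
# Crude 3x3 box blur built from shifted copies (wraps at the borders).
blurred = sum(
    np.roll(np.roll(sharp, dy, axis=0), dx, axis=1)
    for dy in (-1, 0, 1)
    for dx in (-1, 0, 1)
) / 9.0

print(laplacian_variance(sharp), laplacian_variance(blurred))
```

The softened copy scores lower than the original, which is the property a blur threshold relies on. Absolute scores depend on resolution and scene content, so thresholds generally need tuning per dataset.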
Tested with DJI Mavic 3 (4K/60 fps H.265 + DJI subtitle SRT). Other DJI drones that emit the same SRT subtitle format (Mavic 3 Pro, Air 3, Mini 4 Pro, etc.) should work without changes. For non-DJI drones, either:
- adapt the regex in `parse_drone_srt.py`, or
- bypass the parser and supply your own CSV with the columns `latitude, longitude, abs_alt, rel_alt, focal_len, fnum, timestamp`.
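For the second option, a telemetry CSV with the listed columns can be written with Python's standard `csv` module. This is a sketch under assumptions: the helper name `write_telemetry_csv`, the sample values, and in particular the `timestamp` format and the units of each field are illustrative guesses, so check what `extract-frames` expects before relying on them.

```python
import csv
import io

# Column names and order as listed above.
COLUMNS = ["latitude", "longitude", "abs_alt", "rel_alt", "focal_len", "fnum", "timestamp"]


def write_telemetry_csv(rows, fileobj):
    """Write telemetry rows (dicts) with exactly the expected columns."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        missing = [c for c in COLUMNS if c not in row]
        if missing:
            raise ValueError(f"telemetry row is missing columns: {missing}")
        writer.writerow({c: row[c] for c in COLUMNS})


# Hypothetical sample row; the timestamp format here is an assumption.
rows = [
    {
        "latitude": 51.044712, "longitude": -114.071883,
        "abs_alt": 1120.4, "rel_alt": 60.0,
        "focal_len": 24.0, "fnum": 2.8,
        "timestamp": "2024-06-01 14:03:22.500",
    },
]

buf = io.StringIO()  # swap for open("my_telemetry.csv", "w", newline="") to write a file
write_telemetry_csv(rows, buf)
print(buf.getvalue())
```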
```bash
docker build -t drone-frame-extractor .
docker run --rm \
  -v /path/to/footage:/data/footage:ro \
  -v /path/to/dataset:/data/dataset \
  drone-frame-extractor \
  --flyover /data/footage --out /data/dataset --interval-sec 0.5
```

Or with docker compose (mount `./footage` and `./dataset` from the repo root):

```bash
docker compose run --rm pipeline
```

For a local install:

```bash
git clone https://github.com/0mdb/drone-frame-extractor.git
cd drone-frame-extractor
uv sync
uv run run-pipeline --help
```

System requirements for local installs: `ffmpeg` / `ffprobe`. An NVIDIA GPU is optional and enables hardware-accelerated H.265 decode (`--gpu-decode`, default on).
Run the stages individually:

```bash
parse-drone-srt DJI_0925.SRT -o DJI_0925_telemetry.csv
extract-frames \
  --video DJI_0925.MP4 \
  --telemetry DJI_0925_telemetry.csv \
  --out my_dataset \
  --prefix 0925 \
  --interval-sec 0.5
check-overlap my_dataset/dataset_geotags.csv
```

Or run the whole pipeline in one command:

```bash
run-pipeline \
  --flyover /path/to/footage \
  --out my_dataset \
  --interval-sec 0.5 \
  --min-distance-m 1.5 \
  --blur-floor 50
```

The flyover directory should contain matched pairs of `*.MP4` + `*.SRT`. The pipeline discovers all pairs, parses telemetry, extracts frames in parallel, and runs overlap QA automatically.
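To check a footage directory before a run, the matched-pair discovery described above can be approximated with `pathlib`. This is a sketch, not the pipeline's own code: the function name `find_video_srt_pairs` is hypothetical, and matching by identical file stem with case-insensitive extensions is an assumption.

```python
import tempfile
from pathlib import Path


def find_video_srt_pairs(flyover_dir):
    """Return sorted (video, srt) path pairs that share a file stem."""
    root = Path(flyover_dir)
    srts = {p.stem: p for p in root.iterdir() if p.suffix.upper() == ".SRT"}
    pairs = []
    for video in sorted(root.iterdir()):
        # A video without a same-named sidecar is skipped.
        if video.suffix.upper() == ".MP4" and video.stem in srts:
            pairs.append((video, srts[video.stem]))
    return pairs


# Demo on a throwaway directory: DJI_0926.MP4 has no sidecar.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("DJI_0925.MP4", "DJI_0925.SRT", "DJI_0926.MP4"):
        (Path(tmp) / name).touch()
    names = [(v.name, s.name) for v, s in find_video_srt_pairs(tmp)]

print(names)
```

Videos listed by `ls` but missing from this result are the ones to investigate for a missing or misnamed `.SRT` file.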
| Flag | Default | Description |
|---|---|---|
| `--interval-sec` | `1.0` | Time window (seconds) for selecting sharpest frame. 0.5 = 2 fps, 1.0 = 1 fps. |
| `--blur-thresh` | `150` | Minimum Laplacian blur score. Higher = sharper frames only. |
| `--blur-floor` | `50` | Fallback threshold for gap filling. Set equal to `--blur-thresh` to disable. |
| `--min-distance-m` | `3.0` | Primary filter: minimum GPS distance (metres) between accepted frames. |
| `--sim-thresh` | `0.98` | Secondary filter: max HSV histogram similarity to previous frame. |
| `--score-scale` | `0.25` | Resize factor for blur/histogram scoring. Does not affect output JPEG quality. |
| `--gpu-decode` | `True` | Use ffmpeg NVDEC for hardware video decode. `--no-gpu-decode` for CPU-only. |
| `--jpeg-quality` | `95` | JPEG output quality (1-100). |
| `--workers` | `0` (auto) | Parallel video workers. 0 = one per video, capped at CPU count. |
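To get a feel for `--min-distance-m`, the sketch below applies a GPS-distance filter to a list of positions using the haversine formula. It is illustrative only: the source does not say which distance formula the pipeline uses, and the names `haversine_m` and `filter_by_distance`, the spherical Earth radius, and the rule of comparing against the last accepted frame are all assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def filter_by_distance(points, min_distance_m=3.0):
    """Keep a point only if it is at least min_distance_m from the last kept point."""
    kept = []
    for lat, lon in points:
        if not kept or haversine_m(*kept[-1], lat, lon) >= min_distance_m:
            kept.append((lat, lon))
    return kept


# Hypothetical track heading north in steps of 1e-5 degrees (roughly 1.1 m each).
track = [(51.0 + i * 1e-5, -114.0) for i in range(10)]
kept = filter_by_distance(track, min_distance_m=3.0)
print(len(track), "->", len(kept))
```

Lowering the threshold keeps more frames and raises overlap at the cost of a larger dataset; raising it does the opposite.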
```
my_dataset/
├── images/
│   ├── 0925_f000000.jpg    # 3840x2160, EXIF GPS + camera metadata
│   ├── 0925_f000001.jpg
│   └── ...
├── dataset_summary.csv     # Per-frame: blur score, distance, similarity, fallback flag
└── dataset_geotags.csv     # Per-frame: lat, lon, abs_alt, rel_alt, telemetry delta
```
Each JPEG carries:
- GPS: Latitude, longitude, altitude (sub-millimetre EXIF encoding precision)
- Camera: Make, model, focal length, aperture, crop factor
- Timestamp: `DateTimeOriginal` + `SubSecTimeOriginal`
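EXIF stores GPS coordinates as degrees, minutes, and seconds, each a rational (numerator, denominator) pair, with the hemisphere in a separate reference tag (`N`/`S`, `E`/`W`). The sketch below shows how a large seconds denominator gives sub-millimetre encoding precision. It is an illustration rather than the pipeline's code: the function names and the denominator of 1,000,000 are assumptions.

```python
from fractions import Fraction


def to_exif_dms(decimal_degrees, seconds_denominator=1_000_000):
    """Encode |decimal_degrees| as ((deg, 1), (min, 1), (sec_num, sec_den))."""
    # Work in integer units of 1/seconds_denominator arc-seconds so that
    # rounding can never produce 60 seconds or 60 minutes.
    total = round(abs(decimal_degrees) * 3600 * seconds_denominator)
    degrees, rem = divmod(total, 3600 * seconds_denominator)
    minutes, sec_num = divmod(rem, 60 * seconds_denominator)
    return ((degrees, 1), (minutes, 1), (sec_num, seconds_denominator))


def hemisphere_ref(decimal_degrees, is_latitude):
    """Return the EXIF reference letter for the sign of a coordinate."""
    if is_latitude:
        return "N" if decimal_degrees >= 0 else "S"
    return "E" if decimal_degrees >= 0 else "W"


def from_exif_dms(dms):
    """Decode the rational triple back to unsigned decimal degrees."""
    d, m, s = (Fraction(n, den) for n, den in dms)
    return float(d + m / 60 + s / 3600)


lat, lon = 51.0447123, -114.0718831  # hypothetical position
lat_dms = to_exif_dms(lat)
lon_dms = to_exif_dms(lon)
print(lat_dms, hemisphere_ref(lat, True))
print(lon_dms, hemisphere_ref(lon, False))
```

One arc-second of latitude is about 31 m on the ground, so a step of one millionth of an arc-second is on the order of 0.03 mm. That is far finer than the GPS receiver's accuracy, which is why the precision claim applies to the encoding only.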
`reproject-crs` converts photogrammetry outputs (LAS / DEM / orthomosaic) to any target CRS. By default it reprojects EPSG:32612 → EPSG:6655 (NAD83(CSRS) / UTM 12N + CGVD2013 height), with the CGG2013 geoid grid auto-downloaded via PROJ. Override anything you need:
```bash
# Generic projected reprojection without vertical correction
reproject-crs --input-dir output/ \
  --source-las-epsg 32612 --target-horiz-epsg 32613 --no-geoid
```

Run the test suite:

```bash
uv run pytest  # 92 tests, all self-contained
```

See `metashape/README.md` for Agisoft Metashape Pro automation scripts (Windows, commercial licence). They are excluded from the Docker image and are not required for the core pipeline.
DRONE_CAPTURE_GUIDE.md covers altitude, speed,
overlap, camera settings, and lessons learned from real flights.
Issues and pull requests welcome at github.com/0mdb/drone-frame-extractor.
MIT — see LICENSE.