
drone-frame-extractor

Extract sharp, GPS-tagged frames from drone video for photogrammetry.

Given a drone video and its telemetry (a DJI .SRT sidecar by default), the pipeline picks the sharpest non-redundant frames using a Laplacian blur score and a GPS-distance filter, embeds full EXIF GPS and camera metadata in each output JPEG, and runs an overlap quality check, leaving a dataset ready for Metashape, OpenDroneMap, RealityCapture, or any other structure-from-motion tool.

What it does

.MP4 + .SRT ──► parse-drone-srt   ──► telemetry.csv (lat, lon, alt, focal_len, fnum)
                                          │
.MP4 + telemetry.csv ──► extract-frames ──► images/*.jpg (EXIF GPS embedded)
                                          │   dataset_summary.csv
                                          │   dataset_geotags.csv
                                          │
images/*.jpg ──► check-overlap ──► QA report (PASS / MARGINAL / FAIL)
                                          │
images/*.jpg ──► Metashape / ODM / RC ──► point cloud (.las) + DEM + orthomosaic
                                          │
.las + .tif ──► reproject-crs ──► outputs in any target CRS (geoid optional)

Drone compatibility

Tested with DJI Mavic 3 (4K/60 fps H.265 + DJI subtitle SRT). Other DJI drones that emit the same SRT subtitle format (Mavic 3 Pro, Air 3, Mini 4 Pro, etc.) should work without changes. For non-DJI drones, either:

  • adapt the regex in parse_drone_srt.py, or
  • bypass the parser and supply your own CSV with the columns latitude, longitude, abs_alt, rel_alt, focal_len, fnum, timestamp.
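For the second option, a CSV with those columns can be produced in a few lines of Python. Only the column names below come from this README; `write_telemetry_csv` is a hypothetical helper, not part of the package:

```python
import csv

# Column set extract-frames expects, per the README above.
REQUIRED_COLUMNS = ["latitude", "longitude", "abs_alt", "rel_alt",
                    "focal_len", "fnum", "timestamp"]

def write_telemetry_csv(rows, path):
    """Write telemetry rows (dicts keyed by REQUIRED_COLUMNS) to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=REQUIRED_COLUMNS)
        writer.writeheader()
        for row in rows:
            writer.writerow(row)
```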

Install

Docker (recommended)

docker build -t drone-frame-extractor .

docker run --rm \
  -v /path/to/footage:/data/footage:ro \
  -v /path/to/dataset:/data/dataset \
  drone-frame-extractor \
  --flyover /data/footage --out /data/dataset --interval-sec 0.5

Or with docker compose (mount ./footage and ./dataset from the repo root):

docker compose run --rm pipeline

Local (uv)

git clone https://github.com/0mdb/drone-frame-extractor.git
cd drone-frame-extractor
uv sync
uv run run-pipeline --help

System requirements for local installs: ffmpeg / ffprobe on PATH. An NVIDIA GPU is optional and enables hardware-accelerated H.265 decode (--gpu-decode, on by default).

Quick start

Single video

parse-drone-srt DJI_0925.SRT -o DJI_0925_telemetry.csv

extract-frames \
  --video DJI_0925.MP4 \
  --telemetry DJI_0925_telemetry.csv \
  --out my_dataset \
  --prefix 0925 \
  --interval-sec 0.5

check-overlap my_dataset/dataset_geotags.csv

Batch (all videos in a directory)

run-pipeline \
  --flyover /path/to/footage \
  --out my_dataset \
  --interval-sec 0.5 \
  --min-distance-m 1.5 \
  --blur-floor 50

The flyover directory should contain matched pairs of *.MP4 + *.SRT. The pipeline discovers all pairs, parses telemetry, extracts frames in parallel, and runs overlap QA automatically.
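The pair-discovery step can be sketched as below. This is a hypothetical stand-in for the pipeline's own matching logic, which may differ (e.g. in case handling):

```python
from pathlib import Path

def discover_pairs(flyover_dir):
    """Return (video, srt) pairs for every *.MP4 with a sidecar of the
    same stem. Videos without a sidecar are skipped."""
    pairs = []
    for video in sorted(Path(flyover_dir).glob("*.MP4")):
        for ext in (".SRT", ".srt"):
            srt = video.with_suffix(ext)
            if srt.exists():
                pairs.append((video, srt))
                break
    return pairs
```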

Frame extraction parameters

Flag               Default    Description
--interval-sec     1.0        Time window (seconds) for selecting the sharpest frame. 0.5 = 2 fps, 1.0 = 1 fps.
--blur-thresh      150        Minimum Laplacian blur score. Higher = sharper frames only.
--blur-floor       50         Fallback threshold for gap filling. Set equal to --blur-thresh to disable.
--min-distance-m   3.0        Primary filter: minimum GPS distance (metres) between accepted frames.
--sim-thresh       0.98       Secondary filter: maximum HSV histogram similarity to the previous frame.
--score-scale      0.25       Resize factor for blur/histogram scoring. Does not affect output JPEG quality.
--gpu-decode       True       Use ffmpeg NVDEC for hardware video decode. --no-gpu-decode for CPU-only.
--jpeg-quality     95         JPEG output quality (1-100).
--workers          0 (auto)   Parallel video workers. 0 = one per video, capped at CPU count.
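The two core metrics behind --blur-thresh and --min-distance-m can be sketched in pure Python. This is an illustrative stand-in for the pipeline's OpenCV/GPS code, not its actual implementation:

```python
import math

def laplacian_blur_score(gray):
    """Variance of the 4-neighbour Laplacian over a 2-D grayscale grid.
    Higher variance means more edge response, i.e. a sharper frame."""
    h, w = len(gray), len(gray[0])
    vals = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (gray[y - 1][x] + gray[y + 1][x]
                   + gray[y][x - 1] + gray[y][x + 1]
                   - 4 * gray[y][x])
            vals.append(lap)
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points,
    used to enforce the minimum spacing between accepted frames."""
    r = 6371000.0  # mean Earth radius, metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))
```

A frame is accepted only when its blur score clears the threshold and its haversine distance to the last accepted frame exceeds the minimum spacing.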

Output format

my_dataset/
├── images/
│   ├── 0925_f000000.jpg    # 3840x2160, EXIF GPS + camera metadata
│   ├── 0925_f000001.jpg
│   └── ...
├── dataset_summary.csv     # Per-frame: blur score, distance, similarity, fallback flag
└── dataset_geotags.csv     # Per-frame: lat, lon, abs_alt, rel_alt, telemetry delta

Each JPEG carries:

  • GPS: Latitude, longitude, altitude (sub-millimetre EXIF encoding precision)
  • Camera: Make, model, focal length, aperture, crop factor
  • Timestamp: DateTimeOriginal + SubSecTimeOriginal
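EXIF stores each coordinate as degree/minute/second rationals, and a large seconds denominator is what buys the high precision. A sketch of the conversion, with an assumed denominator of 100000 (one tick ≈ 0.3 mm of latitude); this is illustrative, not the pipeline's actual encoding:

```python
from fractions import Fraction

def to_exif_dms(coord, sec_denominator=100000):
    """Convert a decimal-degree coordinate to EXIF-style
    (degrees, minutes, seconds) rationals."""
    coord = abs(coord)  # hemisphere is carried by GPSLatitudeRef/GPSLongitudeRef
    d = int(coord)
    m = int((coord - d) * 60)
    s = round((coord - d - m / 60) * 3600 * sec_denominator)
    return (Fraction(d, 1), Fraction(m, 1), Fraction(s, sec_denominator))
```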

CRS reprojection

reproject-crs converts photogrammetry outputs (LAS / DEM / orthomosaic) to any target CRS. The defaults perform EPSG:32612 → EPSG:6655 (NAD83(CSRS) / UTM 12N + CGVD2013 height), with the CGG2013 geoid grid auto-downloaded via PROJ. Override anything you need:

# Generic projected reprojection without vertical correction
reproject-crs --input-dir output/ \
  --source-las-epsg 32612 --target-horiz-epsg 32613 --no-geoid

Testing

uv run pytest                   # 92 tests, all self-contained

Optional: Metashape automation

See metashape/README.md for Agisoft Metashape Pro automation scripts (Windows, commercial licence). They are excluded from the Docker image and are not required for the core pipeline.

Field guide

DRONE_CAPTURE_GUIDE.md covers altitude, speed, overlap, camera settings, and lessons learned from real flights.

Contributing

Issues and pull requests welcome at github.com/0mdb/drone-frame-extractor.

License

MIT — see LICENSE.
