Releases: Tok/depth-surge-3d

v0.9.2 - Bug Fixes & UI Improvements

19 Jan 19:34
@Tok Tok

Fixed

  • Critical bug fix: Added null checks for progress_tracker in depth_processor.py
    • Fixes crash when using CLI without web interface (#14)
    • Error: 'NoneType' object has no attribute 'update_progress'
    • Added guards at lines 111 and 399 in src/depth_surge_3d/processing/frames/depth_processor.py
    • All 770 tests passing
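
The guard described above can be sketched as follows. This is a minimal illustration, not the project's code: `DepthStep`, `_report`, and the `update_progress(message)` signature are hypothetical stand-ins; only the attribute name `progress_tracker` and the method name `update_progress` come from the error message.

```python
from typing import Optional, Protocol


class ProgressTracker(Protocol):
    """Hypothetical interface; the real signature of update_progress is assumed."""

    def update_progress(self, message: str) -> None: ...


class DepthStep:
    """Illustrative stand-in for a processing step, not the project's class."""

    def __init__(self, progress_tracker: Optional[ProgressTracker] = None) -> None:
        # None models the CLI case: no web interface, so nothing to report to.
        self.progress_tracker = progress_tracker

    def _report(self, message: str) -> None:
        # Guard: skip reporting instead of raising AttributeError on None.
        if self.progress_tracker is not None:
            self.progress_tracker.update_progress(message)

    def run(self, frame_count: int) -> int:
        for index in range(frame_count):
            self._report(f"frame {index + 1}/{frame_count}")
        return frame_count
```

Routing every progress call through one helper keeps the `None` check in a single place instead of repeating it at each call site.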

Added

  • Python version pinning: Added .python-version file pinned to Python 3.12
    • Improves development environment consistency
    • Recommended by contributor for uv project stability
  • Windows test script: Added test.ps1 for PowerShell users
    • Matches functionality of test.sh for cross-platform consistency
    • Verifies Python dependencies, CUDA, model files, and FFmpeg

Changed

  • UI reorganization: Improved logical grouping of settings in web UI
    • Step 7 (VR Assembly): Now contains VR Format, Headset Preset, and VR Resolution
    • Step 8 (Video Encoding & Output): Focused on encoding, audio, and file management
    • Clearer separation: assembly settings vs. output encoding settings
  • Dependency configuration: Merged PR #13 from @danrossi
    • Added depth-anything-3 package with git source configuration to pyproject.toml
    • Enables proper installation via uv package manager
    • Resolves installation issues for Python 3.9-3.12 users (#11)
  • Script colors: Updated all user-facing scripts to use exact CSS colors
    • Lime green: #39ff14 (RGB 57, 255, 20) - matches --accent-lime
    • Cyan: #00d9ff (RGB 0, 217, 255) - matches info/cyan
    • Consistent branding across CLI scripts and web UI
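
As a quick check of the hex and RGB values listed above, the sketch below converts a CSS hex color to an RGB triple and wraps text in a 24-bit ANSI foreground escape. The helper names `hex_to_rgb` and `colorize` are illustrative and are not taken from the project's scripts.

```python
def hex_to_rgb(hex_color: str) -> tuple:
    """Convert a '#rrggbb' CSS color into an (r, g, b) tuple of ints."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def colorize(text: str, hex_color: str) -> str:
    """Wrap text in a 24-bit ANSI foreground color, then reset attributes."""
    r, g, b = hex_to_rgb(hex_color)
    return f"\033[38;2;{r};{g};{b}m{text}\033[0m"


print(hex_to_rgb("#39ff14"))  # (57, 255, 20)
print(hex_to_rgb("#00d9ff"))  # (0, 217, 255)
```

Truecolor escapes only render correctly on terminals that support 24-bit color.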

Documentation

  • Reorganized project structure
    • Moved TODO.md to docs/ directory for better organization
    • Archived codex-review.md to docs/archive/
    • Updated contributor documentation links
  • Rewrote CONTRIBUTING.md with separate sections for human vs AI contributors
    • Human contributors: Relaxed requirements, focus on ideas over perfection
    • AI contributors: Strict requirements, points to CLAUDE.md
    • Acknowledges AI may refactor human contributions later
  • Updated example_settings.json to v0.9.2 with current settings

Contributors

Special thanks to @danrossi for identifying and helping resolve installation issues.


Full Changelog: v0.9.1...v0.9.2

v0.9.1 - Python 3.13 Compatibility, Documentation & UX

18 Jan 21:04
@Tok Tok

 █▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀█
 █ ░█▀▄░█▀▀░█▀█░▀█▀░█░█░░░█▀▀░█░█░█▀▄░█▀▀░█▀▀░░░▀▀█░█▀▄░ █
 █ ░█░█░█▀▀░█▀▀░░█░░█▀█░░░▀▀█░█░█░█▀▄░█░█░█▀▀░░░░▀▄░█░█░ █
 █ ░▀▀░░▀▀▀░▀░░░░▀░░▀░▀░░░▀▀▀░▀▀▀░▀░▀░▀▀▀░▀▀▀░░░▀▀░░▀▀░░ █
 █▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄█

🐍 Python 3.13 Compatibility

Fixed Python 3.13 compatibility issue caused by `open3d` dependency in Depth-Anything V3:

  • Version guard: Added `requires-python = ">=3.9,<3.13"` in `pyproject.toml`
  • Runtime check: Clear error message with GitHub issue link for users on Python 3.13+
  • Compilation guide: Added comprehensive 161-line guide for advanced users who need Python 3.13
    • Step-by-step instructions for Ubuntu/Debian, macOS, and Windows
    • CMake and C++14 compiler requirements documented
    • Troubleshooting section with common build errors
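
A runtime check of this kind might look like the sketch below. The function name, the wording of the message, and the optional `version_info` parameter (added so the check can be exercised without changing interpreters) are assumptions; only the supported range 3.9 to 3.12 and the reference to issue #11 come from the notes above.

```python
import sys

SUPPORTED_MIN = (3, 9)      # inclusive lower bound from requires-python
UNSUPPORTED_FROM = (3, 13)  # exclusive upper bound from requires-python


def check_python_version(version_info=None) -> None:
    """Raise a clear error when the interpreter is outside the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    if not (SUPPORTED_MIN <= major_minor < UNSUPPORTED_FROM):
        raise RuntimeError(
            f"Python {major_minor[0]}.{major_minor[1]} is not supported; "
            "use Python 3.9-3.12 (see issue #11)."
        )
```

Calling `check_python_version()` with no argument checks the running interpreter; comparing `(major, minor)` tuples avoids string parsing of version numbers.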

Closes: #11


📚 Documentation Improvements

VR Headset Compatibility Guide (NEW)

Comprehensive 538-line guide covering the top 10 VR headsets by market share:

  • Headsets covered: Meta Quest 2/3/3S/Pro, PlayStation VR2, Valve Index, HTC Vive/Pro 2, Pico 4
  • Per-device specs: Resolution per eye, IPD range, FOV, refresh rates
  • Recommended CLI settings: Optimal parameters for each device
  • IPD measurement guide: How to measure and configure interpupillary distance
  • Official sources: Meta, Sony, Valve, HTC, Pico documentation

Performance Benchmarks Guide (NEW)

Comprehensive 597-line performance reference:

  • GPU benchmarks: RTX 4070 Ti SUPER baseline with processing times
  • V2 vs V3 comparison: Detailed speed and VRAM usage tables
  • Resolution scaling: 720p, 1080p, 1440p, 4K processing times
  • VRAM patterns: Memory usage by model size and resolution
  • GPU performance tiers: High-end, mid-range, entry-level recommendations
  • Optimization guide: 8 tips for faster processing with CLI examples
  • Reference table: 25+ GPU models with VRAM and expected performance

🎨 UX Improvements

Colored Completion Banner (NEW)

Beautiful terminal banner when processing finishes:

```
══════════════════════════════════════════════════════════════════════
✓ PROCESSING COMPLETE
──────────────────────────────────────────────────────────────────
Output: /path/to/output/video_3D_side_by_side.mp4
Format: side_by_side
Frames: 300
Time: 2h 15m 30s
══════════════════════════════════════════════════════════════════════
```

  • Lime green borders (#39ff14) matching UI theme
  • Electric blue file path highlighting (#00d9ff)
  • Human-readable time: Formats as "1h 23m 45s" instead of raw seconds
  • Processing metrics: Frame count, VR format, output path
  • Saved to settings.json: Processing time tracked for batch analysis
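
The human-readable time format can be sketched as below. `format_duration` is an illustrative name, and the choice to drop leading zero units (for example "59s" rather than "0h 0m 59s") is an assumption; the notes only show the "1h 23m 45s" form.

```python
def format_duration(total_seconds: float) -> str:
    """Format a duration in seconds as e.g. '1h 23m 45s'."""
    seconds = int(round(total_seconds))
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


print(format_duration(5025))  # 1h 23m 45s
print(format_duration(8130))  # 2h 15m 30s
```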

🧹 Technical Improvements

Error Handling Standardization

  • Moved inline `import traceback` to module-level imports (5 files)
  • Consistent exception handling patterns across all modules
  • Specific exceptions where appropriate, broad catches for resilience

TODO Reorganization

  • Created `docs/archive/completed-tasks.md` for v0.9.0-v0.9.1 history
  • Moved `TODO.md` to root directory
  • Simplified roadmap to show only v0.10.0+ future enhancements
  • Removed legacy model references (MiDaS, ZoeDepth won't be added)

📊 Test Coverage

  • 745 unit tests (+14 new console tests)
  • 91% coverage (up from 90.41%)
  • 100% coverage: console.py, vram_manager.py, depth_cache.py, image_processing.py
  • All pre-commit checks passing (Black, Flake8, Mypy)

🔧 Installation

```bash
# Python 3.9-3.12 required (NOT 3.13+)
pip install depth-surge-3d

# Or from source
git clone https://github.com/Tok/depth-surge-3d.git
cd depth-surge-3d
./setup.sh
```


📖 Full Changelog

Python Compatibility:

  • fix: add Python 3.13 compatibility guard (#11)
  • docs: add advanced Python 3.13 compilation guide

Documentation:

  • docs: add comprehensive VR headset compatibility guide
  • docs: add comprehensive performance benchmarks guide

UX Improvements:

  • feat: add colored completion banner with processing summary
  • feat: implement RGB lerp gradient with diagonal flow and stretched margins
  • feat: add balancing column after 3D in banner for symmetry
  • refactor: apply gradient to entire banner and increase stretch margin
  • feat: use brand lime color (#39ff14) for authentic gradient
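
For readers unfamiliar with the term, an RGB lerp gradient linearly interpolates each color channel between two endpoints. The sketch below shows only that interpolation; the diagonal flow and stretched margins mentioned above are not reproduced, and the cyan end color is an assumption (the notes only confirm the lime endpoint for the gradient).

```python
LIME = (57, 255, 20)   # #39ff14, the brand lime from the notes
CYAN = (0, 217, 255)   # #00d9ff, assumed here as the other endpoint


def lerp_rgb(start, end, t):
    """Linearly interpolate between two RGB triples; t is clamped to [0, 1]."""
    t = max(0.0, min(1.0, t))
    return tuple(round(a + (b - a) * t) for a, b in zip(start, end))


def gradient(start, end, steps):
    """Return `steps` colors running from start to end inclusive."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if steps == 1:
        return [tuple(start)]
    return [lerp_rgb(start, end, i / (steps - 1)) for i in range(steps)]
```

Each banner column (or character) would then be printed with the color at its index in the returned list.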

Technical Improvements:

  • refactor: standardize error handling and archive TODO
  • feat: add colored completion banner and reorganize TODO
  • test: add comprehensive console output tests (100% coverage)

🙏 Credits

Thanks to the community for feedback and issue reports!

🤖 Generated with Claude Sonnet 4.5

v0.9.0 - Major Modularization & Feature Update

18 Jan 18:11
@Tok Tok

Release v0.9.0

This is a major release with comprehensive architecture improvements, dual depth model support, AI upscaling, and extensive quality enhancements.

🌟 Highlights

Architecture Overhaul

  • Refactored VideoProcessor from a 2002 LOC monolith into 7 focused modules (165-596 LOC each)
  • Test coverage increased from 78% to 91% (723 unit + 4 integration tests)
  • All Codex security review findings addressed
  • Module structure organized by domain (inference, processing, utils)

New Features

Dual Depth Model Support

  • Video-Depth-Anything V2: Temporal consistency via 32-frame sliding windows
  • Depth-Anything V3: 50% less VRAM usage, faster processing
  • CLI flag: --depth-model-version {v2,v3}
  • V3 is now the default for better performance
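
To illustrate what processing a video in 32-frame sliding windows involves, the sketch below splits a frame range into windows. Only the window size of 32 comes from the notes; the `overlap` parameter and the function name are hypothetical, and the actual model may schedule its windows differently.

```python
def sliding_windows(frame_count, window=32, overlap=0):
    """Return (start, end) index pairs covering frame_count frames.

    `end` is exclusive. Consecutive windows share `overlap` frames.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    stride = window - overlap
    windows = []
    start = 0
    while start < frame_count:
        end = min(start + window, frame_count)
        windows.append((start, end))
        if end == frame_count:
            break  # the last window reached the final frame
        start += stride
    return windows


print(sliding_windows(70, window=32, overlap=8))
# [(0, 32), (24, 56), (48, 70)]
```

Overlapping windows let neighboring batches see shared frames, which is one common way to keep depth estimates consistent across window boundaries.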

AI Upscaling (Real-ESRGAN)

  • Integrated Real-ESRGAN for enhanced output quality
  • Models: x2, x4, x4-conservative with auto-download
  • Positioned as Step 6 in the pipeline (after cropping, before VR assembly)
  • VRAM overhead: ~2-4GB depending on model variant

Real-Time Preview System

  • Live preview during processing via WebSocket
  • Shows current depth maps, stereo pairs, and VR frames
  • Configurable update frequency (1-5 seconds)
  • Bandwidth-optimized: ~25-100 KB/sec

Performance Optimizations

  • Smart VRAM management with automatic batch sizing
  • Parallel frame processing using multiprocessing
  • Depth map caching system with BLAKE2b hashing
  • Auto-resume detection for interrupted processing
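
A depth map cache keyed by a BLAKE2b hash could be sketched as follows. What exactly the project hashes is not stated in these notes, so the inputs (frame bytes, model version, depth resolution) and the 16-byte digest size are assumptions chosen for illustration.

```python
import hashlib


def cache_key(frame_bytes: bytes, model_version: str, depth_resolution: int) -> str:
    """Derive a cache key from the frame content and the settings that affect depth."""
    hasher = hashlib.blake2b(digest_size=16)
    # Separate fields with a NUL byte so ("v2", 1) and ("v", 21) cannot collide.
    hasher.update(model_version.encode("utf-8"))
    hasher.update(b"\0")
    hasher.update(str(depth_resolution).encode("ascii"))
    hasher.update(b"\0")
    hasher.update(frame_bytes)
    return hasher.hexdigest()


# Usage: look up a previously computed depth map before running inference.
depth_cache = {}
key = cache_key(b"raw frame data", "v3", 518)
if key not in depth_cache:
    depth_cache[key] = "depth map placeholder"  # inference result would go here
```

Including the settings in the key means that changing the model or resolution invalidates old entries automatically.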

Quality & Testing

  • 91% code coverage (exceeds 90% goal)
  • 723 unit tests + 4 integration tests (100% pass rate)
  • All pre-commit checks passing (Black, Flake8)
  • Comprehensive security review completed
  • Automated import path checking

UI/UX Enhancements

  • Drag-and-drop video upload with visual feedback
  • Progress ETA estimates with per-step time tracking
  • VR headset presets for Quest 2/3, Vive, PSVR2
  • Colored console output with lime green arrows (→) for steps
  • Lime theme consistency throughout interface

📊 Performance

| Model | Speed | VRAM Usage | Best For |
|-------|-------|------------|----------|
| V3 | ~2-3 sec/frame | 6-8 GB | General use, efficiency |
| V2 | ~3-4 sec/frame | 12-15 GB | Temporal consistency |

Benchmarked on RTX 4070 Ti SUPER

Example: 1-minute 1080p @ 30fps = ~2-3 hours with V3 on modern GPU

🛠️ Technical Improvements

Code Architecture

  • depth_processor.py (596 LOC) - Depth map generation with caching
  • stereo_generator.py (165 LOC) - Stereo pair creation
  • distortion_processor.py (274 LOC) - Fisheye distortion and cropping
  • frame_upscaler.py (256 LOC) - AI upscaling orchestration
  • vr_assembler.py (192 LOC) - VR frame assembly
  • video_encoder.py (288 LOC) - Video encoding with FFmpeg
  • pipeline_orchestrator.py (439 LOC) - High-level pipeline coordination

Module Reorganization

src/depth_surge_3d/
├── inference/
│   ├── depth/          # V2 and V3 depth models
│   └── upscaling/      # Real-ESRGAN
├── processing/
│   ├── frames/         # Frame-level processing
│   ├── video/          # Video encoding
│   └── orchestration/  # Pipeline control
└── utils/
    ├── domain/         # Business logic
    ├── imaging/        # Image processing
    └── system/         # System utilities

Security Fixes

  • Fixed Real-ESRGAN model checksum validation
  • Added SHA-256 verification for downloaded models
  • Improved input sanitization and validation
  • Enhanced error handling and logging
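
SHA-256 verification of a downloaded model file can be sketched as below. The function names are illustrative and no real checksum is shown; the expected hash would come from the project's own model registry.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model(path, expected_sha256):
    """Raise ValueError when the file's SHA-256 does not match the expected value."""
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"checksum mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
```

Running the check right after download, before the file is ever loaded, catches both truncated downloads and tampered files.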

Developer Tools

  • Pre-commit check scripts for Linux/macOS and Windows
  • Unit test runner scripts with coverage support
  • Automated import path checking to prevent refactoring errors
  • Comprehensive test fixtures for all processor modules

🐛 Bug Fixes

  • Real-ESRGAN checksum errors - Corrected SHA-256 hashes for x2plus and x4-conservative models
  • Console output restoration - Restored colored arrow messages lost during refactor
  • Integration test compatibility - Fixed import paths in CI tests
  • Audio pre-extraction - Added debugging and improved error messages
  • UI styling - Drag-and-drop zone now matches lime theme
  • Modal backdrop cleanup - Fixed UI freezing after processing completion
  • Progress tracking - Fixed ETA calculations for slow steps like upscaling

📦 Installation

Prerequisites

  • Python 3.10+
  • FFmpeg
  • NVIDIA GPU with CUDA support (recommended)

Quick Start

# Clone the repository
git clone https://github.com/Tok/depth-surge-3d.git
cd depth-surge-3d

# Run setup (Linux/macOS)
./setup.sh

# Or Windows
.\setup.ps1

# Start the web UI
./run_ui.sh     # Linux/macOS
.\run_ui.ps1    # Windows

CLI Usage

# Basic conversion with V3 (default)
python depth_surge_3d.py video.mp4

# Use V2 for temporal consistency
python depth_surge_3d.py video.mp4 --depth-model-version v2

# Enable AI upscaling
python depth_surge_3d.py video.mp4 --upscale-model x4

# High quality 4K output
python depth_surge_3d.py video.mp4 --vr-resolution 16x9-4k

📚 Documentation

🙏 Contributors

Special thanks to:

  • @danrossi for Windows setup improvements (PR #10)
  • All testers who provided feedback during development

What's Next?

v0.9.1 (Planned)

  • Performance regression tests
  • Additional VR format optimizations
  • Enhanced upscaling presets

Future Releases

  • Custom depth model selection
  • ML-based hole filling
  • 360° video support
  • Depth map quality metrics

Full Changelog: v0.8.1...v0.9.0

Release v0.8.1 - Quality & Stability

17 Jan 13:51
@Tok Tok
fb43cb0

This release focuses on code quality improvements, bug fixes, and significantly improved test coverage.

🎯 Highlights

  • Coverage improved from 76% → 89% (exceeded 85% target!)
  • Modernized type hints to Python 3.10+ standards (PEP 585/604)
  • Fixed critical UI bug where interface stuck at 100% completion
  • Fixed video file locks preventing playback after processing
  • Fully suppressed library warnings for cleaner console output

✨ Features & Improvements

Code Quality

  • Migrated all type hints to modern Python syntax:
    • Dict[K, V] → dict[K, V]
    • List[X] → list[X]
    • Optional[X] → X | None
    • Callable from typing → collections.abc.Callable
  • Fixed invalid type defaults in public APIs (str = None → str | None = None)
  • Reduced cyclomatic complexity in load_model() method
  • All code now passes flake8 and black formatting

Bug Fixes

  • UI Completion Fix: Added socketio.sleep() so that completion messages are transmitted before the background thread terminates
  • Video File Locks: Added try/finally blocks to guarantee cv2.VideoCapture.release() is always called
  • Warning Suppression: Completely suppressed Depth-Anything V3 gsplat dependency warnings using stdout/stderr redirection

Testing & Coverage

  • Overall: 76% → 89% coverage
  • Added 25 comprehensive tests for video_processor.py (10% → 51% coverage)
  • Achieved 100% coverage on io_operations.py
  • All 541 tests passing

📊 Coverage Breakdown

| Module | Coverage | Change |
|--------|----------|--------|
| Overall | 89% | +13% |
| video_processor.py | 51% | +41% |
| io_operations.py | 100% | +9% |
| image_processing.py | 100% | - |
| video_processing.py | 100% | - |

🔧 Technical Details

Type Hints Modernization

All modules now use modern Python 3.10+ type hints following PEP 585 and PEP 604:

  • from __future__ import annotations added to all modules
  • Built-in generics preferred over typing imports
  • Union syntax using | instead of Optional

Resource Management

Improved file handle management:

  • app.py: Wrapped get_video_info() in try/finally
  • io_operations.py: Added try/finally to get_video_properties()
  • Follows RAII pattern for guaranteed cleanup
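
The try/finally pattern can be sketched without OpenCV installed by using a stand-in object. `FakeCapture` and `read_video_properties` are hypothetical names; the only behavior borrowed from `cv2.VideoCapture` is that it exposes a `release()` method that must be called to free the file handle.

```python
class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without OpenCV."""

    def __init__(self, fps, frame_count, fail=False):
        self._properties = {"fps": fps, "frame_count": frame_count}
        self._fail = fail
        self.released = False

    def get(self, name):
        if self._fail:
            raise RuntimeError("could not read video metadata")
        return self._properties[name]

    def release(self):
        self.released = True


def read_video_properties(capture):
    """Read properties and always release the capture, even if reading fails."""
    try:
        return {
            "fps": capture.get("fps"),
            "frame_count": capture.get("frame_count"),
        }
    finally:
        # Runs on both the success path and the exception path.
        capture.release()
```

Because the release happens in `finally`, an exception while reading metadata still propagates to the caller, but the file lock is gone by the time it does.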

SocketIO Threading

Fixed race condition where completion messages weren't delivered:

  • Added socketio.sleep(SOCKETIO_SLEEP_YIELD) after all completion emits
  • Ensures context switch for message transmission
  • No perceptible delay (SOCKETIO_SLEEP_YIELD = 0)

📝 Commits

All changes included in PR #9:

  • 15 commits total
  • All CI/CD checks passing
  • Flake8, black, and mypy compliant
  • 541/541 tests passing

🙏 Contributors

Co-Authored-By: Claude Sonnet 4.5 noreply@anthropic.com


Full Changelog: v0.8.0...v0.8.1

v0.8.0

16 Jan 23:44
@Tok Tok

🚀 Major Release: Depth Anything V3 Integration

This release brings significant improvements in memory efficiency, performance, and reliability with the integration of Depth Anything V3 and comprehensive FFmpeg fixes.

✨ New Features

Depth Anything V3 Support

  • ~50% lower VRAM usage compared to Video-Depth-Anything V2
  • Optimized for GPUs with limited VRAM (6GB RTX cards and up)
  • Configurable depth resolution (518px to 4K) via Web UI and CLI
  • Available model sizes: Small, Base, Large (+ Large-Metric, Giant variants)
  • Direct numpy array processing (no file I/O overhead)
  • Automatic xformers detection for optimized attention
  • Closes #7

Web UI Enhancements

  • SVG Favicon with gravity-warped grid design
  • Video Encoder Selection dropdown (NVENC, H.264, H.265)
  • Depth Resolution Configuration UI with auto-detection
  • Fixed UI Reset after processing completion - no more stuck state!
  • Better progress tracking with 7-step weighted pipeline

FFmpeg Improvements

  • Fixed CUDA flag parsing (split into separate arguments)
  • Added CPU fallback for CUDA frame extraction on non-NVIDIA systems
  • Fixed NVENC codec ordering (now correctly placed after inputs)
  • Automatic encoder detection with graceful software fallback
  • Hardware-accelerated encoding with proper error handling
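
Two of the fixes above concern how the FFmpeg argument list is built. The sketch below assumes the CUDA flag in question is `-hwaccel cuda` and uses hypothetical builder functions; it shows the flag passed as two separate tokens and the codec option placed after the input.

```python
def build_extract_command(input_path, output_pattern, use_cuda):
    """Build an argument list for frame extraction (illustrative only)."""
    command = ["ffmpeg", "-y"]
    if use_cuda:
        # Two tokens: a single "-hwaccel cuda" string would reach ffmpeg
        # as one unknown option when no shell is involved.
        command += ["-hwaccel", "cuda"]
    command += ["-i", input_path, output_pattern]
    return command


def build_encode_command(input_path, output_path, encoder="libx264"):
    """Place the output codec option after the input it should not apply to."""
    return ["ffmpeg", "-y", "-i", input_path, "-c:v", encoder, output_path]


print(build_extract_command("in.mp4", "frame_%06d.png", use_cuda=True))
print(build_encode_command("in.mp4", "out.mp4", encoder="h264_nvenc"))
```

Passing a list to `subprocess.run` avoids shell quoting entirely, which is why each flag and its value must be their own list elements. Omitting `use_cuda` gives the CPU path, and passing `libx264` gives the software encoder fallback.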

🛠️ Technical Improvements

CI/CD Pipeline

  • GitHub Actions workflow with Tests, Code Quality, and Security checks
  • Codecov integration ready (badge will activate on first coverage upload)
  • Black formatting enforcement (line length: 100)
  • Flake8 linting with McCabe complexity limits (≤10)

Code Quality

  • All functions now comply with McCabe complexity ≤10
  • Complete type hints across the codebase
  • Zero flake8 violations
  • Comprehensive error handling with graceful degradation

Testing

  • 263 new unit tests for Depth Anything V3
  • 173 tests for Video-Depth-Anything V2
  • Integration tests for end-to-end workflow
  • Test fixtures and infrastructure

🐛 Bug Fixes

Critical Fixes (Codex Review)

  • Fixed zero FPS division error in video properties (prevents crashes on invalid metadata)
  • Fixed invalid CUDA flag token in stereo_projector.py (was single string, now separate args)
  • Fixed NVENC-only output without fallback (now detects availability and falls back to software)
  • Fixed CUDA-only frame extraction (now gracefully degrades to CPU on non-NVIDIA systems)
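
The zero FPS fix guards a division by metadata that may be missing or invalid. A minimal sketch, with a hypothetical function name and an assumed fallback value of 0.0:

```python
import math


def video_duration_seconds(frame_count, fps, default=0.0):
    """Return frame_count / fps, or `default` when the metadata is unusable."""
    if fps is None or not math.isfinite(fps) or fps <= 0:
        return default
    if frame_count is None or frame_count <= 0:
        return default
    return frame_count / fps


print(video_duration_seconds(300, 30.0))  # 10.0
print(video_duration_seconds(300, 0))     # 0.0
```

Checking for non-finite values as well as zero covers containers that report NaN or infinity for the frame rate.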

UI/UX Fixes

  • Fixed UI getting stuck after processing completion
  • File input now properly resets after each processing job
  • Video preview and info panels correctly hide on reset
  • No more Ctrl-C needed between processing sessions

📚 Documentation

  • New docs/CONTRIBUTING.md with development guidelines and codecov setup
  • Compacted CLAUDE.md (256 → 199 lines) with clear code quality standards
  • Updated README with badges (CI, codecov, Python version, License)
  • Comprehensive inline documentation and docstrings

🔧 Breaking Changes

None! This release is fully backward compatible with v0.7.x.

📊 Statistics

  • 44 files changed: 3,374 additions, 491 deletions
  • 44 new files including tests, CI config, and documentation
  • All CI checks passing

🙏 Credits

Special thanks to:

  • @odoucet for virtual environment improvements (#3, #4)
  • @danrossi for suggesting Depth Anything V3 integration (#7)
  • Community feedback and testing

🔮 Coming Soon

  • RAFT (Optical Flow) integration for motion-aware depth refinement
  • Improved temporal consistency
  • Major refactoring and architectural improvements

Full Changelog: v0.7.7...v0.8.0