Releases: Tok/depth-surge-3d
v0.9.2 - Bug Fixes & UI Improvements
Fixed
- Critical bug fix: Added null checks for `progress_tracker` in `depth_processor.py`
  - Fixes crash when using CLI without web interface (#14)
  - Error: `'NoneType' object has no attribute 'update_progress'`
  - Added guards at lines 111 and 399 in `src/depth_surge_3d/processing/frames/depth_processor.py`
  - All 770 tests passing
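The kind of guard described in this fix might look roughly like the sketch below. The class name `DepthProcessor`, the `_report` and `process_frames` helpers, and the `update_progress` signature are illustrative assumptions, not the project's actual code.

```python
from __future__ import annotations

from typing import Protocol


class ProgressTracker(Protocol):
    """Hypothetical interface; the real tracker's signature may differ."""

    def update_progress(self, message: str, current: int, total: int) -> None: ...


class DepthProcessor:
    """Illustrative stand-in for the processor in depth_processor.py."""

    def __init__(self, progress_tracker: ProgressTracker | None = None) -> None:
        # A CLI run without the web interface may pass no tracker at all.
        self.progress_tracker = progress_tracker

    def _report(self, message: str, current: int, total: int) -> None:
        # Guard: skip progress reporting when no tracker is attached,
        # instead of calling a method on None.
        if self.progress_tracker is not None:
            self.progress_tracker.update_progress(message, current, total)

    def process_frames(self, frames: list[str]) -> int:
        for index, _frame in enumerate(frames, start=1):
            self._report("Generating depth maps", index, len(frames))
        return len(frames)
```

Routing every progress call through one guarded helper keeps the `None` check in a single place rather than at each call site.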
Added
- Python version pinning: Added `.python-version` file pinned to Python 3.12
  - Improves development environment consistency
  - Recommended by contributor for uv project stability
- Windows test script: Added `test.ps1` for PowerShell users
  - Matches functionality of `test.sh` for cross-platform consistency
  - Verifies Python dependencies, CUDA, model files, and FFmpeg
Changed
- UI reorganization: Improved logical grouping of settings in web UI
- Step 7 (VR Assembly): Now contains VR Format, Headset Preset, and VR Resolution
- Step 8 (Video Encoding & Output): Focused on encoding, audio, and file management
- Clearer separation: assembly settings vs. output encoding settings
- Dependency configuration: Merged PR #13 from @danrossi
  - Added `depth-anything-3` package with git source configuration to `pyproject.toml`
  - Enables proper installation via uv package manager
  - Resolves installation issues for Python 3.9-3.12 users (#11)
- Script colors: Updated all user-facing scripts to use exact CSS colors
  - Lime green: `#39ff14` (RGB 57, 255, 20) - matches `--accent-lime`
  - Cyan: `#00d9ff` (RGB 0, 217, 255) - matches info/cyan
  - Consistent branding across CLI scripts and web UI
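As a rough illustration of how a script could reuse the exact CSS colors in a terminal, the sketch below converts the hex values to RGB and wraps text in a 24-bit ANSI color sequence. The helper names `hex_to_rgb` and `colorize` are hypothetical; the project's scripts may apply the colors differently.

```python
from __future__ import annotations

ACCENT_LIME = "#39ff14"  # matches --accent-lime in the web UI
INFO_CYAN = "#00d9ff"    # matches the info/cyan color


def hex_to_rgb(hex_color: str) -> tuple[int, int, int]:
    """Convert '#rrggbb' into an (r, g, b) tuple of 0-255 integers."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected #rrggbb, got {hex_color!r}")
    red, green, blue = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (red, green, blue)


def colorize(text: str, hex_color: str) -> str:
    """Wrap text in a 24-bit ANSI foreground color sequence, then reset."""
    red, green, blue = hex_to_rgb(hex_color)
    return f"\033[38;2;{red};{green};{blue}m{text}\033[0m"
```

For example, `print(colorize("→ Step 1", ACCENT_LIME))` prints the text in lime on terminals that support 24-bit color.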
Documentation
- Reorganized project structure
  - Moved `TODO.md` to `docs/` directory for better organization
  - Archived `codex-review.md` to `docs/archive/`
  - Updated contributor documentation links
- Rewrote CONTRIBUTING.md with separate sections for human vs AI contributors
- Human contributors: Relaxed requirements, focus on ideas over perfection
- AI contributors: Strict requirements, points to CLAUDE.md
- Acknowledges AI may refactor human contributions later
- Updated example_settings.json to v0.9.2 with current settings
Contributors
Special thanks to @danrossi for identifying and helping resolve installation issues.
Full Changelog: v0.9.1...v0.9.2
v0.9.1 - Python 3.13 Compatibility, Documentation & UX
█▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀▀█
█ ░█▀▄░█▀▀░█▀█░▀█▀░█░█░░░█▀▀░█░█░█▀▄░█▀▀░█▀▀░░░▀▀█░█▀▄░ █
█ ░█░█░█▀▀░█▀▀░░█░░█▀█░░░▀▀█░█░█░█▀▄░█░█░█▀▀░░░░▀▄░█░█░ █
█ ░▀▀░░▀▀▀░▀░░░░▀░░▀░▀░░░▀▀▀░▀▀▀░▀░▀░▀▀▀░▀▀▀░░░▀▀░░▀▀░░ █
█▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄▄█
🐍 Python 3.13 Compatibility
Fixed Python 3.13 compatibility issue caused by `open3d` dependency in Depth-Anything V3:
- Version guard: Added `requires-python = ">=3.9,<3.13"` in `pyproject.toml`
- Runtime check: Clear error message with GitHub issue link for users on Python 3.13+
- Compilation guide: Added comprehensive 161-line guide for advanced users who need Python 3.13
- Step-by-step instructions for Ubuntu/Debian, macOS, and Windows
- CMake and C++14 compiler requirements documented
- Troubleshooting section with common build errors
Closes: #11
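A runtime check of this kind could be sketched as follows. The function name, message wording, and constants are assumptions for illustration; only the supported range (3.9 up to but excluding 3.13) and the issue reference come from the notes above.

```python
from __future__ import annotations

import sys

MIN_VERSION = (3, 9)
MAX_EXCLUSIVE = (3, 13)
ISSUES_URL = "https://github.com/Tok/depth-surge-3d/issues"


def python_version_error(version: tuple[int, int] = sys.version_info[:2]) -> str | None:
    """Return an error message for unsupported interpreters, else None."""
    if MIN_VERSION <= version < MAX_EXCLUSIVE:
        return None
    found = f"{version[0]}.{version[1]}"
    return (
        f"Python {found} is not supported; Python 3.9-3.12 is required. "
        f"See issue #11 at {ISSUES_URL} for details."
    )


if __name__ == "__main__":
    message = python_version_error()
    if message is not None:
        sys.exit(message)
```

Returning the message instead of exiting inside the function keeps the check easy to unit test.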
📚 Documentation Improvements
VR Headset Compatibility Guide (NEW)
Comprehensive 538-line guide covering the top 10 VR headsets by market share:
- Headsets covered: Meta Quest 2/3/3S/Pro, PlayStation VR2, Valve Index, HTC Vive/Pro 2, Pico 4
- Per-device specs: Resolution per eye, IPD range, FOV, refresh rates
- Recommended CLI settings: Optimal parameters for each device
- IPD measurement guide: How to measure and configure interpupillary distance
- Official sources: Meta, Sony, Valve, HTC, Pico documentation
Performance Benchmarks Guide (NEW)
Comprehensive 597-line performance reference:
- GPU benchmarks: RTX 4070 Ti SUPER baseline with processing times
- V2 vs V3 comparison: Detailed speed and VRAM usage tables
- Resolution scaling: 720p, 1080p, 1440p, 4K processing times
- VRAM patterns: Memory usage by model size and resolution
- GPU performance tiers: High-end, mid-range, entry-level recommendations
- Optimization guide: 8 tips for faster processing with CLI examples
- Reference table: 25+ GPU models with VRAM and expected performance
🎨 UX Improvements
Colored Completion Banner (NEW)
Beautiful terminal banner when processing finishes:
```
══════════════════════════════════════════════════════════════════════
✓ PROCESSING COMPLETE
──────────────────────────────────────────────────────────────────
Output: /path/to/output/video_3D_side_by_side.mp4
Format: side_by_side
Frames: 300
Time: 2h 15m 30s
══════════════════════════════════════════════════════════════════════
```
- Lime green borders (#39ff14) matching UI theme
- Electric blue file path highlighting (#00d9ff)
- Human-readable time: Formats as "1h 23m 45s" instead of raw seconds
- Processing metrics: Frame count, VR format, output path
- Saved to settings.json: Processing time tracked for batch analysis
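The human-readable time format can be produced with two `divmod` calls. This is a minimal sketch; the name `format_duration` and the behavior for durations under one hour are assumptions.

```python
def format_duration(seconds: float) -> str:
    """Format a duration in seconds as e.g. '1h 23m 45s'."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}h {minutes}m {secs}s"
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"
```

With this sketch, 5025 seconds formats as `1h 23m 45s` and 8130 seconds as `2h 15m 30s`, the value shown in the banner above.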
🧹 Technical Improvements
Error Handling Standardization
- Moved inline `import traceback` to module-level imports (5 files)
- Consistent exception handling patterns across all modules
- Specific exceptions where appropriate, broad catches for resilience
TODO Reorganization
- Created `docs/archive/completed-tasks.md` for v0.9.0-v0.9.1 history
- Moved `TODO.md` to root directory
- Simplified roadmap to show only v0.10.0+ future enhancements
- Removed legacy model references (MiDaS, ZoeDepth won't be added)
📊 Test Coverage
- 745 unit tests (+14 new console tests)
- 91% coverage (up from 90.41%)
- 100% coverage: console.py, vram_manager.py, depth_cache.py, image_processing.py
- All pre-commit checks passing (Black, Flake8, Mypy)
🔧 Installation
```bash
# Python 3.9-3.12 required (NOT 3.13+)
pip install depth-surge-3d
# Or from source
git clone https://github.com/Tok/depth-surge-3d.git
cd depth-surge-3d
./setup.sh
```
📖 Full Changelog
Python Compatibility:
- fix: add Python 3.13 compatibility guard (#11)
- docs: add advanced Python 3.13 compilation guide
Documentation:
- docs: add comprehensive VR headset compatibility guide
- docs: add comprehensive performance benchmarks guide
UX Improvements:
- feat: add colored completion banner with processing summary
- feat: implement RGB lerp gradient with diagonal flow and stretched margins
- feat: add balancing column after 3D in banner for symmetry
- refactor: apply gradient to entire banner and increase stretch margin
- feat: use brand lime color (#39ff14) for authentic gradient
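An RGB lerp gradient of the kind named in these commits interpolates each color channel linearly between two endpoints. The sketch below shows only the per-step color interpolation; the diagonal flow and stretched margins are not modeled, and using cyan as the second endpoint is an assumption.

```python
from __future__ import annotations

RGB = tuple[int, int, int]


def lerp_rgb(start: RGB, end: RGB, t: float) -> RGB:
    """Linearly interpolate between two RGB colors, with t clamped to [0, 1]."""
    t = min(max(t, 0.0), 1.0)
    red, green, blue = (round(a + (b - a) * t) for a, b in zip(start, end))
    return (red, green, blue)


def gradient(start: RGB, end: RGB, steps: int) -> list[RGB]:
    """Return `steps` colors running evenly from start to end."""
    if steps < 2:
        return [start]
    return [lerp_rgb(start, end, i / (steps - 1)) for i in range(steps)]


LIME: RGB = (57, 255, 20)   # brand lime, #39ff14
CYAN: RGB = (0, 217, 255)   # assumed second endpoint, #00d9ff
```

Calling `gradient(LIME, CYAN, width)` yields one color per banner column.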
Technical Improvements:
- refactor: standardize error handling and archive TODO
- feat: add colored completion banner and reorganize TODO
- test: add comprehensive console output tests (100% coverage)
🙏 Credits
Thanks to the community for feedback and issue reports!
🤖 Generated with Claude Sonnet 4.5
v0.9.0 - Major Modularization & Feature Update
Release v0.9.0
This is a major release with comprehensive architecture improvements, dual depth model support, AI upscaling, and extensive quality enhancements.
🌟 Highlights
Architecture Overhaul
- Refactored VideoProcessor from 2002 LOC monolith to 7 focused modules (~500 LOC each)
- Test coverage increased from 78% to 91% (723 unit + 4 integration tests)
- All Codex security review findings addressed
- Module structure organized by domain (inference, processing, utils)
New Features
Dual Depth Model Support
- Video-Depth-Anything V2: Temporal consistency via 32-frame sliding windows
- Depth-Anything V3: 50% less VRAM usage, faster processing
- CLI flag: `--depth-model-version {v2,v3}`
- V3 is now the default for better performance
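A flag with this shape maps naturally onto `argparse` choices. The sketch below is not the project's actual parser; only the flag name, its two values, and the V3 default come from the notes above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a minimal parser exposing the depth model selection flag."""
    parser = argparse.ArgumentParser(prog="depth_surge_3d.py")
    parser.add_argument("video", help="Path to the input video")
    parser.add_argument(
        "--depth-model-version",
        choices=["v2", "v3"],
        default="v3",  # V3 is the default as of v0.9.0
        help="Depth model: v2 for temporal consistency, v3 for lower VRAM use",
    )
    return parser
```

`build_parser().parse_args(["video.mp4"])` selects V3, while passing `--depth-model-version v2` switches to V2; any other value is rejected by `argparse`.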
AI Upscaling (Real-ESRGAN)
- Integrated Real-ESRGAN for enhanced output quality
- Models: x2, x4, x4-conservative with auto-download
- Positioned as Step 6 in the pipeline (after cropping, before VR assembly)
- VRAM overhead: ~2-4GB depending on model variant
Real-Time Preview System
- Live preview during processing via WebSocket
- Shows current depth maps, stereo pairs, and VR frames
- Configurable update frequency (1-5 seconds)
- Bandwidth-optimized: ~25-100 KB/sec
Performance Optimizations
- Smart VRAM management with automatic batch sizing
- Parallel frame processing using multiprocessing
- Depth map caching system with BLAKE2b hashing
- Auto-resume detection for interrupted processing
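A cache key built with BLAKE2b might be derived as in the sketch below. Which inputs the project actually hashes is not stated here, so the frame bytes, model version, and depth resolution used in this example are assumptions.

```python
from __future__ import annotations

import hashlib


def depth_cache_key(frame_bytes: bytes, model_version: str, depth_resolution: int) -> str:
    """Derive a short, deterministic cache key for one frame's depth map."""
    hasher = hashlib.blake2b(digest_size=16)  # 16 bytes -> 32 hex characters
    hasher.update(frame_bytes)
    # Mix in the settings so a change of model or resolution misses the cache.
    hasher.update(f"|{model_version}|{depth_resolution}".encode("utf-8"))
    return hasher.hexdigest()
```

Including the settings in the key means a cached depth map is reused only when both the frame content and the parameters that produced it are unchanged.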
Quality & Testing
- 91% code coverage (exceeds 90% goal)
- 723 unit tests + 4 integration tests (100% pass rate)
- All pre-commit checks passing (Black, Flake8)
- Comprehensive security review completed
- Automated import path checking
UI/UX Enhancements
- Drag-and-drop video upload with visual feedback
- Progress ETA estimates with per-step time tracking
- VR headset presets for Quest 2/3, Vive, PSVR2
- Colored console output with lime green arrows (→) for steps
- Lime theme consistency throughout interface
📊 Performance
| Model | Speed | VRAM Usage | Best For |
|---|---|---|---|
| V3 | ~2-3 sec/frame | 6-8 GB | General use, efficiency |
| V2 | ~3-4 sec/frame | 12-15 GB | Temporal consistency |
Benchmarked on RTX 4070 Ti SUPER
Example: 1-minute 1080p @ 30fps = ~2-3 hours with V3 on modern GPU
🛠️ Technical Improvements
Code Architecture
- depth_processor.py (596 LOC) - Depth map generation with caching
- stereo_generator.py (165 LOC) - Stereo pair creation
- distortion_processor.py (274 LOC) - Fisheye distortion and cropping
- frame_upscaler.py (256 LOC) - AI upscaling orchestration
- vr_assembler.py (192 LOC) - VR frame assembly
- video_encoder.py (288 LOC) - Video encoding with FFmpeg
- pipeline_orchestrator.py (439 LOC) - High-level pipeline coordination
Module Reorganization
src/depth_surge_3d/
├── inference/
│ ├── depth/ # V2 and V3 depth models
│ └── upscaling/ # Real-ESRGAN
├── processing/
│ ├── frames/ # Frame-level processing
│ ├── video/ # Video encoding
│ └── orchestration/ # Pipeline control
└── utils/
  ├── domain/ # Business logic
  ├── imaging/ # Image processing
  └── system/ # System utilities
Security Fixes
- Fixed Real-ESRGAN model checksum validation
- Added SHA-256 verification for downloaded models
- Improved input sanitization and validation
- Enhanced error handling and logging
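SHA-256 verification of a downloaded model file can be sketched as below. The function names are hypothetical and the expected digest would come from wherever the project stores its known-good hashes.

```python
from __future__ import annotations

import hashlib
from collections.abc import Iterable, Iterator
from pathlib import Path


def sha256_hex(chunks: Iterable[bytes]) -> str:
    """Hash a stream of byte chunks without loading it all into memory."""
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def read_chunks(path: Path, chunk_size: int = 1 << 20) -> Iterator[bytes]:
    """Yield a file's contents in 1 MiB chunks by default."""
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            yield chunk


def verify_model_file(path: Path, expected_sha256: str) -> bool:
    """Return True only if the file's SHA-256 matches the expected digest."""
    return sha256_hex(read_chunks(path)) == expected_sha256.strip().lower()
```

Hashing in chunks matters for model weights, which can be far larger than is comfortable to read in one call.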
Developer Tools
- Pre-commit check scripts for Linux/macOS and Windows
- Unit test runner scripts with coverage support
- Automated import path checking to prevent refactoring errors
- Comprehensive test fixtures for all processor modules
🐛 Bug Fixes
- Real-ESRGAN checksum errors - Corrected SHA-256 hashes for x2plus and x4-conservative models
- Console output restoration - Restored colored arrow messages lost during refactor
- Integration test compatibility - Fixed import paths in CI tests
- Audio pre-extraction - Added debugging and improved error messages
- UI styling - Drag-and-drop zone now matches lime theme
- Modal backdrop cleanup - Fixed UI freezing after processing completion
- Progress tracking - Fixed ETA calculations for slow steps like upscaling
📦 Installation
Prerequisites
- Python 3.10+
- FFmpeg
- NVIDIA GPU with CUDA support (recommended)
Quick Start
# Clone the repository
git clone https://github.com/Tok/depth-surge-3d.git
cd depth-surge-3d
# Run setup (Linux/macOS)
./setup.sh
# Or Windows
.\setup.ps1
# Start the web UI
./run_ui.sh # Linux/macOS
.\run_ui.ps1 # Windows
CLI Usage
# Basic conversion with V3 (default)
python depth_surge_3d.py video.mp4
# Use V2 for temporal consistency
python depth_surge_3d.py video.mp4 --depth-model-version v2
# Enable AI upscaling
python depth_surge_3d.py video.mp4 --upscale-model x4
# High quality 4K output
python depth_surge_3d.py video.mp4 --vr-resolution 16x9-4k
📚 Documentation
- CHANGELOG.md - Complete list of changes
- README.md - Project overview and quick start
- docs/INSTALLATION.md - Detailed installation guide
- docs/USAGE.md - Usage instructions
- docs/ARCHITECTURE.md - Architecture overview
🙏 Contributors
Special thanks to:
- @danrossi for Windows setup improvements (PR #10)
- All testers who provided feedback during development
🔗 Links
- Repository: https://github.com/Tok/depth-surge-3d
- Issues: https://github.com/Tok/depth-surge-3d/issues
- Discussions: https://github.com/Tok/depth-surge-3d/discussions
What's Next?
v0.9.1 (Planned)
- Performance regression tests
- Additional VR format optimizations
- Enhanced upscaling presets
Future Releases
- Custom depth model selection
- ML-based hole filling
- 360° video support
- Depth map quality metrics
Full Changelog: v0.8.1...v0.9.0
Release v0.8.1 - Quality & Stability
This release focuses on code quality improvements, bug fixes, and significantly improved test coverage.
🎯 Highlights
- Coverage improved from 76% → 89% (exceeded 85% target!)
- Modernized type hints to Python 3.10+ standards (PEP 585/604)
- Fixed critical UI bug where interface stuck at 100% completion
- Fixed video file locks preventing playback after processing
- Fully suppressed library warnings for cleaner console output
✨ Features & Improvements
Code Quality
- Migrated all type hints to modern Python syntax:
  - `Dict[K, V]` → `dict[K, V]`
  - `List[X]` → `list[X]`
  - `Optional[X]` → `X | None`
  - `Callable` from `typing` → `collections.abc.Callable`
- Fixed invalid type defaults in public APIs (`str = None` → `str | None = None`)
- Reduced cyclomatic complexity in `load_model()` method
- All code now passes flake8 and black formatting
Bug Fixes
- UI Completion Fix: Added `socketio.sleep()` to ensure completion messages are transmitted before background thread terminates
- Video File Locks: Added try/finally blocks to guarantee `cv2.VideoCapture.release()` is always called
- Warning Suppression: Completely suppressed Depth-Anything V3 gsplat dependency warnings using stdout/stderr redirection
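Stdout/stderr redirection of the kind mentioned for warning suppression can be done with the standard library. The sketch below is an assumption about the general approach, not the project's code; note that it only silences writes made through Python's `sys.stdout` and `sys.stderr`, not output written directly by native code.

```python
import contextlib
import io
import os
from collections.abc import Iterator


@contextlib.contextmanager
def suppress_output() -> Iterator[None]:
    """Temporarily send Python-level stdout and stderr to the null device."""
    with open(os.devnull, "w") as devnull:
        with contextlib.redirect_stdout(devnull), contextlib.redirect_stderr(devnull):
            yield


# Demonstration: only the text printed outside the suppressed block is kept.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    with suppress_output():
        print("noisy warning emitted during model import")
    print("kept")
```

In practice the suppressed block would wrap the import or model load that emits the unwanted warnings.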
Testing & Coverage
- Overall: 76% → 89% coverage
- Added 25 comprehensive tests for `video_processor.py` (10% → 51% coverage)
- Achieved 100% coverage on `io_operations.py`
- All 541 tests passing
📊 Coverage Breakdown
| Module | Coverage | Change |
|---|---|---|
| Overall | 89% | +13% |
| video_processor.py | 51% | +41% |
| io_operations.py | 100% | +9% |
| image_processing.py | 100% | - |
| video_processing.py | 100% | - |
🔧 Technical Details
Type Hints Modernization
All modules now use modern Python 3.10+ type hints following PEP 585 and PEP 604:
- `from __future__ import annotations` added to all modules
- Built-in generics preferred over `typing` imports
- Union syntax using `|` instead of `Optional`
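A small function written in the modernized style illustrates the three points together. The function itself is a made-up example, not code from the project.

```python
from __future__ import annotations

from collections.abc import Callable


def summarize(
    scores: dict[str, list[float]],                        # was Dict[str, List[float]]
    label: str | None = None,                              # was Optional[str] (or the invalid str = None)
    on_result: Callable[[str, float], None] | None = None,
) -> dict[str, float]:
    """Average each non-empty score list, optionally prefixing keys with a label."""
    averages: dict[str, float] = {}
    for name, values in scores.items():
        if not values:
            continue
        key = f"{label}:{name}" if label else name
        averages[key] = sum(values) / len(values)
        if on_result is not None:
            on_result(key, averages[key])
    return averages
```

Because `from __future__ import annotations` stores annotations as strings, the `X | None` syntax in signatures does not need to be evaluated at import time.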
Resource Management
Improved file handle management:
- `app.py`: Wrapped `get_video_info()` in try/finally
- `io_operations.py`: Added try/finally to `get_video_properties()`
- Follows RAII pattern for guaranteed cleanup
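The try/finally pattern can be sketched as below. The body is an assumption about what such a helper might do, written against any object with `get()` and `release()` methods so it runs without OpenCV installed; the two property ids match `cv2.CAP_PROP_FPS` and `cv2.CAP_PROP_FRAME_COUNT`.

```python
from __future__ import annotations

from typing import Any

CAP_PROP_FPS = 5          # same value as cv2.CAP_PROP_FPS
CAP_PROP_FRAME_COUNT = 7  # same value as cv2.CAP_PROP_FRAME_COUNT


def get_video_properties(capture: Any) -> dict[str, float]:
    """Read basic properties and always release the capture handle."""
    try:
        fps = float(capture.get(CAP_PROP_FPS))
        frame_count = float(capture.get(CAP_PROP_FRAME_COUNT))
        # Guard against invalid metadata reporting zero FPS.
        duration = frame_count / fps if fps > 0 else 0.0
        return {"fps": fps, "frame_count": frame_count, "duration_seconds": duration}
    finally:
        # Runs on success and on error, so the file lock is never left behind.
        capture.release()
```

With OpenCV available, the call would be `get_video_properties(cv2.VideoCapture("video.mp4"))`.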
SocketIO Threading
Fixed race condition where completion messages weren't delivered:
- Added `socketio.sleep(SOCKETIO_SLEEP_YIELD)` after all completion emits
- Ensures context switch for message transmission
- No perceptible delay (SOCKETIO_SLEEP_YIELD = 0)
📝 Commits
All changes included in PR #9:
- 15 commits total
- All CI/CD checks passing
- Flake8, black, and mypy compliant
- 541/541 tests passing
🙏 Contributors
Co-Authored-By: Claude Sonnet 4.5 noreply@anthropic.com
Full Changelog: v0.8.0...v0.8.1
v0.8.0
🚀 Major Release: Depth Anything V3 Integration
This release brings significant improvements in memory efficiency, performance, and reliability with the integration of Depth Anything V3 and comprehensive FFmpeg fixes.
✨ New Features
Depth Anything V3 Support
- ~50% lower VRAM usage compared to Video-Depth-Anything V2
- Optimized for GPUs with limited VRAM (6GB RTX cards and up)
- Configurable depth resolution (518px to 4K) via Web UI and CLI
- Available model sizes: Small, Base, Large (+ Large-Metric, Giant variants)
- Direct numpy array processing (no file I/O overhead)
- Automatic xformers detection for optimized attention
- Closes #7
Web UI Enhancements
- SVG Favicon with gravity-warped grid design
- Video Encoder Selection dropdown (NVENC, H.264, H.265)
- Depth Resolution Configuration UI with auto-detection
- Fixed UI Reset after processing completion - no more stuck state!
- Better progress tracking with 7-step weighted pipeline
FFmpeg Improvements
- Fixed CUDA flag parsing (split into separate arguments)
- Added CPU fallback for CUDA frame extraction on non-NVIDIA systems
- Fixed NVENC codec ordering (now correctly placed after inputs)
- Automatic encoder detection with graceful software fallback
- Hardware-accelerated encoding with proper error handling
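The flag-splitting, codec-ordering, and fallback fixes can be illustrated with a command builder like the one below. This is a sketch under assumptions: the helper names are hypothetical, the set of available encoders is taken as an input (for example parsed from `ffmpeg -encoders`), and only H.264 is handled.

```python
from __future__ import annotations


def pick_encoder(available: set[str], prefer_hardware: bool = True) -> str:
    """Choose NVENC when FFmpeg reports it, otherwise fall back to software."""
    if prefer_hardware and "h264_nvenc" in available:
        return "h264_nvenc"
    return "libx264"


def build_encode_command(
    input_pattern: str,
    output_path: str,
    fps: int,
    encoder: str,
    use_cuda_decode: bool = False,
) -> list[str]:
    """Build an FFmpeg argument list with one token per list element."""
    command = ["ffmpeg", "-y"]
    if use_cuda_decode:
        # Two separate arguments, not the single string "-hwaccel cuda".
        command += ["-hwaccel", "cuda"]
    command += ["-framerate", str(fps), "-i", input_pattern]
    # Output codec options go after the inputs.
    command += ["-c:v", encoder, output_path]
    return command
```

Passing the list to `subprocess.run` without a shell avoids the quoting problem that arises when a flag and its value are fused into one token.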
🛠️ Technical Improvements
CI/CD Pipeline
- GitHub Actions workflow with Tests, Code Quality, and Security checks
- Codecov integration ready (badge will activate on first coverage upload)
- Black formatting enforcement (line length: 100)
- Flake8 linting with McCabe complexity limits (≤10)
Code Quality
- All functions now comply with McCabe complexity ≤10
- Complete type hints across the codebase
- Zero flake8 violations
- Comprehensive error handling with graceful degradation
Testing
- 263 new unit tests for Depth Anything V3
- 173 tests for Video-Depth-Anything V2
- Integration tests for end-to-end workflow
- Test fixtures and infrastructure
🐛 Bug Fixes
Critical Fixes (Codex Review)
- Fixed zero FPS division error in video properties (prevents crashes on invalid metadata)
- Fixed invalid CUDA flag token in stereo_projector.py (was single string, now separate args)
- Fixed NVENC-only output without fallback (now detects availability and falls back to software)
- Fixed CUDA-only frame extraction (now gracefully degrades to CPU on non-NVIDIA systems)
UI/UX Fixes
- Fixed UI getting stuck after processing completion
- File input now properly resets after each processing job
- Video preview and info panels correctly hide on reset
- No more Ctrl-C needed between processing sessions
📚 Documentation
- New docs/CONTRIBUTING.md with development guidelines and codecov setup
- Compacted CLAUDE.md (256 → 199 lines) with clear code quality standards
- Updated README with badges (CI, codecov, Python version, License)
- Comprehensive inline documentation and docstrings
🔧 Breaking Changes
None! This release is fully backward compatible with v0.7.x.
📊 Statistics
- 44 files changed: 3,374 additions, 491 deletions
- 44 new files including tests, CI config, and documentation
- All CI checks passing ✅
🙏 Credits
Special thanks to:
- @odoucet for virtual environment improvements (#3, #4)
- @danrossi for suggesting Depth Anything V3 integration (#7)
- Community feedback and testing
🔮 Coming Soon
- RAFT (Optical Flow) integration for motion-aware depth refinement
- Improved temporal consistency
- Major refactoring and architectural improvements
Full Changelog: v0.7.7...v0.8.0