# VR180 Streaming Matting True streaming implementation for VR180 human matting with constant memory usage. ## Key Features - **True Streaming**: Process frames one at a time without accumulation - **Constant Memory**: No memory buildup regardless of video length - **Stereo Consistency**: Master-slave processing ensures matched detection - **2-3x Faster**: Eliminates chunking overhead from original implementation - **Direct FFmpeg Pipe**: Zero-copy frame writing ## Architecture ``` Input Video → Frame Reader → SAM2 Streaming → Frame Writer → Output Video ↓ ↓ ↓ ↓ (no chunks) (one frame) (propagate) (immediate write) ``` ### Components 1. **StreamingFrameReader** (`frame_reader.py`) - Reads frames one at a time - Supports seeking for resume/recovery - Constant memory footprint 2. **StreamingFrameWriter** (`frame_writer.py`) - Direct pipe to ffmpeg encoder - GPU-accelerated encoding (H.264/H.265) - Preserves audio from source 3. **StereoConsistencyManager** (`stereo_manager.py`) - Master-slave eye processing - Disparity-aware detection transfer - Automatic consistency validation 4. **SAM2StreamingProcessor** (`sam2_streaming.py`) - Integrates with SAM2's native video predictor - Memory-efficient state management - Continuous correction support 5. **VR180StreamingProcessor** (`streaming_processor.py`) - Main orchestrator - Adaptive GPU scaling - Checkpoint/resume support ## Usage ### Quick Start ```bash # Generate example config python -m vr180_streaming --generate-config my_config.yaml # Edit config with your paths vim my_config.yaml # Run processing python -m vr180_streaming my_config.yaml ``` ### Command Line Options ```bash # Override output path python -m vr180_streaming config.yaml --output /path/to/output.mp4 # Process specific frame range python -m vr180_streaming config.yaml --start-frame 1000 --max-frames 5000 # Override scale factor python -m vr180_streaming config.yaml --scale 0.25 # Dry run to validate config python -m vr180_streaming config.yaml --dry-run ``` ## Configuration Key configuration options: ```yaml streaming: mode: true # Enable streaming mode buffer_frames: 10 # Lookahead buffer processing: scale_factor: 0.5 # Resolution scaling adaptive_scaling: true # Dynamic GPU optimization stereo: mode: "master_slave" # Stereo consistency mode master_eye: "left" # Which eye leads detection recovery: enable_checkpoints: true # Save progress auto_resume: true # Resume from checkpoint ``` ## Performance Compared to chunked implementation: | Metric | Chunked | Streaming | Improvement | |--------|---------|-----------|-------------| | Speed | ~0.54s/frame | ~0.18s/frame | 3x faster | | Memory | 100GB+ peak | <50GB constant | 2x lower | | VRAM | 2.5% usage | 70%+ usage | 28x better | | Consistency | Variable | Guaranteed | ✓ | ## Requirements - Python 3.10+ - PyTorch 2.0+ - CUDA GPU (8GB+ VRAM recommended) - FFmpeg with GPU encoding support - SAM2 (segment-anything-2) ## Troubleshooting ### Out of Memory - Reduce `scale_factor` in config - Enable `adaptive_scaling` - Ensure `memory_offload: true` ### Stereo Mismatch - Adjust `consistency_threshold` - Enable `disparity_correction` - Check `baseline` and `focal_length` settings ### Slow Processing - Use GPU video codec (`h264_nvenc`) - Reduce `correction_interval` - Lower output quality (`crf: 23`) ## Advanced Features ### Adaptive Scaling Automatically adjusts processing resolution based on GPU load: ```yaml processing: adaptive_scaling: true target_gpu_usage: 0.7 min_scale: 0.25 max_scale: 1.0 ``` ### Continuous Correction Periodically refines tracking for long videos: ```yaml matting: continuous_correction: true correction_interval: 300 # Every 5 seconds at 60fps ``` ### Checkpoint Recovery Automatically resume from interruptions: ```yaml recovery: enable_checkpoints: true checkpoint_interval: 1000 auto_resume: true ``` ## Contributing Please ensure your code follows the streaming architecture principles: - No frame accumulation in memory - Immediate processing and writing - Proper resource cleanup - Checkpoint support for long videos