# VR180 Human Matting with Det-SAM2

A proof-of-concept implementation for automated human matting on VR180 3D side-by-side equirectangular video using Det-SAM2 and YOLOv8 detection.

## Features

- **Automatic Person Detection**: Uses YOLOv8 to eliminate manual point selection
- **VRAM Optimization**: Memory management for RTX 3080 (10GB) compatibility
- **VR180-Specific Processing**: Side-by-side stereo handling with disparity mapping
- **Flexible Scaling**: 25%, 50%, or 100% processing resolution with AI upscaling
- **Multiple Output Formats**: Alpha channel or green screen background
- **Chunked Processing**: Handles long videos with memory-efficient chunking
- **Cloud GPU Ready**: Docker containerization for RunPod and Vast.ai deployment

## Installation

```bash
# Clone repository
git clone <repository-url>
cd sam2e

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .
```

## Quick Start

1. **Generate an example configuration:**

   ```bash
   vr180-matting --generate-config config.yaml
   ```

2. **Edit the configuration file:**

   ```yaml
   input:
     video_path: "path/to/your/vr180_video.mp4"

   processing:
     scale_factor: 0.5  # Start with 50% for testing

   output:
     path: "output/matted_video.mp4"
     format: "alpha"  # or "greenscreen"
   ```

3. **Process the video:**

   ```bash
   vr180-matting config.yaml
   ```

## Configuration

### Input Settings

- `video_path`: Path to the VR180 side-by-side video file

### Processing Settings

- `scale_factor`: Resolution scaling (0.25, 0.5, 1.0)
- `chunk_size`: Frames per chunk (0 for auto-calculation)
- `overlap_frames`: Frame overlap between chunks (see the sketch below)
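
To make the interaction between `chunk_size` and `overlap_frames` concrete, here is a minimal sketch of how overlapping chunk boundaries can be computed (illustrative only, not the tool's actual implementation):

```python
def chunk_ranges(total_frames: int, chunk_size: int, overlap_frames: int):
    """Yield (start, end) frame ranges; consecutive chunks share `overlap_frames` frames."""
    step = chunk_size - overlap_frames
    assert step > 0, "chunk_size must exceed overlap_frames"
    start = 0
    while start < total_frames:
        end = min(start + chunk_size, total_frames)
        yield start, end
        if end == total_frames:
            break
        start += step

# e.g. 1000 frames, 300-frame chunks, 30-frame overlap:
# (0, 300), (270, 570), (540, 840), (810, 1000)
print(list(chunk_ranges(1000, 300, 30)))
```

The overlap gives SAM2 shared frames on both sides of a chunk boundary, so masks can be blended consistently when chunks are recombined.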

### Detection Settings

- `confidence_threshold`: YOLO detection confidence (0.1-1.0)
- `model`: YOLO model size (yolov8n, yolov8s, yolov8m); see the sketch below
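
Detection is a thin layer over Ultralytics YOLOv8. A minimal standalone sketch of how `model` and `confidence_threshold` map onto that API (the project's actual integration lives in `detector.py`; the all-zeros frame is just a stand-in):

```python
import numpy as np
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                        # corresponds to `model: yolov8n`
frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in for a decoded video frame

# COCO class 0 is "person"; `conf` plays the role of `confidence_threshold`.
results = model(frame, conf=0.5, classes=[0])
for box in results[0].boxes:
    x1, y1, x2, y2 = box.xyxy[0].tolist()
    print(f"person ({float(box.conf):.2f}): ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f})")
```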

### Matting Settings

- `use_disparity_mapping`: Enable stereo optimization (see the sketch below)
- `memory_offload`: CPU offloading for VRAM management
- `fp16`: Use FP16 precision to reduce memory usage
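
`use_disparity_mapping` exploits the fact that the two eye views see the same people at a small horizontal offset, so left-eye detections can be reused to prompt the right eye instead of running detection twice. A minimal sketch of that box transfer (the offset value and function are illustrative, not the project's API):

```python
def left_box_to_right_eye(box, disparity_px: float, eye_width: int):
    """Shift a left-eye (x1, y1, x2, y2) box horizontally to seed the right eye.

    A positive `disparity_px` moves the box left; in practice the offset
    would be estimated per subject, and is hard-coded here purely for
    illustration.
    """
    x1, y1, x2, y2 = box
    return (max(0.0, x1 - disparity_px), y1,
            min(float(eye_width), x2 - disparity_px), y2)

print(left_box_to_right_eye((820.0, 400.0, 1180.0, 1450.0),
                            disparity_px=24.0, eye_width=2880))
```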

### Output Settings

- `path`: Output file/directory path
- `format`: "alpha" for RGBA or "greenscreen" for RGB with background (see the sketch below)
- `background_color`: RGB background color for green screen mode
- `maintain_sbs`: Keep side-by-side format vs. separate per-eye outputs
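
The two `format` values differ only in how the matte is combined with each frame. A minimal NumPy sketch of both modes (shapes and function are illustrative, not the project's API):

```python
import numpy as np

def composite(frame: np.ndarray, matte: np.ndarray,
              fmt: str = "alpha", background_color=(0, 255, 0)) -> np.ndarray:
    """frame: HxWx3 uint8 RGB; matte: HxW float in [0, 1] (1 = person)."""
    if fmt == "alpha":
        # "alpha": keep the colors and store the matte in a 4th channel (RGBA).
        return np.dstack([frame, (matte * 255).astype(np.uint8)])
    # "greenscreen": blend the person over a solid background color.
    bg = np.empty_like(frame)
    bg[...] = background_color
    a = matte[..., None]
    return (frame * a + bg * (1.0 - a)).astype(np.uint8)

frame = np.full((4, 4, 3), 200, dtype=np.uint8)
matte = np.zeros((4, 4)); matte[1:3, 1:3] = 1.0
print(composite(frame, matte, fmt="greenscreen")[0, 0])  # [  0 255   0]
```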

### Hardware Settings

- `device`: "cuda" or "cpu"
- `max_vram_gb`: VRAM limit (e.g., 10 for RTX 3080); see the sketch below
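
One way a `max_vram_gb` cap can be enforced in PyTorch is a per-process memory fraction. A sketch under that assumption (not necessarily how `memory_manager.py` implements it):

```python
import torch

def apply_vram_limit(max_vram_gb: float, device: int = 0) -> None:
    """Cap this process's CUDA allocations at roughly max_vram_gb."""
    if not torch.cuda.is_available():
        return  # `device: "cpu"` path; nothing to cap
    total_bytes = torch.cuda.get_device_properties(device).total_memory
    fraction = min(1.0, (max_vram_gb * 1024**3) / total_bytes)
    torch.cuda.set_per_process_memory_fraction(fraction, device)

apply_vram_limit(10.0)  # e.g. RTX 3080
```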

## Usage Examples

### Basic Processing

```bash
# Process with default settings
vr180-matting config.yaml

# Override scale factor
vr180-matting config.yaml --scale 0.25

# Use CPU processing
vr180-matting config.yaml --device cpu
```

### Output Formats

```bash
# Alpha channel output (RGBA PNG sequence)
vr180-matting config.yaml --format alpha

# Green screen output (RGB video)
vr180-matting config.yaml --format greenscreen
```

### Memory Optimization

```bash
# Smaller chunks for limited VRAM
vr180-matting config.yaml --chunk-size 300

# Validate config without processing
vr180-matting config.yaml --dry-run
```

## Performance Guidelines

### RTX 3080 (10GB VRAM)

- **25% Scale**: ~5-8 FPS, ~6 minutes for a 30-second clip
- **50% Scale**: ~3-5 FPS, ~10 minutes for a 30-second clip
- **100% Scale**: Chunked processing, ~15-20 minutes for a 30-second clip

### Cloud GPU Scaling

- **A6000 (48GB)**: $6-8 per hour of video
- **A100 (80GB)**: $8-12 per hour of video
- **H100 (80GB)**: $6-10 per hour of video

## Troubleshooting

### Common Issues

**CUDA Out of Memory:**

- Reduce `scale_factor` (try 0.25)
- Lower `chunk_size`
- Enable `memory_offload: true`
- Use `fp16: true` (see the sketch below)
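
Of these, `fp16: true` is usually the cheapest to try. Conceptually it means running inference in half precision, which roughly halves activation memory; a generic PyTorch sketch (the project may implement this differently):

```python
import torch

if torch.cuda.is_available():
    model = torch.nn.Conv2d(3, 8, kernel_size=3).cuda().eval()
    x = torch.randn(1, 3, 64, 64, device="cuda")
    # Autocast runs eligible ops in float16, halving activation memory.
    with torch.inference_mode(), torch.autocast("cuda", dtype=torch.float16):
        y = model(x)
    print(y.dtype)  # torch.float16
```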

**No Persons Detected:**

- Lower `confidence_threshold`
- Try a larger YOLO model (yolov8s, yolov8m)
- Check input video quality

**Poor Edge Quality:**

- Increase `scale_factor` for final processing
- Reduce compression in the output format
- Enable edge refinement post-processing

### Memory Monitoring

The tool provides detailed memory usage reports:

```
VRAM Allocated: 8.2 GB
VRAM Free: 1.8 GB
VRAM Utilization: 82%
```
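
Figures like these can be reproduced with standard PyTorch CUDA queries; an illustrative sketch of how such a report might be assembled (not the tool's exact code):

```python
import torch

def vram_report(device: int = 0) -> str:
    free, total = torch.cuda.mem_get_info(device)    # bytes, whole device
    allocated = torch.cuda.memory_allocated(device)  # bytes held by this process
    gb = 1024**3
    return (f"VRAM Allocated: {allocated / gb:.1f} GB\n"
            f"VRAM Free: {free / gb:.1f} GB\n"
            f"VRAM Utilization: {(total - free) / total:.0%}")

if torch.cuda.is_available():
    print(vram_report())
```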

## Architecture

### Processing Pipeline

1. **Video Analysis**: Load metadata, analyze SBS layout
2. **Chunking**: Divide video into memory-efficient chunks
3. **Detection**: YOLOv8 person detection per chunk
4. **Matting**: SAM2 mask propagation with memory optimization
5. **VR180 Processing**: Stereo-aware matting with consistency validation
6. **Output**: Combine chunks and save in requested format
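
Read together, these stages form a single per-chunk loop. The sketch below wires up that control flow with stand-in stubs; every function here is illustrative (the real versions live in the modules listed under Project Structure), and chunk overlap and stereo handling are omitted for brevity:

```python
import numpy as np

def detect_persons(frame: np.ndarray):
    return [(8, 8, 40, 56)]                 # one fake (x1, y1, x2, y2) box

def propagate_masks(frames: np.ndarray, boxes) -> np.ndarray:
    # Stand-in for SAM2: one matte per frame.
    return np.ones((len(frames), *frames.shape[1:3]), dtype=np.float32)

def run_pipeline(total_frames: int = 120, chunk_size: int = 50):
    mattes = []
    for start in range(0, total_frames, chunk_size):        # 2. chunking
        n = min(chunk_size, total_frames - start)
        frames = np.zeros((n, 64, 64, 3), dtype=np.uint8)   # stand-in for decoded frames
        boxes = detect_persons(frames[0])                   # 3. detection seeds SAM2
        mattes.append(propagate_masks(frames, boxes))       # 4-5. mask propagation
    return np.concatenate(mattes)                           # 6. combine chunks

print(run_pipeline().shape)  # (120, 64, 64)
```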

### Memory Management

- Automatic VRAM monitoring and emergency cleanup
- CPU offloading for frame storage
- FP16 precision support
- Adaptive chunk sizing based on available memory (see the sketch below)
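
As an illustration of the last point, a chunk size can be derived from currently free VRAM and an estimated per-frame memory cost. The 60 MB/frame figure below is a placeholder, not a measurement from this project:

```python
import torch

def adaptive_chunk_size(bytes_per_frame: int = 60 * 1024**2,
                        safety_margin: float = 0.8,
                        min_chunk: int = 30) -> int:
    """Pick a chunk size that fits within currently free VRAM."""
    free_bytes, _ = torch.cuda.mem_get_info()
    usable = free_bytes * safety_margin  # keep headroom for activations
    return max(min_chunk, int(usable // bytes_per_frame))

if torch.cuda.is_available():
    print(adaptive_chunk_size())
```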

## Development

### Project Structure

```
vr180_matting/
├── config.py            # Configuration management
├── detector.py          # YOLOv8 person detection
├── sam2_wrapper.py      # SAM2 integration
├── memory_manager.py    # VRAM optimization
├── video_processor.py   # Base video processing
├── vr180_processor.py   # VR180-specific processing
└── main.py              # CLI entry point
```

### Contributing

1. Fork the repository
2. Create a feature branch
3. Make changes with tests
4. Submit a pull request

## License

[License information]

## Acknowledgments

- SAM2 team for the segmentation model
- Ultralytics for YOLOv8 detection
- Research referenced in `research.md`