VR180 Human Matting with Det-SAM2

A proof-of-concept implementation for automated human matting on VR180 3D side-by-side equirectangular video using Det-SAM2 and YOLOv8 detection.

Features

  • Automatic Person Detection: Uses YOLOv8 to eliminate manual point selection
  • VRAM Optimization: Memory management for RTX 3080 (10GB) compatibility
  • VR180-Specific Processing: Side-by-side stereo handling with disparity mapping
  • Flexible Scaling: 25%, 50%, or 100% processing resolution with AI upscaling
  • Multiple Output Formats: Alpha channel or green screen background
  • Chunked Processing: Handles long videos with memory-efficient chunking
  • Cloud GPU Ready: Docker containerization for RunPod, Vast.ai deployment

Installation

# Clone repository
git clone <repository-url>
cd sam2e

# Install dependencies
pip install -r requirements.txt

# Install in development mode
pip install -e .

Quick Start

  1. Generate an example configuration:
vr180-matting --generate-config config.yaml
  2. Edit the configuration file:
input:
  video_path: "path/to/your/vr180_video.mp4"

processing:
  scale_factor: 0.5  # Start with 50% for testing

output:
  path: "output/matted_video.mp4"
  format: "alpha"    # or "greenscreen"
  3. Process the video:
vr180-matting config.yaml

Configuration

Input Settings

  • video_path: Path to VR180 side-by-side video file

Processing Settings

  • scale_factor: Resolution scaling (0.25, 0.5, 1.0)
  • chunk_size: Frames per chunk (0 for auto-calculation)
  • overlap_frames: Frame overlap between chunks
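The interaction of chunk_size and overlap_frames can be sketched as follows (a minimal illustration; plan_chunks is a hypothetical helper, not part of the tool's API):

```python
def plan_chunks(total_frames: int, chunk_size: int, overlap: int) -> list:
    """Split [0, total_frames) into chunks of `chunk_size` frames, with each
    new chunk re-reading `overlap` frames from the previous one so that mask
    propagation stays continuous across the seam."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    chunks, start = [], 0
    while start < total_frames:
        end = min(start + chunk_size, total_frames)
        chunks.append((start, end))
        if end == total_frames:
            break
        start = end - overlap  # step back so chunks share `overlap` frames
    return chunks

print(plan_chunks(1000, 300, 30))
# -> [(0, 300), (270, 570), (540, 840), (810, 1000)]
```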

Detection Settings

  • confidence_threshold: YOLO detection confidence (0.1-1.0)
  • model: YOLO model size (yolov8n, yolov8s, yolov8m)
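The effect of confidence_threshold can be sketched with hypothetical detection tuples (a simplification of the filtering step; in COCO class ordering, class 0 is "person"):

```python
# Each detection: (class_id, confidence, bbox). Class 0 is "person" in COCO.
def filter_person_detections(detections, confidence_threshold=0.5):
    """Keep only person detections at or above the confidence threshold."""
    return [d for d in detections if d[0] == 0 and d[1] >= confidence_threshold]

raw = [(0, 0.91, (10, 20, 50, 120)),   # confident person
       (0, 0.32, (200, 30, 40, 100)),  # low-confidence person, dropped
       (16, 0.88, (5, 5, 30, 30))]     # dog (class 16), dropped
print(filter_person_detections(raw, 0.5))
# -> [(0, 0.91, (10, 20, 50, 120))]
```

Lowering the threshold (e.g. to 0.3) would keep the second detection as well, which is why the troubleshooting section suggests reducing it when no persons are found.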

Matting Settings

  • use_disparity_mapping: Enable stereo optimization
  • memory_offload: CPU offloading for VRAM management
  • fp16: Use FP16 precision to reduce memory usage

Output Settings

  • path: Output file/directory path
  • format: "alpha" for RGBA or "greenscreen" for RGB with background
  • background_color: RGB background color for green screen mode
  • maintain_sbs: Keep side-by-side format vs separate eye outputs
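The difference between the two formats is whether the matte is stored as an alpha channel or baked into the pixels. The green-screen mode amounts to alpha compositing over a solid color; a minimal sketch, assuming the matte is an HxW float array in [0, 1]:

```python
import numpy as np

def composite_greenscreen(rgb, alpha, background_color=(0, 255, 0)):
    """Blend the matted foreground over a solid background colour
    (the "greenscreen" output mode); `alpha` is an HxW matte in [0, 1]."""
    bg = np.empty_like(rgb)
    bg[:] = background_color
    a = alpha[..., None]  # broadcast the matte over the colour channels
    return (a * rgb + (1.0 - a) * bg).astype(np.uint8)

frame = np.full((2, 2, 3), 200, dtype=np.uint8)
matte = np.array([[1.0, 0.0], [0.0, 1.0]])
out = composite_greenscreen(frame, matte)
# opaque pixels keep the frame colour; transparent ones become pure green
```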

Hardware Settings

  • device: "cuda" or "cpu"
  • max_vram_gb: VRAM limit (e.g., 10 for RTX 3080)
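Putting the sections together, a complete file might look like the sketch below. The top-level section names (detection, matting, hardware, etc.) follow the groupings above but are assumptions about the exact schema, and all values are illustrative:

```yaml
input:
  video_path: "path/to/your/vr180_video.mp4"

processing:
  scale_factor: 0.5
  chunk_size: 0          # 0 = auto-calculate from available memory
  overlap_frames: 30

detection:
  confidence_threshold: 0.5
  model: "yolov8n"

matting:
  use_disparity_mapping: true
  memory_offload: true
  fp16: true

output:
  path: "output/matted_video.mp4"
  format: "alpha"            # or "greenscreen"
  background_color: [0, 255, 0]
  maintain_sbs: true

hardware:
  device: "cuda"
  max_vram_gb: 10
```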

Usage Examples

Basic Processing

# Process with default settings
vr180-matting config.yaml

# Override scale factor
vr180-matting config.yaml --scale 0.25

# Use CPU processing
vr180-matting config.yaml --device cpu

Output Formats

# Alpha channel output (RGBA PNG sequence)
vr180-matting config.yaml --format alpha

# Green screen output (RGB video)
vr180-matting config.yaml --format greenscreen

Memory Optimization

# Smaller chunks for limited VRAM
vr180-matting config.yaml --chunk-size 300

# Validate config without processing
vr180-matting config.yaml --dry-run

Performance Guidelines

RTX 3080 (10GB VRAM)

  • 25% Scale: ~5-8 FPS, about 6 minutes for a 30-second clip
  • 50% Scale: ~3-5 FPS, about 10 minutes for a 30-second clip
  • 100% Scale: chunked processing, 15-20 minutes for a 30-second clip

Cloud GPU Scaling

  • A6000 (48GB): ~$6-8 per hour of source video
  • A100 (80GB): ~$8-12 per hour of source video
  • H100 (80GB): ~$6-10 per hour of source video

Troubleshooting

Common Issues

CUDA Out of Memory:

  • Reduce scale_factor (try 0.25)
  • Lower chunk_size
  • Enable memory_offload: true
  • Use fp16: true

No Persons Detected:

  • Lower confidence_threshold
  • Try larger YOLO model (yolov8s, yolov8m)
  • Check input video quality

Poor Edge Quality:

  • Increase scale_factor for final processing
  • Reduce compression in output format
  • Enable edge refinement post-processing

Memory Monitoring

The tool provides detailed memory usage reports:

VRAM Allocated: 8.2 GB
VRAM Free: 1.8 GB  
VRAM Utilization: 82%
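The numbers in the report relate by simple bookkeeping (a sketch; the real tool would read allocation figures from CUDA memory statistics, and vram_report is a hypothetical helper):

```python
def vram_report(allocated_gb: float, total_gb: float) -> str:
    """Format a memory report in the style shown above."""
    free = total_gb - allocated_gb
    utilization = 100.0 * allocated_gb / total_gb
    return (f"VRAM Allocated: {allocated_gb:.1f} GB\n"
            f"VRAM Free: {free:.1f} GB\n"
            f"VRAM Utilization: {utilization:.0f}%")

print(vram_report(8.2, 10.0))
```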

Architecture

Processing Pipeline

  1. Video Analysis: Load metadata, analyze SBS layout
  2. Chunking: Divide video into memory-efficient chunks
  3. Detection: YOLOv8 person detection per chunk
  4. Matting: SAM2 mask propagation with memory optimization
  5. VR180 Processing: Stereo-aware matting with consistency validation
  6. Output: Combine chunks and save in requested format
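Step 5's consistency validation can be illustrated with a simple mask-overlap check (a sketch only: it ignores the disparity shift between eyes, and the actual criterion and threshold are implementation details):

```python
import numpy as np

def masks_consistent(left_mask, right_mask, iou_threshold=0.8):
    """Compare left- and right-eye mattes by intersection-over-union.
    Low overlap suggests the two eyes disagree and the frame needs review."""
    left = np.asarray(left_mask, dtype=bool)
    right = np.asarray(right_mask, dtype=bool)
    union = np.logical_or(left, right).sum()
    if union == 0:
        return True  # both masks empty: trivially consistent
    iou = np.logical_and(left, right).sum() / union
    return iou >= iou_threshold

a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True
b = np.zeros((4, 4), dtype=bool); b[1:3, 1:4] = True
print(masks_consistent(a, b))  # IoU = 4/6 ≈ 0.67, below the 0.8 threshold
```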

Memory Management

  • Automatic VRAM monitoring and emergency cleanup
  • CPU offloading for frame storage
  • FP16 precision support
  • Adaptive chunk sizing based on available memory
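Adaptive chunk sizing reduces to budgeting frames against free memory. A minimal sketch, where the per-frame cost would in practice be measured and safety_margin / min_frames are assumed parameters:

```python
def adaptive_chunk_size(free_vram_mb: int, frame_mem_mb: int,
                        safety_margin: float = 0.8, min_frames: int = 30) -> int:
    """Choose how many frames fit in the VRAM budget, keeping a safety
    margin and never dropping below a workable minimum chunk length."""
    budget_mb = free_vram_mb * safety_margin
    return max(min_frames, int(budget_mb // frame_mem_mb))

# e.g. 8000 MB free, ~10 MB of VRAM per frame at the working resolution
print(adaptive_chunk_size(8000, 10))  # -> 640
```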

Development

Project Structure

vr180_matting/
├── config.py           # Configuration management
├── detector.py         # YOLOv8 person detection
├── sam2_wrapper.py     # SAM2 integration
├── memory_manager.py   # VRAM optimization
├── video_processor.py  # Base video processing
├── vr180_processor.py  # VR180-specific processing
└── main.py             # CLI entry point

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make changes with tests
  4. Submit a pull request

License

[License information]

Acknowledgments

  • SAM2 team for the segmentation model
  • Ultralytics for YOLOv8 detection
  • Research referenced in research.md