# RunPod Deployment Guide

## Quick Start (Recommended)

### 1. Create RunPod Instance

- **Template**: `runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04`
- **GPU**: NVIDIA A40 (48GB VRAM)
- **Storage**: 50GB+ (for videos and models)
- **Persistent Storage**: Recommended for model caching

### 2. Connect and Setup

```bash
# After SSH/Terminal access
cd /workspace

# Clone your repository
git clone https://github.com/YOUR_USERNAME/sam2e.git
cd sam2e

# Run setup script (already executable in git)
./runpod_setup.sh

# Install SAM2 separately (not on PyPI)
pip install git+https://github.com/facebookresearch/segment-anything-2.git
```

### 3. Upload Your Video

```bash
# Make sure the target directory exists
mkdir -p /workspace/data

# Option 1: wget from URL
wget -O /workspace/data/myvideo.mp4 "https://your-video-url.com/video.mp4"

# Option 2: Use RunPod's file browser
# Upload to /workspace/data/

# Option 3: rclone from cloud storage
rclone copy remote:path/to/video.mp4 /workspace/data/
```

### 4. Configure and Run

```bash
# Use RunPod-optimized config
cp config_runpod.yaml config.yaml

# Edit video path
nano config.yaml  # Update video_path

# Run processing
vr180-matting config.yaml
```

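If you would rather not open an editor on the pod, the `video_path` edit can be scripted. The following is a minimal sketch using only the standard library. It assumes the config contains a single `video_path:` line; the helper name `set_video_path` is hypothetical and not part of `vr180-matting`. Any trailing comment on that line is dropped.

```python
import re


def set_video_path(yaml_text: str, new_path: str) -> str:
    """Return yaml_text with the first `video_path:` value replaced by new_path."""
    # Capture the indentation and key so the surrounding layout is preserved.
    pattern = re.compile(r"^([ \t]*video_path[ \t]*:[ \t]*).*$", re.MULTILINE)
    new_text, count = pattern.subn(
        lambda m: f'{m.group(1)}"{new_path}"', yaml_text, count=1
    )
    if count == 0:
        raise KeyError("no video_path key found in config text")
    return new_text


# Example usage (commented out so the sketch has no side effects):
# from pathlib import Path
# cfg = Path("config.yaml")
# cfg.write_text(set_video_path(cfg.read_text(), "/workspace/data/myvideo.mp4"))
```

A plain text substitution is used here instead of a YAML parser so that comments and key order elsewhere in the file stay untouched.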
## Advanced Docker Deployment

### Option 1: Pre-built Docker Image

```bash
# Build and push to Docker Hub (do this locally first)
docker build -t yourusername/vr180-matting:latest .
docker push yourusername/vr180-matting:latest

# On RunPod, use your image as template
```

### Option 2: Build on RunPod

```bash
cd /workspace/sam2e
docker build -t vr180-matting .
docker run --gpus all -v /workspace/data:/app/data -v /workspace/output:/app/output -it vr180-matting
```

## Performance Tips for A40

### Optimal Settings

```yaml
processing:
  scale_factor: 0.75  # A40 can handle higher resolution
  chunk_size: 0       # Let it auto-calculate for 48GB

detection:
  model: "yolov8m"    # or yolov8l for better accuracy

matting:
  memory_offload: false  # Plenty of VRAM
  fp16: true             # Still use FP16 for speed
```

### Expected Performance

- **50% scale**: ~10-15 FPS processing
- **75% scale**: ~6-10 FPS processing
- **100% scale**: ~4-6 FPS processing

### Memory Monitoring

```bash
# Watch GPU usage while processing
watch -n 1 nvidia-smi

# Or in another terminal
nvidia-smi dmon -s um
```

## Batch Processing Script

Create `batch_process.py` to process multiple videos:

```python
from pathlib import Path

from vr180_matting.config import VR180Config
from vr180_matting.vr180_processor import VR180Processor

# Directory setup
input_dir = Path("/workspace/data/videos")
output_dir = Path("/workspace/output")
output_dir.mkdir(parents=True, exist_ok=True)  # Create the output directory if missing
base_config = "config_runpod.yaml"

# Process videos in a stable, alphabetical order
for video_file in sorted(input_dir.glob("*.mp4")):
    print(f"Processing: {video_file.name}")

    # Load base config
    config = VR180Config.from_yaml(base_config)

    # Update paths
    config.input.video_path = str(video_file)
    config.output.path = str(output_dir / f"matted_{video_file.stem}.mp4")

    # Process
    processor = VR180Processor(config)
    processor.process_video()

    print(f"Completed: {video_file.name}\n")
```

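When a batch run is interrupted, it helps to skip videos that already have an output file. Below is a minimal sketch of that check. It assumes the `matted_<name>.mp4` naming used in the batch script; `pending_videos` is a hypothetical helper, not part of `vr180_matting`. A non-empty output file is treated as finished, which will not catch a file that was only partially written.

```python
from pathlib import Path


def pending_videos(input_dir, output_dir, prefix="matted_"):
    """Return input videos that do not yet have a non-empty output file."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pending = []
    for video in sorted(input_dir.glob("*.mp4")):
        target = output_dir / f"{prefix}{video.stem}.mp4"
        # Treat a missing or zero-byte output as "not done yet".
        if not (target.exists() and target.stat().st_size > 0):
            pending.append(video)
    return pending


# Example usage (paths taken from the batch script):
# for video_file in pending_videos("/workspace/data/videos", "/workspace/output"):
#     print(f"Still to process: {video_file.name}")
```

Iterating over `pending_videos(...)` instead of the raw `glob` result lets a restarted job resume where it left off.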
## Cost Optimization

### RunPod Spot Instances

- A40 spot: ~$0.44/hour vs. ~$0.79/hour on-demand
- Well suited to batch processing, where an interrupted job can be restarted
- Persistent storage: ~$0.10/GB/month, useful for keeping models cached

### Processing Time Estimates

Costs below assume the ~$0.44/hour spot rate:

- 30s clip @ 50% scale: ~10 minutes = ~$0.07
- 1 hour video @ 50% scale: ~12 hours = ~$5.28
- 1 hour video @ 25% scale: ~6 hours = ~$2.64

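The cost figures above are simply processing hours multiplied by the hourly rate. As a rough sketch, the same arithmetic can be repeated for other durations or for the on-demand rate. The rates are the approximate ones quoted in this guide, and `estimate_cost` is a hypothetical helper name.

```python
def estimate_cost(hours: float, hourly_rate: float) -> float:
    """Return the compute cost in dollars, rounded to cents."""
    if hours < 0 or hourly_rate < 0:
        raise ValueError("hours and hourly_rate must be non-negative")
    return round(hours * hourly_rate, 2)


SPOT_RATE = 0.44       # approximate A40 spot price per hour
ON_DEMAND_RATE = 0.79  # approximate A40 on-demand price per hour

jobs = [
    ("30 s clip @ 50% scale", 10 / 60),
    ("1 h video @ 50% scale", 12),
    ("1 h video @ 25% scale", 6),
]
for label, hours in jobs:
    spot = estimate_cost(hours, SPOT_RATE)
    on_demand = estimate_cost(hours, ON_DEMAND_RATE)
    print(f"{label}: spot ${spot:.2f}, on-demand ${on_demand:.2f}")
```

Storage charges are billed separately and are not included in this estimate.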
### Auto-shutdown Script

```bash
# Add to end of processing script
echo "Processing complete, shutting down in 60 seconds..."
sleep 60
# RUNPOD_POD_ID is set by RunPod inside the pod
runpodctl stop pod "$RUNPOD_POD_ID"
```

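A shell snippet appended to a script only runs if the script reaches its last line. One way to stop the pod even when processing fails is to wrap both steps in Python. This is a sketch, not part of the project: `run_then_stop` is a hypothetical helper, and the stop command shown in the usage comment is an assumption about the pod environment, so substitute whichever stop command works on your pod.

```python
import subprocess
import sys


def run_then_stop(process_cmd, stop_cmd):
    """Run process_cmd, then always run stop_cmd; return process_cmd's exit code."""
    try:
        result = subprocess.run(process_cmd)
        return result.returncode
    finally:
        # Runs whether processing succeeded, failed, or raised an exception.
        subprocess.run(stop_cmd)


# Example usage (assumed stop command; commented out to avoid side effects):
# import os
# code = run_then_stop(
#     ["vr180-matting", "config.yaml"],
#     ["runpodctl", "stop", "pod", os.environ["RUNPOD_POD_ID"]],
# )
# sys.exit(code)
```

Stopping the pod unconditionally avoids paying for idle GPU time after a crash, at the cost of having to restart the pod to inspect what went wrong.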
## Troubleshooting

### SAM2 Model Download Issues

```bash
# Manual download
cd /workspace/sam2e/models
wget https://dl.fbaipublicfiles.com/segment_anything_2/sam2_hiera_large.pt
```

### CUDA Version Mismatch

```bash
# Check CUDA version
nvcc --version
python -c "import torch; print(torch.version.cuda)"

# Reinstall PyTorch if needed; pick the wheel index that matches your CUDA version (cu121 shown)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```

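Comparing the two version strings by eye is easy to get wrong. The sketch below automates the comparison. It assumes the usual `release X.Y` text in `nvcc --version` output and the `X.Y` string in `torch.version.cuda` (which is `None` on CPU-only builds). The function names are hypothetical.

```python
import re


def nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def torch_cuda_release(torch_cuda):
    """Convert a torch.version.cuda string such as '12.4' to (major, minor)."""
    if not torch_cuda:
        return None  # CPU-only PyTorch builds report None
    major, minor = torch_cuda.split(".")[:2]
    return (int(major), int(minor))


def describe_mismatch(nvcc_output, torch_cuda):
    """Classify the relationship between toolkit and PyTorch CUDA versions."""
    toolkit = nvcc_release(nvcc_output)
    wheel = torch_cuda_release(torch_cuda)
    if toolkit is None or wheel is None:
        return "unknown"
    if toolkit == wheel:
        return "match"
    if toolkit[0] == wheel[0]:
        return "minor-mismatch"
    return "major-mismatch"


# Example usage on the pod (commented out; requires nvcc and torch):
# import subprocess, torch
# out = subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout
# print(describe_mismatch(out, torch.version.cuda))
```

A minor difference is frequently tolerated by prebuilt wheels, while extensions compiled from source tend to be stricter, so treat `minor-mismatch` as a hint to investigate and not as a definite failure.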
### Out of Memory on A40

```yaml
# Unlikely, but if it happens:
processing:
  scale_factor: 0.25
  chunk_size: 300
matting:
  memory_offload: true
```

## Monitoring Dashboard

Create `monitor.py` for real-time stats (requires the `psutil` and `GPUtil` packages):

```python
import time

import GPUtil
import psutil

# Print GPU, RAM, and CPU usage every 2 seconds; stop with Ctrl+C
while True:
    # GPU stats
    gpus = GPUtil.getGPUs()
    for gpu in gpus:
        print(f"GPU: {gpu.memoryUsed}MB / {gpu.memoryTotal}MB ({gpu.memoryUtil*100:.1f}%)")

    # CPU/RAM stats
    print(f"RAM: {psutil.virtual_memory().percent}%")
    print(f"CPU: {psutil.cpu_percent()}%")
    print("-" * 40)
    time.sleep(2)
```

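If installing extra packages on the pod is inconvenient, similar GPU numbers can be read from `nvidia-smi` directly using only the standard library. The sketch below relies on the `--query-gpu` and `--format=csv,noheader,nounits` options of `nvidia-smi`. The function names are hypothetical, and fields that a GPU reports as `[N/A]` are not handled.

```python
import subprocess

# One CSV line per GPU: memory used (MiB), memory total (MiB), utilization (%)
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_stats(csv_text):
    """Parse nvidia-smi CSV output into a list of per-GPU dictionaries."""
    stats = []
    for line in csv_text.strip().splitlines():
        used, total, util = (int(field.strip()) for field in line.split(","))
        stats.append({"used_mb": used, "total_mb": total, "util_pct": util})
    return stats


def query_gpus():
    """Run nvidia-smi and return parsed stats; raises if the command fails."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_stats(out)


# Example usage on a machine with an NVIDIA driver (commented out):
# for i, gpu in enumerate(query_gpus()):
#     print(f"GPU {i}: {gpu['used_mb']}MB / {gpu['total_mb']}MB, {gpu['util_pct']}% busy")
```

Keeping the parsing separate from the subprocess call makes it possible to check the parser on a machine without a GPU.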
## Quick Commands Reference

```bash
# Test on short clip
vr180-matting config.yaml --dry-run

# Process with monitoring
vr180-matting config.yaml --verbose

# Override settings
vr180-matting config.yaml --scale 0.25 --format greenscreen

# Generate config
vr180-matting --generate-config my_config.yaml
```