Plan: Separate Left/Right Eye Processing for VR180 SAM2 Pipeline
Overview
Implement a new processing mode that splits VR180 side-by-side frames into separate left and right halves, processes each eye independently through SAM2, then recombines them into the final output. This should improve tracking accuracy by removing parallax confusion between eyes.
Key Changes Required
1. Configuration Updates
File: config.yaml
- Add new configuration option:
  - `processing.separate_eye_processing: false` (default off for backward compatibility)
- Add related options (example below):
  - `processing.enable_greenscreen_fallback: true` (render full green if no humans detected)
  - `processing.eye_overlap_pixels: 0` (optional overlap for blending)
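For reference, the new block in config.yaml might look like this (key names and defaults as listed above; the `processing:` nesting is an assumption about the existing config layout):

```yaml
processing:
  separate_eye_processing: false     # off by default for backward compatibility
  enable_greenscreen_fallback: true  # render full green if no humans detected
  eye_overlap_pixels: 0              # optional overlap for mask blending
```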
2. Core SAM2 Processor Enhancements
File: core/sam2_processor.py
New Methods:
- `split_frame_into_eyes(frame) -> (left_frame, right_frame)`
- `split_video_into_eyes(video_path, left_output, right_output, scale)`
- `process_single_eye_segment(segment_info, eye_side, yolo_prompts, previous_masks, inference_scale)`
- `combine_eye_masks(left_masks, right_masks, full_frame_shape) -> combined_masks`
- `create_greenscreen_segment(segment_info, duration_seconds) -> bool`
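A minimal sketch of the two geometry helpers above, assuming the standard VR180 side-by-side layout (left eye in the left half) and, for simplicity, a single 2D mask per eye; the real methods would likely operate on per-frame mask dictionaries:

```python
import numpy as np

def split_frame_into_eyes(frame):
    """Split a side-by-side VR180 frame into (left, right) halves."""
    half = frame.shape[1] // 2
    return frame[:, :half], frame[:, half:]

def combine_eye_masks(left_masks, right_masks, full_frame_shape):
    """Place per-eye masks back into a single full-frame mask.

    A missing eye (None) stays empty, which downstream code can render
    as greenscreen for that half.
    """
    height, width = full_frame_shape[:2]
    combined = np.zeros((height, width), dtype=np.uint8)
    if left_masks is not None:
        combined[:, : width // 2] = left_masks
    if right_masks is not None:
        combined[:, width // 2 :] = right_masks
    return combined
```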
Modified Methods:
- `process_single_segment()` - Add branch for separate eye processing mode
- New processing flow:
  - Check if `separate_eye_processing` is enabled
  - If enabled: split segment video into left/right eye videos
  - Process each eye independently with SAM2
  - Combine masks back to full frame format
  - If fallback needed: create full greenscreen segment
3. YOLO Detector Enhancements
File: core/yolo_detector.py
New Methods:
- `detect_humans_in_single_eye(frame, eye_side) -> List[Dict]`
- `convert_eye_detections_to_sam2_prompts(detections, eye_side) -> List[Dict]`
- `has_any_detections(detections_list) -> bool`
Modified Methods:
- `detect_humans_in_video_first_frame()` - Add eye-specific detection support
- Object ID assignment: always use `obj_id=1` for single-eye processing, since each eye is processed independently (see the sketch below)
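An illustrative sketch of the detection-to-prompt conversion; the `'bbox'` key and the explicit `full_frame_width` parameter are assumptions of this sketch (the planned signature omits the width, presumably reading it from detector state):

```python
from typing import Dict, List

def convert_eye_detections_to_sam2_prompts(
        detections: List[Dict], eye_side: str, full_frame_width: int) -> List[Dict]:
    """Map full-frame detections into eye-local SAM2 box prompts."""
    half = full_frame_width // 2
    prompts = []
    for det in detections:
        x1, y1, x2, y2 = det['bbox']
        if eye_side == 'right':
            # Shift full-frame x coordinates into the right-eye crop.
            x1, x2 = x1 - half, x2 - half
        prompts.append({
            'obj_id': 1,  # per the plan: always obj_id=1 in single-eye mode
            'box': [x1, y1, x2, y2],
        })
    return prompts
```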
4. Mask Processor Updates
File: core/mask_processor.py
New Methods:
- `create_full_greenscreen_frame(frame_shape) -> np.ndarray` (sketched below)
- `process_greenscreen_only_segment(segment_info, frame_count) -> bool`
Modified Methods:
- `apply_green_mask()` - Handle combined eye masks properly
- Add support for the full-greenscreen fallback when no humans are detected
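A minimal sketch of the greenscreen frame helper, assuming OpenCV-style BGR channel ordering:

```python
import numpy as np

# Chroma-key green in OpenCV's BGR ordering (an assumption of this sketch).
GREEN_BGR = (0, 255, 0)

def create_full_greenscreen_frame(frame_shape):
    """Return a solid chroma-green frame matching frame_shape (h, w[, c])."""
    height, width = frame_shape[:2]
    frame = np.zeros((height, width, 3), dtype=np.uint8)
    frame[:] = GREEN_BGR
    return frame
```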
5. Main Pipeline Integration
File: main.py
Processing Flow Changes:
```python
# For each segment:
if config.get('processing.separate_eye_processing', False):
    # 1. Run YOLO on the full frame to check for ANY human presence
    full_frame_detections = detector.detect_humans_in_video_first_frame(segment_video)
    if not full_frame_detections:
        # No humans detected anywhere - create full greenscreen segment
        success = mask_processor.process_greenscreen_only_segment(
            segment_info, expected_frame_count)
        continue

    # 2. Split detections by eye and process separately
    left_detections = [d for d in full_frame_detections if is_in_left_half(d, frame_width)]
    right_detections = [d for d in full_frame_detections if is_in_right_half(d, frame_width)]

    # 3. Process left eye (if detections exist)
    left_masks = None
    if left_detections:
        left_eye_prompts = detector.convert_eye_detections_to_sam2_prompts(left_detections, 'left')
        left_masks = sam2_processor.process_single_eye_segment(
            segment_info, 'left', left_eye_prompts, previous_left_masks, inference_scale)

    # 4. Process right eye (if detections exist)
    right_masks = None
    if right_detections:
        right_eye_prompts = detector.convert_eye_detections_to_sam2_prompts(right_detections, 'right')
        right_masks = sam2_processor.process_single_eye_segment(
            segment_info, 'right', right_eye_prompts, previous_right_masks, inference_scale)

    # 5. Combine masks back to full frame format
    # (explicit None checks: numpy arrays don't support bare truthiness)
    if left_masks is not None or right_masks is not None:
        combined_masks = sam2_processor.combine_eye_masks(left_masks, right_masks, full_frame_shape)
        # Continue with normal mask processing...
    else:
        # Neither eye had trackable humans - full greenscreen fallback
        success = mask_processor.process_greenscreen_only_segment(
            segment_info, expected_frame_count)
else:
    # Original processing mode (current behavior)
    pass  # ... existing logic unchanged
```
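The `is_in_left_half` / `is_in_right_half` helpers referenced above could be as simple as a bounding-box center test (a sketch; the `'bbox'` key is an assumed detection schema, and a center test assigns a seam-straddling detection to exactly one eye):

```python
def is_in_left_half(detection, frame_width):
    """True if the detection's bbox center falls in the left eye's half."""
    x1, _, x2, _ = detection['bbox']
    return (x1 + x2) / 2 < frame_width / 2

def is_in_right_half(detection, frame_width):
    return not is_in_left_half(detection, frame_width)
```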
6. File Structure Changes
New Files:
- `core/eye_processor.py` - Dedicated class for eye-specific operations
- `utils/video_utils.py` - Video manipulation utilities (splitting, combining)
Modified Files:
- All core processing modules as detailed above
- Update logging to distinguish left/right eye processing
- Update debug frame generation for eye-specific visualization
7. Debug and Monitoring Enhancements
Debug Outputs:
- `left_eye_debug.jpg` - Left eye YOLO detections
- `right_eye_debug.jpg` - Right eye YOLO detections
- `left_eye_sam2_masks.jpg` - Left eye SAM2 results
- `right_eye_sam2_masks.jpg` - Right eye SAM2 results
- `combined_masks_debug.jpg` - Final combined result
Logging Enhancements:
- Clear distinction between left/right eye processing stages
- Per-eye performance metrics
- Fallback trigger logging when no humans detected
8. Performance Considerations
Optimizations:
- Parallel Processing: Process left and right eyes simultaneously using threading (see the sketch after this list)
- Selective Processing: Skip SAM2 for eyes with no YOLO detections
- Memory Management: Clean up intermediate eye videos promptly
- Caching: Cache split eye videos if processing multiple segments
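One possible shape for the parallel-processing optimization, using a two-worker thread pool (a sketch; on a single GPU, SAM2 inference may largely serialize, so the threads mainly overlap I/O and pre/post-processing):

```python
from concurrent.futures import ThreadPoolExecutor

def process_eyes_in_parallel(sam2_processor, segment_info, prompts_by_eye,
                             previous_masks_by_eye, inference_scale):
    """Run left/right eye SAM2 passes concurrently; skips eyes with no prompts."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {
            eye: pool.submit(
                sam2_processor.process_single_eye_segment,
                segment_info, eye, prompts_by_eye[eye],
                previous_masks_by_eye.get(eye), inference_scale)
            for eye in ('left', 'right') if prompts_by_eye.get(eye)
        }
        return {eye: future.result() for eye, future in futures.items()}
```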
Resource Usage:
- Memory: ~2x peak usage during eye processing (temporary)
- Storage: Temporary left/right eye videos (~1.5x original size)
- Compute: Potentially faster overall, since each eye pass processes half-width frames
9. Backward Compatibility
Default Behavior:
- `separate_eye_processing: false` by default
- Existing configurations work unchanged
- All current functionality preserved
Migration Path:
- Users can gradually test new mode on problematic segments
- Configuration flag allows easy A/B testing
- Existing debug outputs remain functional
10. Error Handling and Fallbacks
Robust Error Recovery:
- If eye splitting fails → fall back to original processing
- If single-eye SAM2 fails → use greenscreen for that eye (see the sketch after this list)
- If both eyes fail → full greenscreen segment
- Comprehensive logging of all fallback triggers
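One way to express the per-eye fallback, sketched under the assumption that `process_single_eye_segment` raises on failure:

```python
import logging

logger = logging.getLogger(__name__)

def process_eye_with_fallback(sam2_processor, segment_info, eye_side,
                              prompts, previous_masks, inference_scale):
    """Run single-eye SAM2; on failure, return None so the caller renders
    greenscreen for that eye (and, if both eyes fail, the whole segment)."""
    try:
        return sam2_processor.process_single_eye_segment(
            segment_info, eye_side, prompts, previous_masks, inference_scale)
    except Exception:
        logger.exception("SAM2 failed for %s eye; falling back to greenscreen",
                         eye_side)
        return None
```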
Quality Validation:
- Verify combined masks have plausible foreground pixel counts (see the sketch after this list)
- Check for mask alignment issues between eyes
- Validate segment completeness before marking done
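A hedged sketch of the pixel-count check; the thresholds are illustrative and would need tuning per content:

```python
import numpy as np

def masks_look_reasonable(combined_mask, min_fraction=0.0005, max_fraction=0.6):
    """Flag masks whose foreground coverage is implausibly small or large."""
    fraction = np.count_nonzero(combined_mask) / combined_mask.size
    return min_fraction <= fraction <= max_fraction
```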
Implementation Priority
Phase 1 (Core Functionality)
- Configuration schema updates
- Basic eye splitting and recombining logic
- Modified SAM2 processor with separate eye support
- Greenscreen fallback implementation
Phase 2 (Integration)
- Main pipeline integration with new processing mode
- YOLO detector eye-specific enhancements
- Mask processor updates for combined masks
- Basic error handling and fallbacks
Phase 3 (Polish)
- Performance optimizations (parallel processing)
- Enhanced debug outputs and logging
- Comprehensive testing and validation
- Documentation updates
Expected Benefits
Tracking Improvements:
- Eliminated Parallax Confusion: SAM2 processes a single viewpoint per eye
- Better Object Consistency: Single object tracking per eye view
- Improved Temporal Coherence: Less cross-eye interference
- Reduced False Positives: Eye-specific context for tracking
Operational Benefits:
- Graceful Degradation: Full greenscreen when no humans are detected
- Flexible Processing: Can enable/disable per pipeline
- Better Debug Visibility: Eye-specific debug outputs
- Performance Scalability: Smaller frames = faster processing per eye
This plan maintains full backward compatibility while adding the requested separate eye processing capability with robust fallback mechanisms.