@danielaskdd

Add Multimodal Processing Status Support to DocProcessingStatus

Summary

This PR adds support for tracking multimodal processing status in document metadata and implements automatic status conversion logic to ensure accurate document state representation with RayAnything.

Problem

RayAnything uses LightRAG to process plain text data and adds a multimodal_processed field to doc_status storage. When multimodal_processed is False, additional multimodal processing is still required. The current system treats these documents as fully PROCESSED, which does not reflect their actual processing state.

Solution

Added a multimodal_processed field to the DocProcessingStatus class, with automatic status conversion logic:

Key Changes:

  • Added multimodal_processed: bool | None field to DocProcessingStatus (with repr=False to keep it internal)
  • Implemented __post_init__ method to handle status conversion
  • Field is kept in the object but hidden from repr() output
  • Removed multimodal_processed from the DocStatus enum values
  • Updated the UI filter logic

Business Logic:

  • When multimodal_processed=False AND status=PROCESSED, the status is automatically converted to PREPROCESSED
  • When multimodal_processed is True or None, no status conversion occurs
  • Only affects documents in PROCESSED status
  • Other statuses (PENDING, PROCESSING, FAILED) are not affected

Technical Details

Implementation:

@dataclass
class DocProcessingStatus:
    # ... existing fields ...
    # Internal flag; stays None when the stored record has no such field
    multimodal_processed: bool | None = field(default=None, repr=False)

    def __post_init__(self):
        # Only an explicit False on a PROCESSED document triggers the conversion
        if self.multimodal_processed is not None:
            if self.multimodal_processed is False and self.status == DocStatus.PROCESSED:
                self.status = DocStatus.PREPROCESSED
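
To try the conversion rule in isolation, the following is a minimal runnable sketch. It assumes a simplified DocStatus enum (the member values are guesses) and models only two fields of the real class (file_path and status); it is not the project's actual definition.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class DocStatus(str, Enum):
    # Simplified stand-in for the project's enum; the string values are assumed.
    PENDING = "pending"
    PROCESSING = "processing"
    PREPROCESSED = "preprocessed"
    PROCESSED = "processed"
    FAILED = "failed"


@dataclass
class DocProcessingStatus:
    # Hypothetical reduced field set; the real class carries more fields.
    file_path: str
    status: DocStatus
    multimodal_processed: bool | None = field(default=None, repr=False)

    def __post_init__(self):
        # Downgrade PROCESSED to PREPROCESSED only when the flag is explicitly False.
        if self.multimodal_processed is False and self.status == DocStatus.PROCESSED:
            self.status = DocStatus.PREPROCESSED


# One instance per case listed under "Business Logic".
needs_multimodal = DocProcessingStatus("a.pdf", DocStatus.PROCESSED, multimodal_processed=False)
fully_done = DocProcessingStatus("b.pdf", DocStatus.PROCESSED, multimodal_processed=True)
legacy = DocProcessingStatus("c.txt", DocStatus.PROCESSED)
failed = DocProcessingStatus("d.pdf", DocStatus.FAILED, multimodal_processed=False)
```

Only needs_multimodal changes status here; the other three keep the status they were constructed with.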

Design Decision:

  • Field uses repr=False to hide it from debug output while keeping it accessible
  • No additional private field needed - simplified single-field approach
  • Fully backward compatible with existing data
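
The effect of repr=False can be checked with a small sketch. StatusRecord is a hypothetical stand-in used only for illustration; the behaviour shown is that of the standard dataclasses module, not anything specific to this codebase.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass, field


@dataclass
class StatusRecord:
    # Hypothetical minimal record, not the project's class.
    file_path: str
    multimodal_processed: bool | None = field(default=None, repr=False)


record = StatusRecord("report.pdf", multimodal_processed=False)

# repr() omits the field...
text = repr(record)

# ...but attribute access and dict conversion still include it.
flag = record.multimodal_processed
as_dict = asdict(record)
```

Because asdict() still returns the field, code that serializes the dataclass through it would continue to persist the value, in line with the Storage note below.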

Impact Analysis

API Layer: No impact - API response models use explicit field mapping, so multimodal_processed is not exposed

Frontend: No impact - Frontend uses TypeScript types that don't include this field

Storage: No impact - Field is persisted but doesn't affect existing logic

Backward Compatibility: Fully compatible - documents without the field default to None
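
The backward-compatibility and API claims above can be illustrated with a rough sketch. StoredDocStatus, DocStatusResponse, and to_response are all hypothetical names invented for this example; the project's real storage and response models may look different.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StoredDocStatus:
    # Hypothetical stripped-down storage record (status conversion omitted here).
    file_path: str
    status: str
    multimodal_processed: bool | None = field(default=None, repr=False)


@dataclass
class DocStatusResponse:
    # Hypothetical API response model listing only the public fields.
    file_path: str
    status: str


def to_response(doc: StoredDocStatus) -> DocStatusResponse:
    # Explicit field mapping: anything not copied here never reaches the client.
    return DocStatusResponse(file_path=doc.file_path, status=doc.status)


# A record written before the field existed, and one written after.
legacy_record = {"file_path": "old.txt", "status": "processed"}
new_record = {"file_path": "scan.pdf", "status": "processed", "multimodal_processed": True}

legacy_doc = StoredDocStatus(**legacy_record)
new_doc = StoredDocStatus(**new_record)
response = to_response(new_doc)
```

The legacy record loads without error and leaves the flag at None, while the response object has no multimodal_processed attribute at all.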

Benefits

  1. Accurate Status Tracking: Documents are correctly marked as PREPROCESSED when multimodal processing is incomplete
  2. Internal Debugging: Field remains accessible for debugging and monitoring
  3. Clean Output: Hidden from repr() to avoid cluttering debug output
  4. Future-Ready: Field is available for future multimodal processing features
  5. Zero Breaking Changes: Completely backward compatible

• Use "preprocessed" to indicate multimodal processing is required
• Update DocProcessingStatus to handle status conversion automatically
• Remove multimodal_processed from DocStatus enum value
• Update UI filter logic
@danielaskdd danielaskdd merged commit 06533fd into HKUDS:main Oct 22, 2025
1 check passed
@danielaskdd danielaskdd deleted the preprocess-rayanything branch October 22, 2025 12:19