Add table properties to govern Iceberg materialized view refresh and query behavior #26755

@tdcmeehan

Expected Behavior or Use Case

Materialized views should support configurable staleness tolerance. Users can specify:

  1. Staleness tracking mode:
    - BY_REFRESH: Staleness is measured by time since last refresh
    - BY_BASE_TABLE: Staleness is measured by time since base tables were modified
  2. Staleness window: A duration (e.g., 1h, 30m) that defines acceptable staleness
  3. Stale read behavior: What happens when staleness exceeds the window:
    - FAIL: Query fails with an error
    - USE_VIEW_QUERY: Fall back to executing the view query instead of reading the data table
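A minimal sketch of how the three settings might be held and validated once read from table properties. The class name, the string-based constructor, and the window format (a positive integer followed by s, m, h, or d) are assumptions for illustration; java.time.Duration stands in for whatever duration type the implementation ends up using.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical holder for the three per-view staleness settings.
final class StalenessSettingsSketch {
    enum StalenessTracking { BY_REFRESH, BY_BASE_TABLE }
    enum StaleReadBehavior { FAIL, USE_VIEW_QUERY }

    // Assumed window format: digits followed by a single unit letter, e.g. "1h" or "30m".
    private static final Pattern WINDOW = Pattern.compile("(\\d+)([smhd])");

    final StalenessTracking tracking;
    final Duration window;
    final StaleReadBehavior behavior;

    StalenessSettingsSketch(String tracking, String window, String behavior) {
        // Enum.valueOf rejects unknown names with IllegalArgumentException.
        this.tracking = StalenessTracking.valueOf(tracking);
        this.window = parseWindow(window);
        this.behavior = StaleReadBehavior.valueOf(behavior);
    }

    // Converts a string such as "1h" or "30m" into a Duration.
    static Duration parseWindow(String text) {
        Matcher m = WINDOW.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid staleness window: " + text);
        }
        long amount = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "s": return Duration.ofSeconds(amount);
            case "m": return Duration.ofMinutes(amount);
            case "h": return Duration.ofHours(amount);
            default:  return Duration.ofDays(amount); // "d" is the only remaining unit
        }
    }
}
```

For example, new StalenessSettingsSketch("BY_REFRESH", "30m", "FAIL") would describe a view that tolerates 30 minutes since its last refresh and fails queries beyond that.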

When querying a materialized view:

  • If fully materialized (no base table changes), use the data table
  • If stale but within the staleness window, use the data table
  • If stale beyond the staleness window, apply the configured stale read behavior
  • If no staleness config is set, use the session property materialized_view_stale_read_behavior as the default
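The four cases above can be sketched as a single decision function. This is an illustration only, not the MaterializedViewRewrite code: the class, the Config holder, and the ReadPlan enum are hypothetical names, and the staleness duration is assumed to have been computed already. A staleness exactly equal to the window is treated as still within it, since the behavior applies only when staleness exceeds the window.

```java
import java.time.Duration;
import java.util.Optional;

// Hypothetical sketch of the query-time decision for a materialized view.
final class StaleReadDecisionSketch {
    enum StaleReadBehavior { FAIL, USE_VIEW_QUERY }
    enum ReadPlan { USE_DATA_TABLE, USE_VIEW_QUERY, FAIL }

    // Per-view staleness config; absent when the view sets none.
    static final class Config {
        final Duration window;
        final StaleReadBehavior behavior;

        Config(Duration window, StaleReadBehavior behavior) {
            this.window = window;
            this.behavior = behavior;
        }
    }

    static ReadPlan decide(
            boolean fullyMaterialized,
            Duration staleness,
            Optional<Config> config,
            StaleReadBehavior sessionDefault) {
        // No base table changes: the data table is always safe to read.
        if (fullyMaterialized) {
            return ReadPlan.USE_DATA_TABLE;
        }
        // No per-view config: fall back to the session property's behavior.
        if (!config.isPresent()) {
            return apply(sessionDefault);
        }
        // Stale but within the window: still read the data table.
        if (staleness.compareTo(config.get().window) <= 0) {
            return ReadPlan.USE_DATA_TABLE;
        }
        // Stale beyond the window: apply the view's configured behavior.
        return apply(config.get().behavior);
    }

    private static ReadPlan apply(StaleReadBehavior behavior) {
        return behavior == StaleReadBehavior.FAIL ? ReadPlan.FAIL : ReadPlan.USE_VIEW_QUERY;
    }
}
```

Here sessionDefault plays the role of materialized_view_stale_read_behavior and is consulted only when the view carries no staleness config.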

Presto Component, Service, or Connector

  • presto-spi: Core interfaces (MaterializedViewStatus, MaterializedViewDefinition, MaterializedViewStalenessConfig)
  • presto-main-base: Rewrite rule (MaterializedViewRewrite), session properties
  • presto-iceberg: Iceberg connector implementation
  • presto-memory: Memory connector implementation

Possible Implementation

  1. Add MaterializedViewStalenessConfig class with three fields:
    - staleReadBehavior (enum: FAIL, USE_VIEW_QUERY)
    - stalenessTracking (enum: BY_REFRESH, BY_BASE_TABLE)
    - stalenessWindow (Duration)
  2. Add runtime timestamp fields to MaterializedViewStatus:
    - lastRefreshTime: When the MV was last refreshed
    - lastBaseTableModificationTime: When base tables were last modified
  3. Update MaterializedViewRewrite rule to:
    - Check if view is fully materialized first
    - If stale, check staleness config and compare timestamps against the window
    - Apply the configured behavior when staleness exceeds tolerance
  4. Add session property materialized_view_stale_read_behavior for default behavior when no staleness config is set on the MV
  5. Update connectors to populate timestamps in getMaterializedViewStatus()
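As a rough sketch of how the two timestamps from step 2 could feed the comparison in step 3, the helper below picks the reference timestamp according to the tracking mode and measures the elapsed time. The class and method names are hypothetical. Modeling the timestamps as Optional (for connectors that do not populate them) and clamping negative durations to zero to absorb clock skew are both assumptions, not part of the proposal.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper: derive the staleness duration from the status timestamps.
final class StalenessMeasureSketch {
    enum StalenessTracking { BY_REFRESH, BY_BASE_TABLE }

    static Optional<Duration> measure(
            StalenessTracking tracking,
            Optional<Instant> lastRefreshTime,
            Optional<Instant> lastBaseTableModificationTime,
            Instant now) {
        // BY_REFRESH measures from the last refresh,
        // BY_BASE_TABLE from the last base table modification.
        Optional<Instant> reference = tracking == StalenessTracking.BY_REFRESH
                ? lastRefreshTime
                : lastBaseTableModificationTime;

        // Empty result means the connector did not report the needed timestamp.
        return reference.map(time -> {
            Duration elapsed = Duration.between(time, now);
            // Assumption: treat a timestamp in the future as zero staleness.
            return elapsed.isNegative() ? Duration.ZERO : elapsed;
        });
    }
}
```

How an empty result should be handled (for example, treating the view as stale beyond the window) is left open here, since the proposal does not specify it.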

Context

How staleness should be handled may depend on the view itself and on how it is intended to be used. Introducing table properties gives users the flexibility to configure this on a per-table basis.
