Conversation

@dain dain commented Dec 24, 2025

This PR introduces support for the VARIANT type in Trino, based on the
Apache Iceberg Variant specification.

Key points:

  • Adds a new VARIANT type to the SPI and engine
  • Implements casts to and from VARIANT, including arrays, rows, maps,
    and JSON
  • Supports dereferencing using the SQL subscript operator (usage sketched below)
  • Adds client protocol support for VARIANT
  • Adds Iceberg integration for VARIANT backed by Parquet
  • Documents VARIANT functions, operators, casts, and Iceberg constraints
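
For illustration, a hedged sketch of the new surface (table, column, and key names hypothetical; exact syntax is defined by the docs added in this PR):

```sql
-- Cast into and out of VARIANT, including from JSON
SELECT CAST(JSON '{"a": 1, "tags": ["x", "y"]}' AS VARIANT);
SELECT CAST(v AS JSON) FROM t;

-- Dereference a variant with the SQL subscript operator
SELECT v['tags'][1] FROM t;
```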

Iceberg notes:

  • VARIANT is supported only for Iceberg tables using format version 3
  • Iceberg format version 3 support is experimental
  • Only Parquet-backed Iceberg tables are supported in this change
  • Variant values are currently transported to clients in the Variant spec binary format.
  • Variant is not integrated into clients, so clients will either error or render the values as binary.

This PR does not modify existing Delta Lake variant-to-JSON behavior.

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Iceberg
* Add experimental support for the `VARIANT` type for Iceberg tables using format version 3 and Parquet format.

# SPI
* Add `VARIANT` type defined by the Iceberg specification.

@cla-bot cla-bot bot added the cla-signed label Dec 24, 2025
@github-actions github-actions bot added the docs and iceberg (Iceberg connector) labels Dec 24, 2025
@ebyhr ebyhr requested a review from martint December 24, 2025 05:53
@dain dain force-pushed the variant branch 3 times, most recently from 56b80e4 to 9336d65, December 25, 2025 04:00
@dain dain marked this pull request as ready for review December 25, 2025 04:00
@dain dain requested a review from electrum December 25, 2025 04:01
{
    public static final int FORMAT_VERSION_SUPPORT_MIN = 1;
-   public static final int FORMAT_VERSION_SUPPORT_MAX = 2;
+   public static final int FORMAT_VERSION_SUPPORT_MAX = 3;
Member

Is it safe to bump the supported Iceberg version without support for some of the other v3 features?
E.g., we don't support reading deletion vectors yet, and if other engines write those, then we may have correctness issues or errors.
It looks like we backed out changes for deletion vectors previously in #25550 because v3 support requires handling of row lineage as well.

Member

You could consider adding variant type support to the Delta Lake connector first, as we already support variant there. That way the rest of the changes in this PR won't need to wait for Iceberg v3.

Member Author

I don't know anything about Delta, but I expect it will take less than 10 minutes to integrate for someone who knows what they are doing. Basically, instead of calling the "deserialize variant to JSON" method, you call the "deserialize variant to variant" method.

Member Author

As for v3, I think a reasonable solution is to add a config property that lets you use v3 until DVs are implemented. This PR is huge, and we should check it in instead of waiting.

I'm going to take a look at adding DV support because it seems trivial, but I don't know if I'll have time to finish it.

Member Author

Actually, the DV code looks simple. I've already hacked together the read side, and should be able to quickly get the write side working.

Member

Only supporting DV is insufficient. Please also add tests for default values and row lineage.

The connector throws exceptions for unsupported types (for example, timestamp nanos and geospatial types), which makes it safer. However, there are still cases where we could violate the spec if unsupported features are neither properly supported nor explicitly rejected.

Member Author

I added a commit to the end that checks for unsupported v3 features and just fails the queries. So if you have a table with default values, v3 encryption, or DV, you get an exception. We should not need to do anything for row lineage since we do not expose the new columns (so you would get an exception if you try to use them), and the existing optimizer code checks if the version is exactly 2 (so it fails on v3).

I believe this is every feature that affects correctness, and they should all just fail.

Member

With the current state of the PR, is Trino allowed to write to v3 tables?
Does it fulfill all the writer duties?

@martint martint commented Dec 25, 2025

Thanks for putting this up. We talked about it a few weeks ago over Slack, so I wanted to capture my thoughts here.

SQL:2023 defines a rich SQL/JSON data model and a set of operations and semantics that is almost a one-to-one mapping with Iceberg Variant. Note that, importantly, SQL/JSON does not mandate a textual encoding, but only conversion functions to and from text that produce RFC 8259 JSON.


Instead of adding a new data type, we should:

  • Change Trino's JSON data type to use a more efficient encoding internally (we can adopt Variant's encoding if that's suitable enough)
  • Add support for conversions to/from the additional data types (datetime types, etc)
  • Adjust the existing JSON functions (json_query, json_value, json_table, etc.) to operate on the new encoding (see the sketch after this list)
  • Map Iceberg's and Delta Variant to Trino JSON. We already do that in Delta, but we'll need to adjust the implementation a bit to account for the change in format.
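
For illustration, a hedged sketch of that adjustment to the JSON functions (column and paths hypothetical; today these functions accept character strings rather than JSON values):

```sql
-- Existing SQL/JSON functions operating directly on the re-encoded JSON type
SELECT json_value(profile, 'lax $.name') FROM users;
SELECT json_query(profile, 'lax $.addresses[*]' WITH ARRAY WRAPPER) FROM users;
```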

This gives us:

  • A single data type that has all the semantics we need. That means less confusion for users on when to use one vs the other.
  • Brings Trino's JSON type, which predates SQL's standardization of the JSON type, up to compliance with the specification
  • All the functions and syntax that work for JSON will work out of the box, without having to define a parallel set of functions to operate on Variant.

@dain dain commented Dec 27, 2025

@martint Thanks for writing this up — I agree that SQL:2023 defines a rich SQL/JSON data model, and at the abstract model level there is significant overlap with Iceberg Variant (null / scalar / array / object / sequence).

However, I think it is important to separate data model similarity from type semantics, interoperability guarantees, and physical representation.

This PR is intentionally focused on first-class Iceberg Variant support. That implies a number of constraints that are part of the Iceberg specification and cannot be relaxed or evolved independently:

  • A closed and spec-defined scalar set
  • A fixed binary encoding that must round-trip exactly
  • Stable semantics across engines for read/write interoperability

Concretely, Iceberg Variant has a number of limitations that are part of the contract:

  • Timestamps are limited to microsecond and nanosecond precision
  • There is no stored time zone information
  • There are no interval types
  • There are no IP address types
  • There are no geometry or geography types (including Iceberg’s own geo extensions)
  • Equality and comparison semantics are explicitly defined by the Variant spec and differ from SQL equality in important ways

Because these constraints are fundamental to Variant, equating it with SQL/JSON is dangerous in both directions:

  • SQL/JSON allows (and encourages) richer and potentially extensible scalar semantics that Variant cannot support without a spec revision
  • Treating JSON as backed by Variant would implicitly restrict JSON semantics and create surprising behavior or broken round-tripping guarantees

For these reasons, representing Variant as a distinct Trino type is deliberate. It makes the interoperability boundary explicit and avoids accidental promises about future extensibility or semantic compatibility.

It is also important to note that JSON is already a trivial cast away. This PR supports casts in and out of Variant, including JSON ↔ VARIANT, and users who want JSON semantics can get them immediately via explicit casts. The hard problem here is not JSON manipulation — it is correctness and fidelity of Variant itself.
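
For example (column and path illustrative), JSON semantics are one explicit cast away:

```sql
SELECT json_extract_scalar(CAST(v AS JSON), '$.user.id') FROM t;
```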

More broadly, if Trino were to implement deeper SQL/JSON compliance, Variant is not a good internal encoding for that purpose. Variant values are expensive and difficult to construct:

  • Values must be sorted according to Variant rules
  • Sizes are encoded as prefixes, which prevents true streaming construction
  • Building a Variant often requires multiple passes over the data (this PR uses three passes)

These characteristics are acceptable and necessary for an interchange format, but they are not a good foundation for a general-purpose, extensible SQL/JSON implementation.

Refactoring Trino’s JSON type to adopt a new internal encoding and updating the full SQL/JSON function surface would be a valuable effort, but it is a large, orthogonal project with broad impact. This PR is intentionally scoped to deliver correct, interoperable Iceberg Variant support without blocking on that work.

In short, this PR is about delivering real Variant semantics with strong interoperability guarantees. SQL/JSON compliance and JSON internals can evolve independently without conflating the two or weakening either.

dain added 8 commits December 27, 2025 15:53
  • The variant implementation is the stack type for the variant type. This includes the necessary code to manipulate variants for the casts needed in Trino.
  • For JSON encoding, a variant is encoded as a binary for the metadata and a binary for the value.
  • Any type supported by variant can be cast to or from variant, including arrays, rows, and maps containing any supported type. Additionally, JSON can be cast to and from variant.
  • Variant can also be dereferenced using the dereference operator.
  • Other Iceberg formats are not supported in this change.
  • The existing Delta Lake variant-to-JSON mapping is untouched.
  • Currently, default column values, v3 file encryption, deletion vectors, and any updates to v3 tables are not supported and throw.
@martint martint commented Dec 29, 2025

> This PR is intentionally focused on first-class Iceberg Variant support. That implies a number of constraints that are part of the Iceberg specification and cannot be relaxed or evolved independently:

In that case, we should keep the type and functions confined to the Iceberg plugin.

> Concretely, Iceberg Variant has a number of limitations that are part of the contract:
> Timestamps are limited to microsecond and nanosecond precision
> There is no stored time zone information

That's fair. There would need to be some conversion/checking on write to ensure the contents have the expected precision that the connector needs. Alternatively, we could have an envelope with metadata about the contained data types so that the connector can decide whether to store as is or convert.

> There are no interval types
> There are no IP address types
> There are no geometry or geography types (including Iceberg’s own geo extensions)

SQL/JSON doesn't have those either, so that's not a problem.

> Equality and comparison semantics are explicitly defined by the Variant spec and differ from SQL equality in important ways

This could be a problem, yes. But in what ways? Can you elaborate?

> SQL/JSON allows (and encourages) richer and potentially extensible scalar semantics that Variant cannot support without a spec revision

Not really. It doesn't encourage extensions -- that's a choice we would make.

> It is also important to note that JSON is already a trivial cast away. This PR supports casts in and out of Variant, including JSON ↔ VARIANT, and users who want JSON semantics can get them immediately via explicit casts.

I could get behind that as long as we don't then go and try to replicate all the operations we have for JSON to work for variant. It's a slippery slope.

> These characteristics are acceptable and necessary for an interchange format, but they are not a good foundation for a general-purpose, extensible SQL/JSON implementation.

Not sure I understand this. Variant is not just an interchange format. If that were the purpose, we might as well use varbinary instead.

> Refactoring Trino’s JSON type to adopt a new internal encoding and updating the full SQL/JSON function surface would be a valuable effort, but it is a large, orthogonal project with broad impact.

There are only a handful of functions that expect JSON type at this point:

  • is_json_scalar, json_array_contains, json_array_length, which are pretty trivial
  • json_array_get, which is currently broken, anyway
  • json_extract, json_extract_scalar, json_format, json_parse, which could be implemented via an internal conversion to JSON first, then invoking the existing logic
  • casts, which you've already implemented

> In short, this PR is about delivering real Variant semantics with strong interoperability guarantees. SQL/JSON compliance and JSON internals can evolve independently without conflating the two or weakening either.

Databases implement variant semantics in different ways, with no interoperable semantics. In particular, Databricks', Snowflake's, and Iceberg's variants differ from each other in important ways. Other databases use different types to achieve those semantics:

Database      Variant / Equivalent
Snowflake     VARIANT
SQL Server    sql_variant
Redshift      SUPER
Databricks    VARIANT
Oracle        ANYDATA, JSON
PostgreSQL    JSON, JSONB
MySQL         JSON
SQLite        Dynamic typing
MongoDB       BSON documents
BigQuery      JSON, STRUCT

So, if by interoperable we mean interoperable with other engines that read/write Iceberg, this should be an Iceberg connector feature, not an engine-level feature.

@findepi findepi commented Dec 29, 2025

> It is also important to note that JSON is already a trivial cast away. This PR supports casts in and out of Variant, including JSON ↔ VARIANT, and users who want JSON semantics can get them immediately via explicit casts.

> I could get behind that as long as we don't then go and try to replicate all the operations we have for JSON to work for variant

As much as I don't like duplicated logic, especially if it's complex, I also don't want Trino to be less performant than it could be (and less performant than other systems).

For example, having to convert a whole variant document to the JSON type just to perform a path lookup would be wasteful.
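
To make that concrete (illustrative syntax; column v is hypothetical):

```sql
-- Wasteful: materialize the entire document as JSON to read one field
SELECT json_extract_scalar(CAST(v AS JSON), '$.user.id') FROM t;

-- Direct: dereference the variant without converting the whole value
SELECT v['user']['id'] FROM t;
```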
FWIW, Snowflake has JSON functions but doesn't have a JSON type. All the JSON functions operate on its VARIANT type, which is quite similar to (but not the same as) Parquet/Iceberg VARIANT.

My point is not really to decide on the direction here. Rather, I only want us not to make promises that we may regret in the future, and to keep the door open for future improvements around Trino's JSON and VARIANT story.
