Add VARIANT type with Iceberg Variant support (Parquet-only) #27753
Conversation
```diff
 {
     public static final int FORMAT_VERSION_SUPPORT_MIN = 1;
-    public static final int FORMAT_VERSION_SUPPORT_MAX = 2;
+    public static final int FORMAT_VERSION_SUPPORT_MAX = 3;
```
Is it safe to bump the supported Iceberg version without support for some of the other V3 features? For example, we don't support reading deletion vectors yet, and if other engines write those, we may have correctness issues or errors.
It looks like we backed out changes for deletion vectors previously in #25550 because V3 support requires handling of row lineage as well.
You could consider adding variant type support to the Delta Lake connector first, since we already support variant there. That way, the rest of the changes in this PR won't need to wait for Iceberg V3.
I don't know anything about Delta, but I expect it will take less than 10 minutes to integrate for someone who knows what they are doing. Basically, instead of calling the "deserialize variant to json" method, you call the "deserialize variant to variant" method.
As for v3, I think a reasonable solution is to add a config property that lets you use v3 until DVs are implemented. This PR is huge, and we should check it in instead of waiting.
I'm going to take a look at adding DV support because it seems trivial, but I don't know if I'll have time to finish it.
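A sketch of how the opt-in proposed above could look from the user's side; the session property name is hypothetical, while `format_version` is the existing Iceberg table property:

```sql
-- Hypothetical catalog session property gating v3 until DVs are implemented
SET SESSION iceberg.experimental_v3_tables_enabled = true;

-- With the gate open, a table could then be upgraded to v3
ALTER TABLE iceberg.demo.events SET PROPERTIES format_version = 3;
```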
Actually, the DV code looks simple. I've already hacked together the read side, and should be able to quickly get the write side working.
Only supporting DV is insufficient. Please also add tests for default values and row lineage.
The connector throws exceptions for unsupported types (for example, timestamp nanos and geospatial types), which makes it safer. However, there are still cases where we could violate the spec if unsupported features are neither properly supported nor explicitly rejected.
I added a commit to the end that checks for unsupported v3 features and just fails the queries. So if you have a table with default values, v3 encryption, or DVs, you get an exception. We should not need to do anything for row lineage, since we do not expose the new columns (so you would get an exception if you try to use them), and the existing optimizer code checks whether the version is exactly 2 (so it fails on v3).
I believe this is every feature that affects correctness, and they should all just fail.
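Illustrative examples of the failure mode described above, assuming a hypothetical v3 table `iceberg.demo.v3_table`:

```sql
-- Fails if the table uses an unsupported v3 feature such as
-- default column values, v3 encryption, or deletion vectors
SELECT * FROM iceberg.demo.v3_table;

-- Row-level updates to v3 tables are rejected outright
DELETE FROM iceberg.demo.v3_table WHERE id = 1;
```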
With the current state of the PR, is Trino allowed to write v3 tables?
Does it fulfill all the writer duties?
Thanks for putting this up. We talked about it a few weeks ago over Slack, so I wanted to capture my thoughts here. SQL:2023 defines a rich SQL/JSON data model and a set of operations and semantics that map almost 1-to-1 onto Iceberg Variant. Note that, importantly, SQL/JSON does not mandate a textual encoding, only conversion functions to and from text that produce RFC 8259 JSON.
Instead of adding a new data type, we should:
This gives us:
@martint Thanks for writing this up. I agree that SQL:2023 defines a rich SQL/JSON data model, and at the abstract model level there is significant overlap with Iceberg Variant (null / scalar / array / object / sequence). However, I think it is important to separate data model similarity from type semantics, interoperability guarantees, and physical representation.

This PR is intentionally focused on first-class Iceberg Variant support. That implies a number of constraints that are part of the Iceberg specification and cannot be relaxed or evolved independently. Concretely, Iceberg Variant has a number of limitations that are part of the contract:

Because these constraints are fundamental to Variant, equating it with SQL/JSON is dangerous in both directions:

For these reasons, representing Variant as a distinct Trino type is deliberate. It makes the interoperability boundary explicit and avoids accidental promises about future extensibility or semantic compatibility.

It is also important to note that JSON is already a trivial cast away. This PR supports casts in and out of Variant, including JSON ↔ VARIANT, and users who want JSON semantics can get them immediately via explicit casts. The hard problem here is not JSON manipulation; it is correctness and fidelity of Variant itself.

More broadly, if Trino were to implement deeper SQL/JSON compliance, Variant is not a good internal encoding for that purpose. Variant values are expensive and difficult to construct:

These characteristics are acceptable and necessary for an interchange format, but they are not a good foundation for a general-purpose, extensible SQL/JSON implementation. Refactoring Trino's JSON type to adopt a new internal encoding and updating the full SQL/JSON function surface would be a valuable effort, but it is a large, orthogonal project with broad impact. This PR is intentionally scoped to deliver correct, interoperable Iceberg Variant support without blocking on that work.

In short, this PR is about delivering real Variant semantics with strong interoperability guarantees. SQL/JSON compliance and JSON internals can evolve independently without conflating the two or weakening either.
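For illustration, the JSON ↔ VARIANT boundary mentioned above is just a cast; the `events` table and its `VARIANT` column `v` are hypothetical:

```sql
-- JSON in, VARIANT out
SELECT CAST(JSON '{"customer": {"id": 42}}' AS VARIANT);

-- VARIANT back to JSON, for users who want SQL/JSON semantics today
SELECT CAST(v AS JSON) FROM events;
```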
The variant implementation is the stack type (the in-engine representation) for the variant type. It includes the code necessary to manipulate variants for the casts Trino requires.
A variant is encoded as one binary for the metadata and one binary for the value.
Any type supported by variant can be cast to or from variant, including arrays, rows, and maps containing any supported type. Additionally, JSON can be cast to and from variant, and variant fields can be accessed using the dereference operator (see the sketch below).
Other Iceberg file formats are not supported in this change. The existing Delta Lake variant-to-JSON mapping is untouched.
Currently, default column values, v3 file encryption, deletion vectors, and any updates to v3 tables are not supported and fail with an exception.
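A sketch of the cast and dereference surface described above; the `events` table and its `VARIANT` column `v` are hypothetical, and the exact dereference syntax should be confirmed against the final change:

```sql
-- Casts into VARIANT from supported types, including nested types and JSON
SELECT CAST(JSON '{"a": 1, "b": [2, 3]}' AS VARIANT);
SELECT CAST(ARRAY[1, 2, 3] AS VARIANT);
SELECT CAST(MAP(ARRAY['x', 'y'], ARRAY[1, 2]) AS VARIANT);

-- Casts out of VARIANT
SELECT CAST(v AS JSON) FROM events;

-- Field access with the dereference operator
SELECT v.customer.id FROM events;
```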
In that case, we should keep the type and functions confined to the Iceberg plugin.
That's fair. There would need to be some conversion/checking on write to ensure the contents have the expected precision that the connector needs. Alternatively, we could have an envelope with metadata about the contained data types so that the connector can decide whether to store as is or convert.
SQL/JSON doesn't have those either, so that's not a problem.
This could be a problem, yes. But in what ways? Can you elaborate?
Not really. It doesn't encourage extensions -- that's a choice we would make.
I could get behind that as long as we don't then go and try to replicate all the operations we have for JSON to work for variant. It's a slippery slope.
Not sure I understand this. Variant is not just an interchange format. If that were the purpose, we might as well use varbinary instead.
There are only a handful of functions that expect the JSON type at this point:
Databases implement variant semantics in different ways, with no interoperable semantics. In particular, Databricks', Snowflake's, and Iceberg's variants differ from each other in important ways. Other databases use different types to achieve those semantics:
So, if by interoperable we mean interoperable with other engines that read and write Iceberg, this should be an Iceberg connector feature, not an engine-level feature.
As much as I don't like duplicated logic, especially if it's complex, I also don't want Trino to be less performant than it could be (and less performant than other systems). For example, having to convert a whole variant document to the json type in order to perform a path lookup would be wasteful. My point is not really to decide on the direction here. Rather, I only want us not to make promises that we may regret in the future, and to keep the door open for future improvements around Trino's JSON and VARIANT story.
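To make the performance concern concrete, the two access patterns side by side, again assuming a hypothetical `events` table with a `VARIANT` column `v`:

```sql
-- Wasteful pattern: convert the whole variant document to JSON,
-- then extract a single path from the result
SELECT json_extract(CAST(v AS JSON), '$.customer.id') FROM events;

-- Direct pattern: dereference the variant without a full conversion
SELECT v.customer.id FROM events;
```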




This PR introduces support for the `VARIANT` type in Trino, based on the Apache Iceberg Variant specification.
Key points:
- Adds a `VARIANT` type to the SPI and engine
- Supports casts to and from `VARIANT`, including arrays, rows, maps, and JSON
- Supports the dereference operator on `VARIANT`
- Iceberg reads and writes of `VARIANT` backed by Parquet
- Tests for `VARIANT` functions, operators, casts, and Iceberg constraints

Iceberg notes:
- `VARIANT` is supported only for Iceberg tables using format version 3
- This PR does not modify existing Delta Lake variant-to-JSON behavior.
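An end-to-end sketch of the behavior described in this PR; the catalog and schema names are hypothetical, and property names should be checked against the final change:

```sql
-- Create a format-version-3 Iceberg table with a VARIANT column
CREATE TABLE iceberg.demo.events (id BIGINT, payload VARIANT)
WITH (format_version = 3);

-- Write a variant value via a cast from JSON
INSERT INTO iceberg.demo.events
VALUES (1, CAST(JSON '{"customer": {"id": 42, "tags": ["a", "b"]}}' AS VARIANT));

-- Read it back with the dereference operator and a cast to JSON
SELECT id, payload.customer.id, CAST(payload AS JSON)
FROM iceberg.demo.events;
```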
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text: