Skip to content

Conversation

@TimothyW553
Copy link
Collaborator

@TimothyW553 TimothyW553 commented Dec 11, 2025

🥞 Stacked PR

Use this link to review incremental changes.


Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

This PR introduces a factory that returns the appropriate snapshot manager for a SparkTable. This factory checks the CatalogTable to determine if the snapshot manager should be a PathBasedSnapshotManager, UCManagedTableSnapshotManager, or another CatalogedManagedSnapshotManager in the future.

How was this patch tested?

Locally and CI.

Does this PR introduce any user-facing changes?

No.

@TimothyW553 TimothyW553 changed the title Create factory and routing [Kernel-Spark][DSv2] Add CCv2 routing to SparkTable and SnapshotManagerFactory Dec 11, 2025
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from 998c638 to c0df33e Compare December 11, 2025 04:31
@TimothyW553 TimothyW553 requested review from huan233usc and scottsand-db and removed request for scottsand-db December 11, 2025 04:31
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch 5 times, most recently from d59108e to 6538c8b Compare December 11, 2025 22:28
tdas pushed a commit that referenced this pull request Dec 15, 2025
## 🥞 Stacked PR
Use this
[link](https://github.com/delta-io/delta/pull/5655/files/07f3b5b1a0e9f9eb1794ac4b7b7ff151cc9a3b10..0b5886852bbe9871ab2c6ee5893da940ef69edcc)
to review incremental changes.
- [catalogtableutils-ccv2](#5477)
[[Files changed](https://github.com/delta-io/delta/pull/5477/files)]
- [stack/ccv2-uc-utils](#5605)
[[Files
changed](https://github.com/delta-io/delta/pull/5605/files/9ca17935c6cb19b8c94168853e18d9965130f9f1..07f3b5b1a0e9f9eb1794ac4b7b7ff151cc9a3b10)]
-
[**stack/snapshotmanager-wireframe**](#5655)
[[Files
changed](https://github.com/delta-io/delta/pull/5655/files/07f3b5b1a0e9f9eb1794ac4b7b7ff151cc9a3b10..0b5886852bbe9871ab2c6ee5893da940ef69edcc)]
-
[stack/snapshotmanager-implementation](#5677)
[[Files
changed](https://github.com/delta-io/delta/pull/5677/files/0b5886852bbe9871ab2c6ee5893da940ef69edcc..a7b46bbc8d16040a10cc61fa65876af5b9e6ed0b)]
-
[stack/snapshotmanager-factory-wireup](#5678)
[[Files
changed](https://github.com/delta-io/delta/pull/5678/files/a7b46bbc8d16040a10cc61fa65876af5b9e6ed0b..6538c8b460050666e918f8853f67ab70ccea2d92)]

---------
<!--
Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, please read our contributor guidelines:
https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md
2. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]
Your PR title ...'.
  3. Be sure to keep the PR description updated to reflect all changes.
  4. Please write your PR title to summarize what this PR proposes.
5. If possible, provide a concise example to reproduce the issue for a
faster review.
6. If applicable, include the corresponding issue number in the PR title
and link it in the body.
-->

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the pull
request title
For example: [Spark] Title of my pull request
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description

<!--
- Describe what this PR changes.
- Describe why we need the change.
 
If this PR resolves an issue be sure to include "Resolves #XXX" to
correctly link and close the issue upon merge.
-->
This PR:
- Adds `kernelUnityCatalog` as a dependency for `sparkV2` to wire in
UC-managed functionality.
- Introduces `UCManagedSnapshotManager` skeleton for UC-managed tables
that will delegate snapshot/
commit operations to UCCatalogManagedClient (methods currently TODO).

(Orange is the previous PR, green is what we've introduced in this PR.
Routing is deferred)
```mermaid
flowchart TB
    A["Spark table CatalogTable"] -- decide snapshot manager --> B{"Select snapshot manager"}
    B -- path based --> P["PathBasedSnapshotManager loadLatestSnapshot loadSnapshotAt"]
    C["UCCatalogManagedClient"] -- build UC REST client --> D["UC REST client"]
    D -- GET ratified commits and table metadata --> U["Unity Catalog server"]
    X["UCUtils extractTableInfo<br>(tableId, path, uri, token)"] -- needs UC table details --> M["UCManagedSnapshotManager<br>(UCCatalogManagedClient(UCClient))"]
    B -- UC managed --> X
    M --> C

     C:::added
     X:::extraction
     M:::added
    classDef extraction fill:#ffb347,stroke:#d97706,color:#000
    classDef added fill:#b7e1a1,stroke:#2d6a4f,color:#000
```

## How was this patch tested?
<!--
If tests were added, say they were added here. Please make sure to test
the changes thoroughly including negative and positive cases if
possible.
If the changes were tested in any way other than unit tests, please
clarify how you tested step by step (ideally copy and paste-able, so
that other reviewers can test and check, and descendants can verify in
the future).
If the changes were not tested, please explain why.
-->
Locally and CI.

## Does this PR introduce _any_ user-facing changes?
No.
<!--
If yes, please clarify the previous behavior and the change this PR
proposes - provide the console output, description and/or an example to
show the behavior difference if possible.
If possible, please also clarify if this is a user-facing change
compared to the released Delta Lake versions or within the unreleased
branches such as master.
If no, write 'No'.
-->

---------

Signed-off-by: Timothy Wang <[email protected]>
Signed-off-by: TimothyW553 <[email protected]>
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from 6538c8b to 8993eb0 Compare December 16, 2025 00:19
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package io.delta.kernel.spark.snapshot;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW are we sure this should be under io.delta.kernel? Not under io.delta.spark?

cc @raveeram-db and @tdas

Copy link
Contributor

@tdas tdas Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this making no sense. cc @huan233usc we MUST change this.
in the same vein, even the name kernel-spark does not make sense at all.

Copy link
Collaborator

@raveeram-db raveeram-db Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, let's file an issue to update this in separate PR Xin? Thanks for pointing this out @scottsand-db !

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tracking here: #5728 and will be addressed in a separate PR

@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from fb59e86 to f1ad709 Compare December 16, 2025 22:01
* </ul>
*/
@Experimental
public final class SnapshotManagerFactory {
Copy link
Contributor

@tdas tdas Dec 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it produces DeltaSnapshotManager, then it should be a DeltaSnapshotManagedFactory.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: everything here is Delta -- probably don't need to prefix things with it?

@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from f1ad709 to 7fd42af Compare December 18, 2025 18:15
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from 7fd42af to a986581 Compare December 18, 2025 21:33
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from a986581 to b3bcb24 Compare December 18, 2025 23:44
@TimothyW553 TimothyW553 force-pushed the stack/snapshotmanager-factory-wireup branch from b3bcb24 to 2410591 Compare December 19, 2025 02:00
@TimothyW553 TimothyW553 marked this pull request as ready for review December 19, 2025 03:53
raveeram-db pushed a commit that referenced this pull request Dec 19, 2025
…ing UCCatalogManagedClient (#5677)

## 🥞 Stacked PR
Use this
[link](https://github.com/delta-io/delta/pull/5677/files/b2ba4e4cb4d7c0dfaa4f0542e12ef27798c2f43c..29b67038e4ef85fd7dfde7bd2f493b1420976ebe)
to review incremental changes.
- [catalogtableutils-ccv2](#5477)
[[Files changed](https://github.com/delta-io/delta/pull/5477/files)]
-
[**stack/snapshotmanager-implementation**](#5677)
[[Files
changed](https://github.com/delta-io/delta/pull/5677/files/b2ba4e4cb4d7c0dfaa4f0542e12ef27798c2f43c..29b67038e4ef85fd7dfde7bd2f493b1420976ebe)]
-
[stack/snapshotmanager-factory-wireup](#5678)
[[Files
changed](https://github.com/delta-io/delta/pull/5678/files/da30e83d7911bd4a97cf73cc2c9c0fdd38dc8a39..cbb134e12af1b3d8b02d2e8a5f6632327f7f8f51)]
- [stack/e2e-ccv2-tests](#5734)
[[Files
changed](https://github.com/delta-io/delta/pull/5734/files/cbb134e12af1b3d8b02d2e8a5f6632327f7f8f51..19b8915e6f051ba875932f89494f2e27954cdcb9)]

---------
<!--
Thanks for sending a pull request!  Here are some tips for you:
1. If this is your first time, please read our contributor guidelines:
https://github.com/delta-io/delta/blob/master/CONTRIBUTING.md
2. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP]
Your PR title ...'.
  3. Be sure to keep the PR description updated to reflect all changes.
  4. Please write your PR title to summarize what this PR proposes.
5. If possible, provide a concise example to reproduce the issue for a
faster review.
6. If applicable, include the corresponding issue number in the PR title
and link it in the body.
-->

#### Which Delta project/connector is this regarding?
<!--
Please add the component selected below to the beginning of the pull
request title
For example: [Spark] Title of my pull request
-->

- [ ] Spark
- [ ] Standalone
- [ ] Flink
- [ ] Kernel
- [ ] Other (fill in here)

## Description
This PR implements the wireframe introduced in the previous PR of the
stack. For this specific `UCManagedTableSnapshotManager`, it delegates
snapshot loading operations to the `UCCatalogCommitClient`.

`getActiveCommitAtTime(t)` loads the latest snapshot, gets its commits,
and uses those along with
`DeltaHistoryManager.getActiveCommitAtTimestamp(t, catalogCommits, ...)`
to get the appropriate commit at time `t`.
<!--
- Describe what this PR changes.
- Describe why we need the change.
 
If this PR resolves an issue be sure to include "Resolves #XXX" to
correctly link and close the issue upon merge.
-->

## How was this patch tested?
Locally and CI.
<!--
If tests were added, say they were added here. Please make sure to test
the changes thoroughly including negative and positive cases if
possible.
If the changes were tested in any way other than unit tests, please
clarify how you tested step by step (ideally copy and paste-able, so
that other reviewers can test and check, and descendants can verify in
the future).
If the changes were not tested, please explain why.
-->

## Does this PR introduce _any_ user-facing changes?
No.
<!--
If yes, please clarify the previous behavior and the change this PR
proposes - provide the console output, description and/or an example to
show the behavior difference if possible.
If possible, please also clarify if this is a user-facing change
compared to the released Delta Lake versions or within the unreleased
branches such as master.
If no, write 'No'.
-->

---------

Signed-off-by: Timothy Wang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants