[Enhancement] Support restore/rollback sync during conversion (1/2) #569
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -73,7 +73,8 @@ public Map<String, SyncResult> syncSnapshot( | |
| internalTable, | ||
| target -> target.syncFilesForSnapshot(snapshot.getPartitionedDataFiles()), | ||
| startTime, | ||
| - snapshot.getPendingCommits())); | ||
| + snapshot.getPendingCommits(), | ||
| + snapshot.getSourceIdentifier())); | ||
| } catch (Exception e) { | ||
| log.error("Failed to sync snapshot", e); | ||
| results.put( | ||
|
|
@@ -121,7 +122,8 @@ public Map<String, List<SyncResult>> syncChanges( | |
| change.getTableAsOfChange(), | ||
| target -> target.syncFilesForDiff(change.getFilesDiff()), | ||
| startTime, | ||
| - changes.getPendingCommits())); | ||
| + changes.getPendingCommits(), | ||
| + change.getSourceIdentifier())); | ||
| } catch (Exception e) { | ||
| log.error("Failed to sync table changes", e); | ||
| resultsForFormat.add(buildResultForError(SyncMode.INCREMENTAL, startTime, e)); | ||
|
|
@@ -149,19 +151,26 @@ private SyncResult getSyncResult( | |
| InternalTable tableState, | ||
| SyncFiles fileSyncMethod, | ||
| Instant startTime, | ||
| - List<Instant> pendingCommits) { | ||
| + List<Instant> pendingCommits, | ||
| + String sourceIdentifier) { | ||
| // initialize the sync | ||
| conversionTarget.beginSync(tableState); | ||
| + // Persist the latest commit time in table properties for incremental syncs | ||
|
Contributor
Author
Here we need to move the metadata set operation earlier because it will be required during the ...
||
| + // Syncing metadata must precede the following steps to ensure that the metadata is available | ||
| + // before committing | ||
| + TableSyncMetadata latestState = | ||
| + TableSyncMetadata.of( | ||
| + tableState.getLatestCommitTime(), | ||
| + pendingCommits, | ||
| + tableState.getTableFormat(), | ||
| + sourceIdentifier); | ||
| + conversionTarget.syncMetadata(latestState); | ||
| // sync schema updates | ||
| conversionTarget.syncSchema(tableState.getReadSchema()); | ||
| // sync partition updates | ||
| conversionTarget.syncPartitionSpec(tableState.getPartitioningFields()); | ||
| // Update the files in the target table | ||
| fileSyncMethod.sync(conversionTarget); | ||
| - // Persist the latest commit time in table properties for incremental syncs. | ||
| - TableSyncMetadata latestState = | ||
| - TableSyncMetadata.of(tableState.getLatestCommitTime(), pendingCommits); | ||
| - conversionTarget.syncMetadata(latestState); | ||
| conversionTarget.completeSync(); | ||
|
|
||
| return SyncResult.builder() | ||
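The reordering in this hunk can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `RecordingTarget` is a stand-in that only records call order, not the project's real conversion target, and the choice of the file-sync step as the one that reads the metadata is an assumption made for illustration (the PR does not confirm which step needs it).

```java
import java.util.ArrayList;
import java.util.List;

final class SyncOrderSketch {

  /** Hypothetical stand-in for a conversion target; it only records the order of calls. */
  static final class RecordingTarget {
    final List<String> calls = new ArrayList<>();
    private String sourceIdentifier; // populated by syncMetadata

    void beginSync() {
      calls.add("beginSync");
    }

    void syncMetadata(String sourceIdentifier) {
      this.sourceIdentifier = sourceIdentifier;
      calls.add("syncMetadata");
    }

    void syncSchema() {
      calls.add("syncSchema");
    }

    void syncPartitionSpec() {
      calls.add("syncPartitionSpec");
    }

    void syncFiles() {
      // Assumption for illustration: a later step needs the metadata to be set already.
      if (sourceIdentifier == null) {
        throw new IllegalStateException("metadata must be synced before files");
      }
      calls.add("syncFiles");
    }

    void completeSync() {
      calls.add("completeSync");
    }
  }

  /** Runs the steps in the order used after this change: metadata first, then the rest. */
  static List<String> runSync(String sourceIdentifier) {
    RecordingTarget target = new RecordingTarget();
    target.beginSync();
    target.syncMetadata(sourceIdentifier);
    target.syncSchema();
    target.syncPartitionSpec();
    target.syncFiles();
    target.completeSync();
    return target.calls;
  }

  public static void main(String[] args) {
    System.out.println(runSync("source-commit-1"));
  }
}
```

With the previous ordering (metadata synced just before `completeSync`), the `syncFiles` step in this sketch would throw, which is the kind of dependency the move is meant to avoid.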
Contributor
Author
New data and a test case were added for backward compatibility.
Will these identifiers always be the same? If so, would it be simpler to make this a boolean method?
Yes, the source identifier stored in the target table remains unchanged. However, this API is designed to retrieve the target commit that corresponds to an input source identifier (in this feature, that would be the rollback source commit). This allows us to initiate a rollback on the target table to the retrieved target commit.
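As a rough sketch of the lookup described above, the following assumes each target commit carries the source identifier it was synced from, and searches the history from newest to oldest. The names `TargetCommit` and `findTargetCommit` are hypothetical and are not the API under review; commits written before the identifier was recorded are modelled with a `null` value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class TargetCommitLookup {

  /** Hypothetical pairing of a target-table commit with the source commit it was synced from. */
  static final class TargetCommit {
    final String targetCommitId;
    final String sourceIdentifier; // may be null for commits written before this field existed

    TargetCommit(String targetCommitId, String sourceIdentifier) {
      this.targetCommitId = targetCommitId;
      this.sourceIdentifier = sourceIdentifier;
    }
  }

  /**
   * Returns the most recent target commit whose stored source identifier matches, or empty if
   * none does. The history is assumed to be ordered from oldest to newest.
   */
  static Optional<String> findTargetCommit(List<TargetCommit> history, String sourceIdentifier) {
    for (int i = history.size() - 1; i >= 0; i--) {
      TargetCommit commit = history.get(i);
      // equals is called on the non-null argument, so legacy commits with null are skipped
      if (sourceIdentifier.equals(commit.sourceIdentifier)) {
        return Optional.of(commit.targetCommitId);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<TargetCommit> history = new ArrayList<>();
    history.add(new TargetCommit("target-1", null)); // legacy commit without an identifier
    history.add(new TargetCommit("target-2", "source-a"));
    history.add(new TargetCommit("target-3", "source-b"));
    System.out.println(findTargetCommit(history, "source-a"));
  }
}
```

Returning an `Optional` rather than a boolean is what lets the caller roll the target table back to the specific commit that was found, which is the point made in the reply.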