Conversation

@dzaninovic
Contributor

Thank you for contributing to Velero!

Please add a summary of your change

Add support for backup and restore of CSI block volumes with Kopia uploader.

Design: #6590
Interface change: #6608

Testing:

  • Automated tests
  • Filesystem and Block PVC backup and restore with md5 verification

Does your change fix a particular issue?

Fixes #6548

Please indicate you've done the following:

  • Accepted the DCO. Commits without the DCO will delay acceptance.
  • Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
  • Updated the corresponding documentation in site/content/docs/main.

@codecov

codecov bot commented Aug 21, 2023

Codecov Report

Merging #6680 (aa14f83) into main (c3ec7b7) will decrease coverage by 0.13%.
Report is 6 commits behind head on main.
The diff coverage is 55.00%.

@@            Coverage Diff             @@
##             main    #6680      +/-   ##
==========================================
- Coverage   60.78%   60.66%   -0.13%     
==========================================
  Files         245      249       +4     
  Lines       26256    26464     +208     
==========================================
+ Hits        15961    16055      +94     
- Misses       9164     9266     +102     
- Partials     1131     1143      +12     
Files                                       Coverage Δ
pkg/datapath/file_system.go                 32.14% <100.00%> (-2.03%) ⬇️
pkg/exposer/generic_restore.go              71.31% <100.00%> (-0.12%) ⬇️
pkg/install/daemonset.go                    100.00% <100.00%> (ø)
pkg/uploader/provider/kopia.go              83.44% <100.00%> (-0.64%) ⬇️
pkg/exposer/csi_snapshot.go                 79.92% <75.00%> (-0.64%) ⬇️
pkg/controller/data_upload_controller.go    68.44% <85.71%> (+0.08%) ⬆️
pkg/install/resources.go                    76.28% <0.00%> (-0.92%) ⬇️
pkg/install/deployment.go                   88.46% <0.00%> (-1.54%) ⬇️
pkg/exposer/host_path.go                    80.00% <45.45%> (+3.80%) ⬆️
pkg/uploader/kopia/snapshot.go              82.86% <81.81%> (+0.93%) ⬆️
... and 5 more

... and 8 files with indirect coverage changes

@dzaninovic
Contributor Author

I fixed the reported issues and rebased.

@reasonerjt
Contributor

Could you also add more unit tests? The coverage has dropped by 0.2%.

@dzaninovic
Contributor Author

Could you also add more unit tests? The coverage has dropped by 0.2%.

I will work on increasing the coverage.

@dzaninovic
Contributor Author

I added more unit tests.

@dzaninovic
Contributor Author

CI Run failure seems unrelated:
2023-08-24T14:00:17.0705891Z [2023-08-24T14:00:17.069Z] ['error'] There was an error running the uploader: Error uploading to https://codecov.io: Error: There was an error fetching the storage URL during POST: 404 - {'detail': ErrorDetail(string='Unable to locate build via Github Actions API. Please upload with the Codecov repository upload token to resolve issue.', code='not_found')}
2023-08-24T14:00:17.0709047Z [2023-08-24T14:00:17.070Z] ['verbose'] The error stack is: Error: Error uploading to https://codecov.io: Error: There was an error fetching the storage URL during POST: 404 - {'detail': ErrorDetail(string='Unable to locate build via Github Actions API. Please upload with the Codecov repository upload token to resolve issue.', code='not_found')}
2023-08-24T14:00:17.0710931Z at main (/snapshot/repo/dist/src/index.js)
2023-08-24T14:00:17.0711849Z at processTicksAndRejections (node:internal/process/task_queues:96:5)
2023-08-24T14:00:17.0713279Z [2023-08-24T14:00:17.070Z] ['verbose'] End of uploader: 1579 milliseconds
2023-08-24T14:00:17.0871109Z ##[error]Codecov: Failed to properly upload: The process '/home/runner/work/_actions/codecov/codecov-action/v3/dist/codecov' failed with exit code 255

@dzaninovic
Contributor Author

I rebased and squashed the commits.

I will be on vacation next week, so if anything else needs to be done, I can do it after I return, or @draghuram can assign somebody else to continue while I am away.

@anshulahuja98
Collaborator

@dzaninovic / @draghuram - will this still support Kopia's dedup, or will each backup be a full copy?

@draghuram
Contributor

@anshulahuja98, Kopia dedup will occur just like for regular files. As far as Kopia is concerned, the device is being presented as a file.

@anshulahuja98
Collaborator

@anshulahuja98, Kopia dedup will occur just like for regular files. As far as Kopia is concerned, the device is being presented as a file.

What about dedup performance? Should that also be the same, given that we are streaming the device as a file?

@shawn-hurley
Contributor

@anshulahuja98 Dedup happens at a layer below these changes. This means that everything else, from Kopia's perspective, will be the same as for a normal filesystem, IIUC.

@draghuram
Contributor

That is correct. There should be no difference in behavior, including performance, between devices and files. It is possible to optimize device backup further (using direct I/O, etc.), but I think we decided not to do that in this iteration.

@anshulahuja98
Collaborator

Got it, thanks for the inputs @shawn-hurley / @draghuram.
One last question to connect the dots for my understanding: is the next best thing after this approach a CBT-based (changed block tracking) diff? That would improve dedup further by copying and storing only the diffs. Is the current hope for that mainly the Data Protection SIG's CBT design, or is there another industry standard in the K8s context that other vendors use?

I am also trying to confirm my understanding that we'll have to write another uploader in Velero to interface with the CBT APIs and then upload the diffs.

@Lyndon-Li
Contributor

@anshulahuja98
Yes, we will create a block data uploader to support CBT as well as other features; see the discussion here.

Besides the Kubernetes SIG's CBT mechanism, we may also integrate the block uploader directly with storage/computing platform APIs, which could solve problems such as CSI being unavailable or inefficient.

@anshulahuja98
Collaborator

@anshulahuja98 Yes, we will create a block data uploader to support CBT as well as other features; see the discussion here.

Besides the Kubernetes SIG's CBT mechanism, we may also integrate the block uploader directly with storage/computing platform APIs, which could solve problems such as CSI being unavailable or inefficient.

got it, thanks for sharing this @Lyndon-Li

@dzaninovic
Contributor Author

I rebased, resolved conflicts and retested.

At @sseago's request in the last Velero community meeting, I tested block PVC backup without the mover, and the PVC was skipped.

time="2023-09-07T14:47:41Z" level=warning msg="volume vol is declared in pod block1/pod-raw but not mounted by any container, skipping" backup=velero/backup2 logSource="pkg/podvolume/backupper.go:270" name=pod-raw namespace=block1 resource=pods

The code only checks for mounted volumes, not for attached devices:

	for _, container := range pod.Spec.Containers {
		for _, volumeMount := range container.VolumeMounts {
			mountedPodVolumes.Insert(volumeMount.Name)
		}
	}
...
	for _, volumeName := range volumesToBackup {
...
		// volumes that are not mounted by any container should not be backed up, because
		// its directory is not created
		if !mountedPodVolumes.Has(volumeName) {
			msg := fmt.Sprintf("volume %s is declared in pod %s/%s but not mounted by any container, skipping", volumeName, pod.Namespace, pod.Name)
			log.Warn(msg)
			pvcSummary.addSkipped(volumeName, msg)
			continue
		}

Do we want to try to support this code path in this pull request?

@sseago
Collaborator

sseago commented Sep 7, 2023

@dzaninovic Oh, right, I'd forgotten about that. We actually had someone try to use fs backup for a block volume and hit that exact same error message (it didn't have the block mode support, since this was velero 1.11, so it still wouldn't have worked) -- but yes, if we can support the fs-backup code path, that would be great. I don't know whether the code you pasted above is the only thing needed to fix this, but it may be worth trying. I'd say that if you can get it working with fs-backup with minimal extra work, then feel free to do it here -- if it's a lot more work, then a follow-on PR for that later might be better.

Development

Successfully merging this pull request may close these issues.

Support backup on volumeMode block via Data mover (Kopia)

10 participants