Skip to content

Conversation

@Lyndon-Li
Copy link
Contributor

Fix issue #8803, use deterministic name to create backupRepository

@codecov
Copy link

codecov bot commented Mar 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 59.57%. Comparing base (d086cb2) to head (3c5ebba).
Report is 6 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8808   +/-   ##
=======================================
  Coverage   59.56%   59.57%           
=======================================
  Files         370      370           
  Lines       40229    40228    -1     
=======================================
+ Hits        23964    23966    +2     
+ Misses      14766    14764    -2     
+ Partials     1499     1498    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Lyndon-Li Lyndon-Li marked this pull request as ready for review March 24, 2025 08:41
@Lyndon-Li Lyndon-Li requested review from blackpiglet and sseago March 24, 2025 08:42
@Lyndon-Li
Copy link
Contributor Author

Lyndon-Li commented Mar 24, 2025

@sseago @shubham-pampattiwar Please help to review this PR.

For the existing code, there is a comment in function EnsureRepo:

	// It's only safe to have one instance of this method executing concurrently for a
	// given BackupRepositoryKey, so synchronize based on that. It's fine
	// to run concurrently for *different* BackupRepositoryKey. If you had 2 goroutines
	// running this for the same inputs, both might find no BackupRepository exists, then
	// both would create new ones for the same BackupRepositoryKey.
	//
	// This issue could probably be avoided if we had a deterministic name for
	// each BackupRepository and we just tried to create it, checked for an
	// AlreadyExists err, and then waited for it to be ready. However, there are
	// already repositories in the wild with non-deterministic names (i.e. using
	// GenerateName) which poses a backwards compatibility problem.

However, after discussion, we don't see a problem of using a deterministic name.
So for this PR, we use a deterministic name for each backupRepository CR so as to support multiple thread calling by leveraging API server's optimistic lock mechanism.

@Lyndon-Li
Copy link
Contributor Author

@msfrucht Please also have a look.

@shubham-pampattiwar
Copy link
Collaborator

cc: @msfrucht

@msfrucht
Copy link
Contributor

msfrucht commented Mar 24, 2025

However, after discussion, we don't see a problem of using a deterministic name. So for this PR, we use a deterministic name for each backupRepository CR so as to support multiple thread calling by leveraging API server's optimistic lock mechanism.

@msfrucht Please also have a look.

I agree. Any existing BackupRepository will be found using the BackupRepositoryKey. And any new BackupRepository can be created deterministically and allow for ok on Conflict since that implies the named format is ok.

I approve with a caveat, though it is a pre-existing issue.

fmt.Sprintf("%s-%s-%s-", key.VolumeNamespace, key.BackupLocation, key.RepositoryType)

The BackupLocation is the name of the BackupStorageLocation. This is why in the bug report the name of the BSL is a uuid. Internally, we generate them as such to avoid this issue.

The name of the BackupRepository object will be at least the length of the name of the BackupStorageLocation. If the BackupStorageLocation name is set to the maximum allowed for a Kubernetes object. The object creation will fail.

This isn't the only time we've seen. I fully admit to making this error as well. And why our internal project has a standing policy that object names should not be concatenated to create new object names. With the exception for object names with a 64-character maximum like Service due to DNS format.

Because this is a pre-existing problem in the design, I do not see it as blocking, just a note that still needs further refinement.

The previous format with generateName also had this issue and was slightly worse due to the additional suffix.

@Lyndon-Li
Copy link
Contributor Author

The name of the BackupRepository object will be at least the length of the name of the BackupStorageLocation. If the BackupStorageLocation name is set to the maximum allowed for a Kubernetes object. The object creation will fail.

Please open another issue for this problem. We have multiple areas in the code with this problem, not only for backupRepository CRs.
There is no easy fix, for some legacy code, the impact may be very large.

@Lyndon-Li
Copy link
Contributor Author

object names should not be concatenated to create new object names

Fully agreed. For the new code, let's keep this in mind in coding and review.

@Lyndon-Li Lyndon-Li merged commit 883e3e4 into vmware-tanzu:main Mar 25, 2025
42 checks passed
@msfrucht
Copy link
Contributor

The name of the BackupRepository object will be at least the length of the name of the BackupStorageLocation. If the BackupStorageLocation name is set to the maximum allowed for a Kubernetes object. The object creation will fail.

Please open another issue for this problem. We have multiple areas in the code with this problem, not only for backupRepository CRs. There is no easy fix, for some legacy code, the impact may be very large.

#8815

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants