Introduce WithLooseOnApply option for PatchAdmissionStatus #7246
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue canceled.
mimowo left a comment:
LGTM overall. Please also open another "test" PR which sets the feature gate to "true" so that we can make sure it works. We could then merge that PR as a follow-up.
 err := workload.PatchAdmissionStatus(ctx, s.client, origWorkload, s.clock, func() (*kueue.Workload, bool, error) {
 	return newWorkload, true, nil
-}, workload.WithLoose())
+}, workload.WithLooseOnApply())
I think this should also be used here:
kueue/pkg/scheduler/scheduler.go, line 317 in 8749fd8:
}, workload.WithLoose()); err != nil {
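In diff form, the suggested swap at that call site is the same one-line change as in the diff above (only the quoted line is shown; the rest of the call on line 317 is omitted here):

-}, workload.WithLoose()); err != nil {
+}, workload.WithLooseOnApply()); err != nil {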
Do we have an integration or e2e test which covers that?
I am working on unit tests, but sure, I can work on an integration test as well.
Maybe we already have an integration test; you can check by enabling the feature gate and, say, injecting a panic in that place (as sketched below).
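Purely as an illustration of that trick (stand-in names, not Kueue code, and not something to commit), guarding a temporary panic behind the condition of interest makes any test that reaches the call site fail loudly:

// Runnable stand-alone sketch of the coverage-check trick: callSite stands in
// for the scheduler's PatchAdmissionStatus call site, and the boolean stands in
// for the WorkloadRequestUseMergePatch feature gate.
package main

import "fmt"

func callSite(gateEnabled bool) {
    if gateEnabled {
        // Temporary, local-only line; any integration test reaching this path
        // with the gate enabled now fails loudly, proving the path is covered.
        panic("coverage check: call site reached with the gate enabled")
    }
    fmt.Println("gate disabled: patch proceeds as before")
}

func main() {
    callSite(false) // no panic; flip to true to see the coverage probe trip
}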
I think I found one that fails from time to time because of the change:
Scheduler Fair Sharing Suite: [It] Scheduler when ClusterQueue head has inadmissible workload sticky workload deleted, next workload can admit
This one flakes not only on this PR: #7250
I was asking whether we already have an integration test which exercises the other use of PatchAdmissionStatus in schedule:
kueue/pkg/scheduler/scheduler.go, line 317 in 8749fd8:
}, workload.WithLoose()); err != nil {
And in my first comment I also referenced that other place to update for consistency, as both calls are in the scheduler.
Nice, does it flake if you run the tests in a loop with the changes in this PR, plus enabling the feature gate?
Testing now
Please summarize the results of the testing.
A successful run, with the flag enabled, of:
ginkgo --json-report ./ginkgo.report -focus "SchedulerWithWaitForPodsReady" -r --race --procs=4 --repeat=20
No failures.
Awesome!
Please also add a release note.
Branch force-pushed: 952183b → 3013cf0 → 6a7a46c → 0543121 → d64dff1
Thanks 👍
@mimowo: once the present PR merges, I will cherry-pick it on top of
In response to this:
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: mimowo, mszadkow
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
LGTM label has been added. Git tree hash: a15d5e4b1bb22cc8643db3e0d1042f3cca620f39
We need to adjust the release note to be user-oriented.
/release-note-edit
/remove-kind feature
/release-note-edit
/cherrypick release-0.14
@mimowo: new pull request created: #7279
In response to this:
@mimowo: new pull request could not be created: failed to create pull request against kubernetes-sigs/kueue#release-0.14 from head k8s-infra-cherrypick-robot:cherry-pick-7246-to-release-0.14: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"A pull request already exists for k8s-infra-cherrypick-robot:cherry-pick-7246-to-release-0.14."}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#create-a-pull-request","status":"422"}
In response to this:
@mszadkow please prepare the cherrypick manually
What type of PR is this?
/kind feature
What this PR does / why we need it:
As stated here, to lower the risk to a minimum we introduce an option that preserves the strict setting for SSA. This keeps the functionality unchanged when WorkloadRequestUseMergePatch is disabled; when it is enabled, the scheduler is stricter on the patch and retries on the next scheduling cycle. A rough sketch of the option pattern follows.
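As a self-contained illustration of the shape such an option can take, here is a minimal Go sketch of the functional-options pattern. None of it is Kueue's actual implementation: patchOptions, PatchOption, resolve, and the field names are hypothetical, and the per-path semantics in the comments are an interpretation of the description above.

// Minimal sketch of the functional-options pattern an option like
// WithLooseOnApply can follow. Hypothetical names and semantics; the
// authoritative behavior lives in Kueue's workload package.
package main

import "fmt"

// patchOptions collects the settings resolved from the variadic options.
type patchOptions struct {
    // loose relaxes conflict handling unconditionally (what WithLoose conveys).
    loose bool
    // looseOnApplyOnly restricts the relaxed handling to the SSA (apply) path;
    // with the WorkloadRequestUseMergePatch gate enabled the patch stays strict,
    // so a conflict surfaces and the scheduler retries on the next cycle.
    looseOnApplyOnly bool
}

// PatchOption mutates patchOptions (the usual Go functional-options idiom).
type PatchOption func(*patchOptions)

func WithLoose() PatchOption {
    return func(o *patchOptions) { o.loose = true }
}

func WithLooseOnApply() PatchOption {
    return func(o *patchOptions) { o.looseOnApplyOnly = true }
}

// resolve folds the options into a settings struct, as a patch helper would.
func resolve(opts ...PatchOption) patchOptions {
    var resolved patchOptions
    for _, opt := range opts {
        opt(&resolved)
    }
    return resolved
}

func main() {
    fmt.Printf("WithLoose:        %+v\n", resolve(WithLoose()))
    fmt.Printf("WithLooseOnApply: %+v\n", resolve(WithLooseOnApply()))
}

One appeal of adding a new option rather than changing an existing argument is that all current call sites keep compiling unchanged, and only the call sites that opt in pick up the new behavior.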
Which issue(s) this PR fixes:
Relates to #7035
Special notes for your reviewer:
Does this PR introduce a user-facing change?