Skip to content

Conversation

@LY-today
Copy link
Contributor

@volcano-sh-bot volcano-sh-bot requested review from hwdef and k82cn June 13, 2025 02:19
@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jun 13, 2025
@LY-today
Copy link
Contributor Author

@hwdef please check

@volcano-sh-bot volcano-sh-bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 13, 2025
@LY-today
Copy link
Contributor Author

@googs1025 Do you have any other questions? It seems that my changes have a great impact on the testing of various links, and solving the testing problems seems to be the difficulty

@googs1025
Copy link
Member

can you squash commits to one?

@LY-today
Copy link
Contributor Author

can you squash commits to one?

Maybe wait until ci is fixed before merging?

@LY-today
Copy link
Contributor Author

LY-today commented Jun 13, 2025

@hwdef @googs1025 @k82cn There are many unit test issues involved here, and I personally cannot accurately measure the business logic of each unit test, so if you approve of the changes in this PR, can you help me complete the unit test repair?

@googs1025
Copy link
Member

@hwdef @googs1025 @k82cn There are many unit test issues involved here, and I personally cannot accurately measure the business logic of each unit test, so if you approve of the changes in this PR, can you help me complete the unit test repair?

Just a guess, maybe you validated the node status(ready or not). Maybe some unit tests didn't set the node status? You can look at this part

@googs1025
Copy link
Member

I still think this method is too complicated. . . What we need is the logic to filter out NotReady or unschedulable. I think this should be very simple. In fact, we don’t care about these fields in the status list at all.

@googs1025
Copy link
Member

/cc @Monokaix @JesseStutler

@LY-today
Copy link
Contributor Author

LY-today commented Jun 16, 2025

I still think this method is too complicated. . . What we need is the logic to filter out NotReady or unschedulable. I think this should be very simple. In fact, we don’t care about these fields in the status list at all.

If I analyze this PR from the perspective of implementation difficulty, I can certainly start from luxury and start from thriftiness, so that the test will not be very complicated. But I am worried that there will still be boundary issues when running it online. Referring to k8s describe node is the best practice I know so far.

When resource calculation is abnormal, if the log is consistent with the describe node output most commonly used by users, it will be easier for users to understand.

@LY-today
Copy link
Contributor Author

LY-today commented Jun 16, 2025

/cc @Monokaix @JesseStutler @hwdef please,check,This part of the logic has a great impact on the calculation of quota. We have encountered many related problems online. Please help follow up.

@JesseStutler
Copy link
Member

/cc

@Monokaix Monokaix requested a review from Copilot June 16, 2025 13:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a helper to determine node readiness and integrates it into the scheduler’s session logic, along with corresponding tests and adjustments to test utilities.

  • Add a default NodeReady condition in the test node builder
  • Introduce nodeIsNotReady in framework/util.go and its test suite
  • Skip unready nodes during session setup in session.go

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
pkg/scheduler/util/test_utils.go Set a NodeReady condition by default in BuildNode
pkg/scheduler/framework/util_test.go New tests for the nodeIsNotReady function
pkg/scheduler/framework/util.go Add nodeIsNotReady utility and import slices
pkg/scheduler/framework/session.go Skip nodes flagged as not ready in openSession
Comments suppressed due to low confidence (2)

pkg/scheduler/framework/util.go:268

  • [nitpick] The function name nodeIsNotReady suggests it returns true for an unready node, but it actually returns true for ready nodes. Consider renaming it to nodeIsReady or inverting its return logic to match its name.
func nodeIsNotReady(obj *v1.Node) bool {

pkg/scheduler/framework/util_test.go:11

  • [nitpick] Test name TestGetNodeStatus doesn’t match the tested function nodeIsNotReady. Rename the test to something like TestNodeIsReady or TestNodeIsNotReady for clarity.
func TestGetNodeStatus(t *testing.T) {

@LY-today
Copy link
Contributor Author

@JesseStutler @Monokaix please,check

Copy link
Member

@JesseStutler JesseStutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
Much better, thanks

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2025
@JesseStutler
Copy link
Member

@LY-today Please squash your commits into one, thanks

@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 17, 2025
@LY-today
Copy link
Contributor Author

@LY-today Please squash your commits into one, thanks

done, compressed into one line, the upstream master has changed, and merge is performed

@LY-today LY-today force-pushed the bug-fix branch 3 times, most recently from b69c29f to 3641512 Compare June 18, 2025 03:20
@LY-today
Copy link
Contributor Author

@JesseStutler please check

@Monokaix
Copy link
Member

Please merge to one commit.

@LY-today
Copy link
Contributor Author

Please merge to one commit.

@Monokaix done

@Monokaix
Copy link
Member

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Monokaix

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 18, 2025
Copy link
Member

@JesseStutler JesseStutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 18, 2025
@volcano-sh-bot volcano-sh-bot merged commit aa43923 into volcano-sh:master Jun 18, 2025
18 checks passed
@LY-today LY-today changed the title fix: add getNodeStatus fix: Node resource topology awareness, stop scheduling and notReady Jun 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants