Add sra policy for ResourceStrategyFit Plugin #4454
Summary of Changes
Hello @XbaoWu, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a Scarce Resource Avoidance (SRA) policy within the ResourceStrategyFit plugin to enhance task scheduling strategies. It aims to prevent starvation for tasks requiring critical resources and improve overall resource utilization by allowing more intelligent placement of workloads, including the migration of existing proportional resource logic and the addition of a new retention policy.
Highlights
- Scarce Resource Avoidance (SRA): Implemented a new Scarce Resource Avoidance (SRA) policy within the `ResourceStrategyFit` plugin to optimize resource utilization and prevent task starvation, particularly for critical resources like GPUs.
- Proportional Policy Migration: The `proportional` predicate logic has been migrated from the `predicates` plugin to the `resource-strategy-fit` plugin, streamlining the codebase and centralizing resource-related scheduling policies.
- New Retention Policy: Introduced a `retention` SRA policy, enabling users to define weights for scarce resources. This policy ensures that tasks not requiring these resources are preferentially scheduled away from nodes possessing them, thereby retaining critical resources for specialized workloads.
- Configuration Updates: Updated the scheduler configuration to support the new SRA policies, introducing `sra.policy` to select between `proportional` and `retention` strategies, along with specific arguments like `sra.proportional.*` and `sra.retention.*`.
- Documentation Enhancements: Added comprehensive design documentation for the new SRA feature, including detailed explanations of both `retention` and `proportional` policies, configuration examples, and best practices for AI-specific scheduling scenarios.
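To make the retention idea in the highlights concrete, here is a minimal Go sketch of how a weighted "avoid scarce nodes" score could be computed. It is an illustration only, not the code in `sra.go`: the function name `retentionScore`, the constant `maxScore`, the map-based inputs, and the scoring formula itself are all assumptions made for this example.

```go
package main

import "fmt"

// maxScore is an assumed upper bound for a node score (hypothetical value).
const maxScore = 100.0

// retentionScore is a hypothetical scoring function, not the plugin's API.
// weights:      scarce resource name -> user-defined weight
// nodeHas:      scarce resource name -> true if the node offers that resource
// taskRequests: resource name -> true if the task requests that resource
// The score drops for every weighted scarce resource that the node offers
// but the task does not request, so such tasks prefer other nodes.
func retentionScore(weights map[string]float64, nodeHas, taskRequests map[string]bool) float64 {
	var total, penalty float64
	for name, w := range weights {
		if w <= 0 {
			continue // ignore non-positive weights in this sketch
		}
		total += w
		if nodeHas[name] && !taskRequests[name] {
			penalty += w
		}
	}
	if total == 0 {
		return maxScore // no scarce resources configured: no preference
	}
	return maxScore * (1 - penalty/total)
}

func main() {
	weights := map[string]float64{"nvidia.com/gpu": 1}
	gpuNode := map[string]bool{"nvidia.com/gpu": true}
	cpuNode := map[string]bool{}
	cpuTask := map[string]bool{"cpu": true}

	fmt.Println(retentionScore(weights, gpuNode, cpuTask)) // 0
	fmt.Println(retentionScore(weights, cpuNode, cpuTask)) // 100
}
```

Under these assumptions a CPU-only task scores lowest on the GPU node and highest on the CPU-only node, while a task that does request the GPU is not penalized for landing on a GPU node.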
Code Review
This pull request introduces a Scarce Resource Avoidance (SRA) policy for the ResourceStrategyFit plugin, improving resource utilization and refactoring proportional logic from the predicates plugin. The review focuses on documentation clarity, a correctness issue in scoring, and code duplication.
@JesseStutler @Monokaix @LY-today The SRA-related content has been adjusted. Please take a look when you have time, thank you :)
/cc @Monokaix
@JesseStutler @LY-today The log content has been slightly adjusted, and the SRA configuration items have been changed to a struct.
@Monokaix The configuration structure has been updated, the middle semantic layer has been removed, and only
Pull Request Overview
This PR adds Scarce Resource Avoidance (SRA) policy to the resource-strategy-fit plugin to prevent task starvation for critical resources and enhance resource utilization. The proportional policy processing logic is also moved from the predicates plugin to the resource-strategy-fit plugin to keep the predicates plugin clean.
- Implements SRA policy with configurable weights for scarce resources
- Moves proportional policy from predicates plugin to resource-strategy-fit plugin
- Adds comprehensive tests for the new SRA functionality
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/scheduler/plugins/util/util.go | Adds ShouldAbort utility function for status checking |
| pkg/scheduler/plugins/resource-strategy-fit/sra.go | Implements core SRA scheduling logic and scoring algorithm |
| pkg/scheduler/plugins/resource-strategy-fit/sra_test.go | Comprehensive test coverage for SRA functionality |
| pkg/scheduler/plugins/resource-strategy-fit/resource_strategy_fit.go | Integrates SRA and proportional policies into main plugin |
| pkg/scheduler/plugins/resource-strategy-fit/resource_strategy_fit_test.go | Updates existing tests to reflect structural changes |
| pkg/scheduler/plugins/resource-strategy-fit/proportional.go | Moves proportional logic from predicates plugin |
| pkg/scheduler/plugins/resource-strategy-fit/proportional_test.go | Updates package declaration for moved code |
| pkg/scheduler/plugins/predicates/predicates.go | Removes proportional logic and uses util.ShouldAbort |
| docs/design/resource-strategy-fit-scheduling.md | Documents SRA and proportional policies |
| docs/design/proportional.md | Updates configuration examples |
Ok, please also resolve the code conflict.
Signed-off-by: wuxiaobao <[email protected]>
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: Monokaix
JesseStutler left a comment
/lgtm
What type of PR is this?
/kind feature
What this PR does / why we need it:
Prevent task starvation for those requesting critical resources, enhance the utilization of important resources, and thereby achieve more effective task scheduling strategies.
To keep the `predicates` plugin clean, the processing logic of `predicates.proportional` is moved to the `resource-strategy-fit` plugin.
Which issue(s) this PR fixes:
Fixes #4244
Special notes for your reviewer:
No additional assistance messages
Does this PR introduce a user-facing change?