-
Notifications
You must be signed in to change notification settings - Fork 8k
refactor: Centralize and improve model fallback handling #7634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary of Changes
Hello @abhipatel12, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly refactors the model fallback mechanism to prepare the architecture for future model routing strategies, such as an "auto" mode. The changes address previous issues of duplicated logic, tight coupling between the CLI and Core packages, and an unclear communication contract, leading to a more robust and maintainable system.
Highlights
- Centralized Fallback Logic: Fallback orchestration is now consolidated into packages/core/src/fallback/handler.ts, eliminating duplicated logic previously found in GeminiClient and GeminiChat.
- Explicit Intent Contract for UI: The UI no longer directly mutates Core state. Instead, Core queries the UI for user intent via a new FallbackIntent type ('retry', 'stop', 'auth'), replacing the ambiguous boolean return.
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is an excellent pull request that significantly improves the model fallback architecture. The centralization of the fallback logic into packages/core/src/fallback/handler.ts
and the introduction of an explicit FallbackIntent
contract are great architectural decisions. This decouples the UI from core state management, making the system more robust and easier to maintain. The code changes are clean, well-structured, and consistent across all packages. The new tests are comprehensive and effectively validate the new intent-driven flow. I have reviewed the changes thoroughly and found no issues.
Code Coverage Summary
CLI Package - Full Text Report
Core Package - Full Text Report
For detailed HTML reports, please see the 'coverage-reports-22.x-ubuntu-latest' artifact from the main CI run. |
e2d1aa1
to
7b79bf4
Compare
Size Change: +776 B (+0.01%) Total Size: 12.5 MB ℹ️ View Unchanged
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a significant and well-executed refactoring of the model fallback mechanism. It successfully centralizes the fallback logic into a new core/fallback
module, introduces an intent-driven flow, and simplifies the UI components, particularly AppContainer.tsx
. The new useQuotaAndFallback
hook is a great addition that encapsulates UI-related fallback logic. The changes align with the goals stated in the description, leading to a more decoupled and maintainable architecture. I've identified a couple of high-severity issues where the implementation deviates from the new design principles, which could lead to bugs or confusion. Once these are addressed, this will be an excellent improvement.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request is a great refactoring of the model fallback mechanism, centralizing logic and improving the architecture with an intent-driven approach. The new useQuotaAndFallback
hook effectively encapsulates UI concerns. However, I've found a critical race condition in the new hook that could cause API calls to hang indefinitely under specific circumstances. Please see the detailed comment for the issue and a suggested fix.
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This is an excellent refactoring that significantly improves the model fallback architecture. The introduction of an intent-driven flow and the centralization of logic in handleFallback
and the useQuotaAndFallback
hook greatly enhance clarity and decoupling between the core and UI packages. The code is well-structured, and the new tests are comprehensive. I have one suggestion to further improve the architecture by removing a polling mechanism, but overall, this is a very strong pull request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks solid. A few minor comments then approved. Please be sure to land the fixes as a separate commit.
Thanks for adding all of the extra testing as part of the refactor.
packages/core/src/config/config.ts
Outdated
@@ -281,7 +276,7 @@ export class Config { | |||
name: string; | |||
extensionName: string; | |||
}>; | |||
flashFallbackHandler?: FlashFallbackHandler; | |||
fallbackHandler?: FallbackHandler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optional nit: like the handler doesn't have the name Flash in it anymore but worry it might be a little confusing now. should this be fallbackModelHandler to clarify what kind of fallback it is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Made it clearer
currentModel, | ||
fallbackModel, | ||
undefined, | ||
// Assert that the Flash model was used, not the requested model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this comment as it is low value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mockConfig = { | ||
getSessionId: () => 'test-session-id', | ||
getTelemetryLogPromptsEnabled: () => true, | ||
getUsageStatisticsEnabled: () => true, | ||
getDebugMode: () => false, | ||
getContentGeneratorConfig: () => ({ | ||
authType: 'oauth-personal', | ||
getContentGeneratorConfig: vi.fn().mockReturnValue({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: why is this now mockReturnValue when it wasn't before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We needed it so that we could test the fallback integration. These things weren't being tested previously which is why it was needed. We could use a fake Config instead, but would need to tests more complex by calling the real refreshAuth
, etc.
return false; | ||
} | ||
} catch (handlerError) { | ||
console.warn('Fallback UI handler failed:', handlerError); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this just .warn instead of .error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoop, copypasta. This was a direct copy of the existing try catch from the client.ts code. Changed to .error
|
||
vi.mock('node:fs'); | ||
|
||
describe('Flash Fallback Integration', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a new issue but for discussion, should this be renamed to flashFallback.test.ts
seems much more like a unit test than a regular integration test with lots of mocking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. Done
8cd3600
to
a37673a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review. Made the changes!
packages/core/src/config/config.ts
Outdated
@@ -281,7 +276,7 @@ export class Config { | |||
name: string; | |||
extensionName: string; | |||
}>; | |||
flashFallbackHandler?: FlashFallbackHandler; | |||
fallbackHandler?: FallbackHandler; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Made it clearer
currentModel, | ||
fallbackModel, | ||
undefined, | ||
// Assert that the Flash model was used, not the requested model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
mockConfig = { | ||
getSessionId: () => 'test-session-id', | ||
getTelemetryLogPromptsEnabled: () => true, | ||
getUsageStatisticsEnabled: () => true, | ||
getDebugMode: () => false, | ||
getContentGeneratorConfig: () => ({ | ||
authType: 'oauth-personal', | ||
getContentGeneratorConfig: vi.fn().mockReturnValue({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We needed it so that we could test the fallback integration. These things weren't being tested previously which is why it was needed. We could use a fake Config instead, but would need to tests more complex by calling the real refreshAuth
, etc.
return false; | ||
} | ||
} catch (handlerError) { | ||
console.warn('Fallback UI handler failed:', handlerError); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoop, copypasta. This was a direct copy of the existing try catch from the client.ts code. Changed to .error
|
||
vi.mock('node:fs'); | ||
|
||
describe('Flash Fallback Integration', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR
This PR refactors the model fallback mechanism, preparing the architecture for upcoming model routing strategies (e.g., "auto" mode).
The previous implementation suffered from duplicated logic, tight coupling where the CLI package mutated Core state, and ambiguity in the communication contract. This PR resolves these issues by centralizing the logic, introducing explicit intents, and implementing key architectural patterns for decoupling.
Key changes for reviewers:
packages/core/src/fallback/
: New centralized handler and types.packages/cli/src/ui/hooks/useQuotaAndFallback.ts
: New hook encapsulating all UI logic for this feature.packages/cli/src/ui/AppContainer.tsx
: Significantly simplified by removing the inline fallback handler.packages/core/src/core/client.ts
&geminiChat.ts
: Updated to use the new centralized handler.Dive Deeper
Model Resolution
We no longer mutate
config.setModel()
during runtime fallbacks.In an interim state, until we have a more complex router service, we check
config.isInFallbackMode()
; if true, we return the fallback model (Flash); otherwise, we return the configured model.Centralized, Intent-Driven Fallback
The logic previously duplicated in
client.ts
andgeminiChat.ts
is now handled by a singlehandleFallback
function.The new flow is:
currentAttemptModel
) is captured via closure.onPersistent429
callback callshandleFallback(config, currentAttemptModel, ...)
.handleFallback
invokes the UI handler (injected viaconfig.setFallbackHandler
).App.tsx
) handles user interaction (e.g., Pro Quota Dialog) and returns aFallbackIntent
.handleFallback
interprets this intent:'retry'
: Callsconfig.setFallbackMode(true)
and signalsretryWithBackoff
to immediately retry the request. The Model Resolver ensures the retry uses the fallback model.'stop'
: Callsconfig.setFallbackMode(true)
but stops the current request.'auth'
: Stops the current request without activating fallback mode.Crucially, the Core package now owns the responsibility of changing the fallback state; the UI only signals intent.
Reviewer Test Plan
To test this, you will need to simulate quota errors. The easiest way is to temporarily modify
packages/core/src/core/geminiChat.ts
in thesendMessage
method to always throw an error that looks like a 429.Find the
apiCall
function and replace its content with:Scenario 1: Generic Quota Fallback (Automatic)
rateLimitExceeded
reason in your mock error.gemini-flash
model. The model indicator in the UI should update.Scenario 2: Pro Quota Fallback (Interactive)
userRateLimitExceeded
reason in your mock error.Scenario 2a: Continue with Fallback
gemini-flash
model. You should see a valid response.Scenario 2b: Authenticate
/auth
dialog immediately opens. The original prompt is not retried.Testing Matrix
Linked issues / bugs
Resolves #7810