Conversation

@yyhhyyyyyy (Collaborator) commented Aug 28, 2025

Correct Gemini 2.5 Flash Image Preview model capabilities and token limits

Summary by CodeRabbit

  • New Features

    • Added Gemini 2.5 Flash Image Preview model with vision/image input support across providers.
  • Improvements

    • Restored standard Gemini 2.5 Flash as a separate selectable model with original token/context limits.
    • Updated capability handling: function calling is not offered on the Image Preview model; “thinking/reasoning” is only shown for eligible Gemini 2.5 variants.
    • Refreshed model lists and matching to ensure accurate selection and capabilities across providers.

@coderabbitai bot (Contributor) commented Aug 28, 2025

Walkthrough

Adds the Gemini 2.5 Flash Image Preview model to defaults and provider mappings, reintroduces a distinct Gemini 2.5 Flash config, and updates geminiProvider logic to exclude flash-image-preview from function-calling and reasoning/thinking detection.

Changes

Cohort / File(s) — Summary

  • Default model settings — src/main/presenter/configPresenter/modelDefaultSettings.ts
    Inserts a google/gemini-2.5-flash-image-preview default with vision: true, functionCall/reasoning: false, temperature 0.7, and maxTokens/contextLength 32768; adds match aliases.
  • Provider model mappings — src/main/presenter/configPresenter/providerModelSettings.ts
    Adds Gemini and OpenRouter entries for google/gemini-2.5-flash-image-preview (tokens/context 32768, vision: true, functionCall/reasoning: false). Restores a separate models/gemini-2.5-flash with its prior larger token/context limits and functionCall/reasoning: true.
  • Provider capability logic — src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
    Excludes gemma-3 and flash-image-preview from function calling; treats Gemini 2.5 reasoning as supported only when the model is not flash-image-preview.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant GP as geminiProvider
  participant CFG as Capability/Config Logic
  participant API as Gemini API

  C->>GP: requestGeneration(modelId, options)
  GP->>CFG: determineCapabilities(modelId)
  Note right of CFG: If model includes "flash-image-preview"<br/>• functionCall = false<br/>• reasoning/thinking = false
  Note right of CFG: If model is Gemini 2.5 and not image-preview<br/>• reasoning/thinking = true

  CFG-->>GP: capabilities (vision, functionCall, reasoning)
  GP->>GP: build generation config based on capabilities
  GP->>API: send generate content
  API-->>GP: response
  GP-->>C: result
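The gating shown in the diagram can be sketched as a small predicate; the function name and return shape below are illustrative assumptions, not the actual geminiProvider API:

```typescript
interface ModelCapabilities {
  vision: boolean
  functionCall: boolean
  reasoning: boolean
}

function determineCapabilities(modelId: string): ModelCapabilities {
  const isImagePreview = modelId.includes('flash-image-preview')
  const isGemini25 =
    modelId.includes('gemini-2.5-pro') ||
    modelId.includes('gemini-2.5-flash') ||
    modelId.includes('gemini-2.5-flash-lite')
  return {
    // Both Flash variants accept image input
    vision: true,
    // Image preview (and gemma-3) are excluded from native function calling
    functionCall: !isImagePreview && !modelId.includes('gemma-3'),
    // Reasoning/thinking only for Gemini 2.5 variants that are not image preview
    reasoning: isGemini25 && !isImagePreview
  }
}
```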

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • zerob13

Poem

A bunny taps the config tree, hop-hop—
New flash image-preview gets a spot on top.
Old Flash returns, with tokens vast,
While functions rest when previews pass.
Reasoning hutch now neatly split—
Two paths diverge, and both legit.
🐇✨



@coderabbitai bot (Contributor) left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (1)

880-920: Bug: tools are attached even when the selected model doesn’t support native function calls.

coreStream always adds tools if mcpTools.length > 0, which will break on flash-image-preview models declared as functionCall: false. Gate this on modelConfig.functionCall and (optionally) fall back to prompt wrapping.

Apply this diff:

-    // 添加工具配置
-    if (geminiTools && geminiTools.length > 0) {
+    // Add tools only if the model supports native function calling
+    if (modelConfig.functionCall && geminiTools && geminiTools.length > 0) {
       requestParams.config = {
         ...requestParams.config,
         tools: geminiTools,
         toolConfig: {
           functionCallingConfig: {
           mode: FunctionCallingConfigMode.AUTO // Let the model decide whether to call tools
           }
         }
       }
+    } else if (!modelConfig.functionCall && mcpTools.length > 0) {
+      // Optional: prompt wrapping for non-native tool call models
+      // requestParams.contents = this.prepareFunctionCallPrompt(messages, mcpTools, formattedParts.contents)
+      console.warn('Model does not support native function calling; skipping tool attachment:', modelId)
     }

If you want prompt wrapping, I can add a minimal prepareFunctionCallPrompt helper.
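A minimal version of that helper could look like the sketch below; the helper name follows the repo's provider guidelines, but the McpTool shape and the JSON prompt format are assumptions for illustration:

```typescript
interface McpTool {
  name: string
  description: string
}

// Wrap tool definitions into the system prompt for models that lack
// native function calling, so the model can still request tool use.
function prepareFunctionCallPrompt(systemPrompt: string, tools: McpTool[]): string {
  const toolList = tools.map((t) => `- ${t.name}: ${t.description}`).join('\n')
  return [
    systemPrompt,
    'To call a tool, reply with a single JSON object: {"tool": "<name>", "args": {}}.',
    'Available tools:',
    toolList
  ].join('\n\n')
}
```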

🧹 Nitpick comments (6)
src/main/presenter/configPresenter/modelDefaultSettings.ts (1)

366-373: Minor consistency nit: unify maxTokens with provider mappings.

Here 2.5 Flash uses 65535, while provider mappings and provider code use 65536. Pick one to avoid off-by-one confusion.

Apply this diff to make it 65536 here:

-    maxTokens: 65535,
+    maxTokens: 65536,
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (4)

379-385: Avoid accidental thinkingBudget on image-preview.

supportsThinkingBudget matches 'gemini-2.5-flash', which also matches 'gemini-2.5-flash-image-preview'. Tighten the check.

-  private supportsThinkingBudget(modelId: string): boolean {
-    return (
-      modelId.includes('gemini-2.5-pro') ||
-      modelId.includes('gemini-2.5-flash') ||
-      modelId.includes('gemini-2.5-flash-lite')
-    )
-  }
+  private supportsThinkingBudget(modelId: string): boolean {
+    return (
+      (modelId.includes('gemini-2.5-pro') ||
+        modelId.includes('gemini-2.5-flash') ||
+        modelId.includes('gemini-2.5-flash-lite')) &&
+      !modelId.includes('flash-image-preview')
+    )
+  }

52-78: Static fallback list is missing the new image-preview model.

If the models.list() call fails/returns empty, users won’t see google/gemini-2.5-flash-image-preview. Add it to GEMINI_MODELS.

     {
       id: 'models/gemini-2.5-flash',
       name: 'Gemini 2.5 Flash',
       group: 'default',
       providerId: 'gemini',
       isCustom: false,
       contextLength: 1048576,
       maxTokens: 65536,
       vision: true,
       functionCall: true,
       reasoning: true
     },
+    {
+      id: 'google/gemini-2.5-flash-image-preview',
+      name: 'Gemini 2.5 Flash Image Preview',
+      group: 'default',
+      providerId: 'gemini',
+      isCustom: false,
+      contextLength: 32768,
+      maxTokens: 32768,
+      vision: true,
+      functionCall: false,
+      reasoning: false
+    },

400-406: Consider forcing IMAGE response modality for image-preview.

Some GenAI endpoints require responseModalities to include IMAGE to deliver inline image parts. Add a safeguard here for 'flash-image-preview'.

   if (modelId && this.models) {
     const model = this.models.find((m) => m.id === modelId)
     if (model && model.type === ModelType.ImageGeneration) {
       generationConfig.responseModalities = [Modality.TEXT, Modality.IMAGE]
+    } else if (modelId.includes('flash-image-preview')) {
+      generationConfig.responseModalities = [Modality.TEXT, Modality.IMAGE]
     }
   }

Please confirm with the latest Google GenAI SDK docs whether this is necessary for the image-preview model.


262-266: Logs and user strings should be English per repo guidelines.

Update console messages and default user-facing strings to English (e.g., “生成对话标题失败” → “Failed to generate conversation title”, “新对话” → “New chat”, “无法生成建议” → “Unable to generate suggestions”).

Would you like a follow-up PR to centralize logging with levels (ERROR/WARN/INFO/DEBUG) and structured fields?

Also applies to: 756-769, 792-799, 847-850

src/main/presenter/configPresenter/providerModelSettings.ts (1)

207-216: Potential match collision with image-preview due to substring 'gemini-2.5-flash'.

Because getProviderSpecificModelConfig uses includes(), 'gemini-2.5-flash' is a substring of 'gemini-2.5-flash-image-preview'. Ordering currently prevents a mismatch, but making the match stricter here reduces risk.

-        match: ['models/gemini-2.5-flash', 'gemini-2.5-flash'],
+        match: ['models/gemini-2.5-flash'],

If you need the bare alias, keep it but ensure the image-preview entry stays before this one.
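The collision is easy to demonstrate; with includes()-based matching, the image-preview id contains the bare alias as a substring, so only entry ordering keeps the two models apart:

```typescript
const imagePreviewId = 'google/gemini-2.5-flash-image-preview'
const bareAlias = 'gemini-2.5-flash'

// true — the bare alias matches the image-preview id as a substring,
// which is why the image-preview entry must be checked first.
const collides = imagePreviewId.includes(bareAlias)
```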

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between fa4482f and c22476d.

📒 Files selected for processing (3)
  • src/main/presenter/configPresenter/modelDefaultSettings.ts (1 hunks)
  • src/main/presenter/configPresenter/providerModelSettings.ts (3 hunks)
  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/development-setup.mdc)

**/*.{js,jsx,ts,tsx}: Use OxLint for code linting
Write logs and comments in English

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/{main,renderer}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

src/{main,renderer}/**/*.ts: Use context isolation for improved security
Implement proper inter-process communication (IPC) patterns
Optimize application startup time with lazy loading
Implement proper error handling and logging for debugging

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/main/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

Use Electron's built-in APIs for file system and native dialogs

From main to renderer, broadcast events via EventBus using mainWindow.webContents.send()

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/error-logging.mdc)

**/*.{ts,tsx}: Always use try-catch to handle potential errors
Provide meaningful error messages
Log detailed error information
Degrade gracefully
Logs should include timestamp, log level, error code, error description, stack trace (when applicable), and relevant context
Log levels should include ERROR, WARN, INFO, DEBUG
Do not swallow errors
Provide user-friendly error messages
Implement an error retry mechanism
Avoid logging sensitive information
Use structured logging
Set appropriate log levels

Enable and adhere to strict TypeScript type checking

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/llmProviderPresenter/providers/*.ts

📄 CodeRabbit inference engine (.cursor/rules/llm-agent-loop.mdc)

src/main/presenter/llmProviderPresenter/providers/*.ts: Each file in src/main/presenter/llmProviderPresenter/providers/*.ts should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.
Provider implementations must use a coreStream method that yields standardized stream events to decouple the main loop from provider-specific details.
The coreStream method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Provider files should implement helper methods such as formatMessages, convertToProviderTools, parseFunctionCalls, and prepareFunctionCallPrompt as needed for provider-specific logic.
All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., prepareFunctionCallPrompt) before making the API call.
When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using convertToProviderTools) and included in the API request.
Provider implementations should aggregate and yield usage events as part of the standardized stream.
Provider implementations should yield image data events in the standardized format when applicable.
Provider implementations should yield reasoning events in the standardized format when applicable.
Provider implementations should yield tool call events (tool_call_start, tool_call_chunk, tool_call_end) in the standardized format.
Provider implementations should yield stop events with appropriate stop_reason in the standardized format.
Provider implementations should yield error events in the standardized format...

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/main/**/*.{ts,js,tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)

Main-process code lives in src/main

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/**/*.{ts,tsx,vue}

📄 CodeRabbit inference engine (CLAUDE.md)

Use English for all logs and comments

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain one presenter per functional domain in src/main/presenter/

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
src/main/presenter/configPresenter/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Centralize configuration logic under configPresenter/

Files:

  • src/main/presenter/configPresenter/modelDefaultSettings.ts
  • src/main/presenter/configPresenter/providerModelSettings.ts
🧠 Learnings (4)
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using `convertToProviderTools`) and included in the API request.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider files should implement helper methods such as `formatMessages`, `convertToProviderTools`, `parseFunctionCalls`, and `prepareFunctionCallPrompt` as needed for provider-specific logic.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., `prepareFunctionCallPrompt`) before making the API call.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-08-26T14:13:46.601Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.601Z
Learning: Applies to src/main/presenter/configPresenter/providers.ts : Add new provider configuration entries in configPresenter/providers.ts

Applied to files:

  • src/main/presenter/configPresenter/providerModelSettings.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-check (x64)
🔇 Additional comments (4)
src/main/presenter/configPresenter/modelDefaultSettings.ts (1)

351-361: Adds correct entry for Gemini 2.5 Flash Image Preview; alignments look good.

Capabilities (vision: true, functionCall: false, reasoning: false) and 32k token/context limits match the intent of this PR. Please verify the 32k figures against current Google docs to avoid drift.

Would you confirm the latest official limits for google/gemini-2.5-flash-image-preview and whether any alt aliases (e.g., “image-preview-02”) should be added to match[]?

src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (1)

194-201: Good: explicitly excludes flash-image-preview from native function calling and reasoning detection.

This aligns with the new model’s capabilities and prevents misclassification downstream.

src/main/presenter/configPresenter/providerModelSettings.ts (2)

185-194: Gemini image-preview mapping looks correct.

Capabilities and 32k limits align with defaults and provider logic. Please confirm the 32k figures with Google’s latest docs.


2328-2338: OpenRouter: image-preview mapping added and consistent.

Settings mirror the Gemini provider entry; looks good.

@zerob13 zerob13 merged commit 76eb459 into dev Aug 28, 2025
2 checks passed
neoragex2002 pushed a commit to neoragex2002/deepchat that referenced this pull request Aug 28, 2025