
Conversation

neoragex2002
Contributor

@neoragex2002 neoragex2002 commented Aug 27, 2025

After handling the new API's thought field, the old geminiProvider did not stop there; it kept running the old, fragile <think> tag-matching logic on the body content, so the body text was misidentified and split apart.

[screenshot]
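A minimal sketch of the gating this fix introduces, with hypothetical names (createPartHandler, emit) standing in for the provider's internal stream handling: once any streamed part arrives with thought: true, the handler switches modes and never feeds later text through the legacy <think> matcher.

type StreamEvent =
  | { type: 'reasoning'; reasoning_content: string }
  | { type: 'text'; content: string }

// Sketch only: the real provider keeps this state inside its coreStream loop.
function createPartHandler(emit: (event: StreamEvent) => void) {
  let isNewThoughtFormatDetected = false
  let buffer = ''

  return function handlePart(part: { text?: string; thought?: boolean }) {
    if (part.thought === true) {
      // New thought format: emit reasoning immediately and flip the mode flag.
      isNewThoughtFormatDetected = true
      emit({ type: 'reasoning', reasoning_content: part.text ?? '' })
      return
    }
    if (isNewThoughtFormatDetected) {
      // New-format stream: body text passes through untouched, no <think> parsing.
      emit({ type: 'text', content: part.text ?? '' })
      return
    }
    // Legacy path only: accumulate and run the existing <think>/</think> matching.
    buffer += part.text ?? ''
    // ...legacy tag-matching logic operates on `buffer` here...
  }
}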

Summary by CodeRabbit

  • New Features
    • Real-time streaming with smoother reasoning display and content updates.
    • Configurable safety settings mapped to harm categories and thresholds.
    • Thinking budget support on compatible models to balance speed and quality.
    • Enhanced tool-calling and multi-part message handling for richer interactions.
    • More accurate token usage reporting with resilient fallbacks.
    • Improved compatibility for streaming image generation responses.

Contributor

coderabbitai bot commented Aug 27, 2025

Walkthrough

Introduces dual-mode streaming for Gemini responses with detection of a new thought format, updates buffering and end-of-stream behavior accordingly, adds normalized response processing, expands message formatting for tool calls and multi-part content, integrates safety settings mapping, supports thinking budgets for eligible models, and maintains token usage reporting.

Changes

Cohort / File(s) Summary
Gemini provider streaming and config updates
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
Adds new thought-format detection in streaming; switches to real-time text emission when detected; preserves thought accumulation; adjusts leftover buffering; normalizes responses via processGeminiResponse; extends formatGeminiMessages for tool_calls/multi-part; maps safety settings; adds thinking budget support in generation config; maintains/fallbacks token usage; supports inline image streaming.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant UI as UI
  participant Provider as GeminiProvider
  participant Gemini as Gemini API

  UI->>Provider: streamRequest(messages, config)
  Provider->>Provider: formatGeminiMessages(messages)
  Provider->>Provider: getGenerationConfig(+thinkingBudget?, +thinkingConfig)
  Provider->>Provider: getFormattedSafetySettings()
  Provider->>Gemini: startStream(formattedMessages, genConfig, safety)

  loop on chunk
    Gemini-->>Provider: chunk(parts[])
    alt new thought format detected (part.thought === true)
      Provider->>Provider: isNewThoughtFormatDetected = true
      Provider-->>UI: emit text(part.text) in real time
      Provider->>Provider: accumulate thought segments
    else old format (uses <think> buffering)
      Provider->>Provider: buffer until </think>
      Provider-->>UI: emit reasoning_content then content
    end
  end

  Provider->>Provider: finalize
  alt new format
    Provider->>Provider: suppress leftover buffered content
  else old format
    Provider-->>UI: emit leftover buffered content
  end

  Provider->>Provider: processGeminiResponse(collect text, thoughts, usage)
  Provider-->>UI: complete({content, reasoning_content, usage})
sequenceDiagram
  autonumber
  participant Caller as Caller
  participant Provider as GeminiProvider

  Caller->>Provider: formatGeminiMessages(messages)
  Note over Provider: Translates tool_calls, user/model parts, inlineData
  Caller->>Provider: getFormattedSafetySettings()
  Note over Provider: Maps harm categories and thresholds
  Caller->>Provider: getGenerationConfig(modelId)
  alt model supports thinking budget
    Provider-->>Caller: config + thinkingBudget + thinkingConfig.includeThoughts
  else
    Provider-->>Caller: config without thinkingBudget
  end
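For the thinking-budget branch in the diagram above, a hedged sketch of what the request could look like with the @google/genai Node SDK (the model name and budget value are placeholders; the provider decides per-model eligibility):

import { GoogleGenAI } from '@google/genai'

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY })

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // placeholder; only thinking-capable models accept a budget
  contents: 'Summarize the tradeoffs of streaming vs. batch inference.',
  config: {
    thinkingConfig: {
      thinkingBudget: 1024, // tokens reserved for internal reasoning
      includeThoughts: true // ask the API to return thought parts alongside the answer
    }
  }
})
console.log(response.text)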

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Bugfix/gemini thinking #639 — Also modifies the Gemini provider’s response processing, reasoning/thinking handling, and generation config, overlapping with this PR’s areas.

Poem

I twitch my ears at streams that flow,
New thoughts hop in, real-time they go.
Old tags burrow, buffered tight,
Safety fences set just right.
Budgeted thinking, nibble by nibble—
Tool calls thump, responses quibble.
Carrots for code; ship with a scribble! 🥕




Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (2)

361-366: Bug: comparing enum values to string literals — safety thresholds never filtered.

threshold is HarmBlockThreshold (an enum), but it’s compared to strings, so the condition is always true for non-zero enums. Fix the comparisons to enum members.

Apply this diff:

-        if (
-          threshold &&
-          category &&
-          threshold !== 'BLOCK_NONE' &&
-          threshold !== 'HARM_BLOCK_THRESHOLD_UNSPECIFIED'
-        ) {
+        if (
+          category &&
+          threshold !== undefined &&
+          threshold !== HarmBlockThreshold.BLOCK_NONE &&
+          threshold !== HarmBlockThreshold.HARM_BLOCK_THRESHOLD_UNSPECIFIED
+        ) {
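For reference, a self-contained sketch of the enum-based filtering (imports assume the @google/genai SDK; the record-shaped input is an assumption about how this provider stores its configured thresholds):

import { HarmBlockThreshold, HarmCategory, type SafetySetting } from '@google/genai'

function formatSafetySettings(
  configured: Partial<Record<HarmCategory, HarmBlockThreshold>>
): SafetySetting[] {
  const settings: SafetySetting[] = []
  for (const [category, threshold] of Object.entries(configured) as [
    HarmCategory,
    HarmBlockThreshold | undefined
  ][]) {
    // Compare against enum members, never string literals.
    if (
      threshold !== undefined &&
      threshold !== HarmBlockThreshold.BLOCK_NONE &&
      threshold !== HarmBlockThreshold.HARM_BLOCK_THRESHOLD_UNSPECIFIED
    ) {
      settings.push({ category, threshold })
    }
  }
  return settings
}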

929-1090: Stream error handling: add try/catch and yield standardized error events.

coreStream must yield error events instead of throwing. Wrap the streaming request + loop; on failure, emit error and stop('error').

Apply this diff:

-    // 发送流式请求
-    const result = await this.genAI.models.generateContentStream(requestParams)
-
-    // 状态变量
+    // 发送流式请求
+    try {
+      const result = await this.genAI.models.generateContentStream(requestParams)
+
+      // 状态变量
       let buffer = ''
       let isInThinkTag = false
       let toolUseDetected = false
       let usageMetadata: GenerateContentResponseUsageMetadata | undefined
       let isNewThoughtFormatDetected = false
@@
-    for await (const chunk of result) {
+      for await (const chunk of result) {
         // 处理用量统计
         if (chunk.usageMetadata) {
           usageMetadata = chunk.usageMetadata
         }
@@
-      }
-    }
-
-    if (usageMetadata) {
+      }
+      if (usageMetadata) {
         yield {
           type: 'usage',
           usage: {
             prompt_tokens: usageMetadata.promptTokenCount || 0,
             completion_tokens: usageMetadata.candidatesTokenCount || 0,
             total_tokens: usageMetadata.totalTokenCount || 0
           }
         }
-    }
-
-    // 处理剩余缓冲区内容
-    if (!isNewThoughtFormatDetected && buffer) {
+      }
+      // 处理剩余缓冲区内容
+      if (!isNewThoughtFormatDetected && buffer) {
         if (isInThinkTag) {
           yield {
             type: 'reasoning',
             reasoning_content: buffer
           }
         } else {
           yield {
             type: 'text',
             content: buffer
           }
         }
-    }
-
-    // 发送停止事件
-    yield { type: 'stop', stop_reason: toolUseDetected ? 'tool_use' : 'complete' }
+      }
+      // 发送停止事件
+      yield { type: 'stop', stop_reason: toolUseDetected ? 'tool_use' : 'complete' }
+    } catch (error) {
+      console.error('Gemini stream error:', error)
+      yield {
+        type: 'error',
+        error_message: error instanceof Error ? error.message : String(error)
+      }
+      yield { type: 'stop', stop_reason: 'error' }
+      return
+    }
🧹 Nitpick comments (4)
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (4)

946-946: Reduce noisy debug logging in hot loop.

JSON.stringify of candidates on every chunk is expensive and floods logs. Gate behind a DEBUG flag or remove.

Apply this diff:

-      console.log('chunk.candidates', JSON.stringify(chunk.candidates, null, 2))
+      if (process.env.DEEPCHAT_DEBUG_GEMINI === '1') {
+        console.debug('chunk.candidates', JSON.stringify(chunk.candidates)?.slice(0, 5_000))
+      }

693-733: Deduplicate response parsing.

completions reimplements the same thought/text extraction as processGeminiResponse. Call processGeminiResponse(result) to reduce drift.

-      // 处理响应内容,支持新格式的思考内容
-      let textContent = ''
-      let thoughtContent = ''
-      ...
-      resultResp.content = textContent
-      if (thoughtContent) {
-        resultResp.reasoning_content = thoughtContent
-      }
-
-      return resultResp
+      return this.processGeminiResponse(result)
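A sketch of the shared extraction this suggests reusing (field names follow the new thought-part format; the helper name is illustrative rather than the provider's actual processGeminiResponse signature):

import type { GenerateContentResponse } from '@google/genai'

function extractTextAndThoughts(result: GenerateContentResponse): {
  content: string
  reasoning_content?: string
} {
  let content = ''
  let reasoning = ''
  for (const part of result.candidates?.[0]?.content?.parts ?? []) {
    if (!part.text) continue
    // Parts flagged with `thought: true` carry reasoning; everything else is body text.
    if (part.thought === true) {
      reasoning += part.text
    } else {
      content += part.text
    }
  }
  return reasoning ? { content, reasoning_content: reasoning } : { content }
}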

262-265: Logs/comments language and structure.

Guidelines require English logs/comments and structured logging with levels. Convert Chinese log lines and consider INFO/WARN/ERROR with context. Avoid leaking sensitive data.

Also applies to: 307-309, 755-768, 847-849, 1151-1157


979-980: Optional: keep reading after tool call to collect usage.

Breaking immediately after tool_call may miss late usageMetadata chunks. Consider continuing the loop while suppressing further content events, then emit usage and stop(tool_use).
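One way to sketch that (the chunk and event shapes are simplified stand-ins for the provider's standardized stream types, and sawToolCall is a placeholder for the existing functionCall detection):

type Chunk = {
  usageMetadata?: {
    promptTokenCount?: number
    candidatesTokenCount?: number
    totalTokenCount?: number
  }
}

type StreamEvent =
  | { type: 'usage'; usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number } }
  | { type: 'stop'; stop_reason: 'tool_use' | 'complete' }

async function* drainForUsage(
  stream: AsyncIterable<Chunk>,
  sawToolCall: (chunk: Chunk) => boolean
): AsyncGenerator<StreamEvent> {
  let toolUseDetected = false
  let usage: Chunk['usageMetadata']
  for await (const chunk of stream) {
    if (chunk.usageMetadata) usage = chunk.usageMetadata
    if (toolUseDetected) continue // keep draining, but suppress further content events
    if (sawToolCall(chunk)) toolUseDetected = true
  }
  if (usage) {
    yield {
      type: 'usage',
      usage: {
        prompt_tokens: usage.promptTokenCount ?? 0,
        completion_tokens: usage.candidatesTokenCount ?? 0,
        total_tokens: usage.totalTokenCount ?? 0
      }
    }
  }
  yield { type: 'stop', stop_reason: toolUseDetected ? 'tool_use' : 'complete' }
}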

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 00b3851 and 9d063bb.

📒 Files selected for processing (1)
  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (4 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{js,jsx,ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/development-setup.mdc)

**/*.{js,jsx,ts,tsx}: Use OxLint for code linting
Write logs and comments in English

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/{main,renderer}/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

src/{main,renderer}/**/*.ts: Use context isolation for improved security
Implement proper inter-process communication (IPC) patterns
Optimize application startup time with lazy loading
Implement proper error handling and logging for debugging

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/main/**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)

Use Electron's built-in APIs for file system and native dialogs

From main to renderer, broadcast events via EventBus using mainWindow.webContents.send()

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/error-logging.mdc)

**/*.{ts,tsx}: Always use try-catch to handle potential errors
Provide meaningful error messages
Record detailed error logs
Degrade gracefully
Logs should include a timestamp, log level, error code, error description, stack trace (if applicable), and relevant context
Log levels should include ERROR, WARN, INFO, and DEBUG
Do not swallow errors
Provide user-friendly error messages
Implement error retry mechanisms
Avoid logging sensitive information
Use structured logging
Set appropriate log levels

Enable and adhere to strict TypeScript type checking

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/main/presenter/llmProviderPresenter/providers/*.ts

📄 CodeRabbit inference engine (.cursor/rules/llm-agent-loop.mdc)

src/main/presenter/llmProviderPresenter/providers/*.ts: Each file in src/main/presenter/llmProviderPresenter/providers/*.ts should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.
Provider implementations must use a coreStream method that yields standardized stream events to decouple the main loop from provider-specific details.
The coreStream method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Provider files should implement helper methods such as formatMessages, convertToProviderTools, parseFunctionCalls, and prepareFunctionCallPrompt as needed for provider-specific logic.
All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., prepareFunctionCallPrompt) before making the API call.
When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using convertToProviderTools) and included in the API request.
Provider implementations should aggregate and yield usage events as part of the standardized stream.
Provider implementations should yield image data events in the standardized format when applicable.
Provider implementations should yield reasoning events in the standardized format when applicable.
Provider implementations should yield tool call events (tool_call_start, tool_call_chunk, tool_call_end) in the standardized format.
Provider implementations should yield stop events with appropriate stop_reason in the standardized format.
Provider implementations should yield error events in the standardized format...

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/main/**/*.{ts,js,tsx,jsx}

📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)

Main process code goes in src/main

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/**/*.{ts,tsx,vue}

📄 CodeRabbit inference engine (CLAUDE.md)

Use English for all logs and comments

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
src/main/presenter/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain one presenter per functional domain in src/main/presenter/

Files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
🧠 Learnings (7)
📓 Common learnings
Learnt from: neoragex2002
PR: ThinkInAIXYZ/deepchat#550
File: src/main/presenter/mcpPresenter/inMemoryServers/meetingServer.ts:258-324
Timestamp: 2025-06-21T15:53:34.641Z
Learning: User neoragex2002 prefers concise, brief responses rather than chatty or verbose communication. Avoid lengthy explanations, excessive enthusiasm, or ASCII art in responses.
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations must use a `coreStream` method that yields standardized stream events to decouple the main loop from provider-specific details.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations should yield text events in the standardized format.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations should yield reasoning events in the standardized format when applicable.

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-08-26T14:13:46.601Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.601Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Each LLM provider must implement provider-specific API interactions, convert MCP tools, and normalize streaming responses

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
📚 Learning: 2025-08-26T14:13:46.601Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.601Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Implement a coreStream method for new providers following the standardized event interface

Applied to files:

  • src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-check (x64)
🔇 Additional comments (6)
src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (6)

937-938: Fix: Gating legacy parsing behind a new-thought flag is correct.

Introducing isNewThoughtFormatDetected prevents the old matcher from touching new-format streams. This directly addresses the parsing bug described in the PR.


989-996: Set the flag at detection site (part.thought) and emit reasoning immediately — good.

Detecting thought in-chunk and switching mode avoids mid-stream mis-splits. No further buffering is applied once the flag flips.


1021-1058: Bypass legacy path once new-thought is detected.

Directly yielding text and skipping the buffer/<think> logic eliminates body corruption. Logic is sound and minimal.


1073-1085: End-of-stream: suppress leftover buffer when new-thought was present.

Prevents accidental emission of stale buffered chunks. Matches the intended behavior.


443-489: Harden JSON parsing in geminiProvider.ts and verify functionResponse schema

• In src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts (around lines 443–489), wrap the JSON.parse call in a try/catch so that invalid JSON in toolCall.function.arguments won’t crash the provider.
• Confirm the exact shape the Gemini Node SDK expects for a function response—whether it’s
{ functionResponse: { name: string; response: any } }
or
{ functionResponse: { name: string; response: { result: any } } }—and adjust the code below accordingly.

Suggested (optional) diff for the parsing guard; once you’ve confirmed the SDK schema, update the inner functionResponse.response reference if needed:

--- a/src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
+++ b/src/main/presenter/llmProviderPresenter/providers/geminiProvider.ts
@@ -460,7 +460,18 @@ for (const toolCall of message.tool_calls || []) {
                 role: 'model',
                 parts: [
                   {
-                    functionCall: {
-                      name: toolCall.function.name,
-                      args: JSON.parse(toolCall.function.arguments || '{}')
-                    }
+                    functionCall: {
+                      name: toolCall.function.name,
+                      args: (() => {
+                        try {
+                          return toolCall.function.arguments
+                            ? JSON.parse(toolCall.function.arguments)
+                            : {};
+                        } catch (err) {
+                          console.warn(
+                            '[GeminiProvider] Invalid tool_call.arguments JSON, defaulting to {}',
+                            err
+                          );
+                          return {};
+                        }
+                      })()
+                    }
                   }
                 ]
               });

After updating the parsing, please verify and align the functionResponse block to match the SDK’s expected schema.


646-651: No change needed: systemInstruction belongs inside config

The official Node.js SDK shows systemInstruction passed under the config object, not as a top-level field:

  • In the Google AI for Developers Node.js example, generateContent is called with config: { systemInstruction: "…" } (not a top-level property) (ai.google.dev)

Please leave the existing placement unchanged.

Likely an incorrect or invalid review comment.
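For illustration, a minimal call shape matching that placement (model name and prompt are placeholders; assumes the @google/genai Node SDK):

import { GoogleGenAI } from '@google/genai'

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY })

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash', // placeholder
  contents: 'Hello there',
  config: {
    // systemInstruction belongs under `config`, not at the top level of the request.
    systemInstruction: 'You are a concise assistant.'
  }
})
console.log(response.text)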
