feat: optimize image markdown streaming performance with smart buffering #792
Conversation
Actionable comments posted: 1
🧹 Nitpick comments (6)
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts (1)

1121-1304: Tone down verbose logs and redact large payloads in parseFunctionCalls

Unconditional logs and dumping full content can leak PII and spam production logs. Gate debug logs and truncate error payloads.

```diff
-      console.error(
-        `[parseFunctionCalls] Failed to parse content for match ${index} even with jsonrepair:`,
-        repairError,
-        'Original content:',
-        content,
-        'Repaired content attempt:',
-        repairedJson ?? 'N/A'
-      ) // Log final failure
+      const redact = (s?: string) =>
+        s ? (s.length > 2000 ? s.slice(0, 2000) + '…' : s) : 'N/A'
+      console.error(
+        `[parseFunctionCalls] Failed to parse content for match ${index} even with jsonrepair:`,
+        repairError,
+        'Original content (truncated):',
+        redact(content),
+        'Repaired content attempt (truncated):',
+        redact(repairedJson || undefined)
+      )
       return null // Skip this malformed call
@@
-    console.log(`[parseFunctionCalls] Returning ${toolCalls.length} parsed tool calls.`) // Log final count
+    if (process.env.NODE_ENV === 'development') {
+      console.debug(`[parseFunctionCalls] parsed tool calls: ${toolCalls.length}`)
+    }
```
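For reference, a runnable standalone sketch of the truncation approach; the helper name `redactForLog` and the 2000-character default are illustrative choices, not existing code in the repository:

```ts
// Truncate potentially huge payloads before logging; return 'N/A' for absent values.
const redactForLog = (value?: string, max = 2000): string =>
  value ? (value.length > max ? value.slice(0, max) + '…' : value) : 'N/A'

// Example: only the first 2000 characters of a large base64 blob reach the log.
const hugePayload = 'data:image/png;base64,' + 'A'.repeat(100_000)
console.error('parse failed, content (truncated):', redactForLog(hugePayload))
```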
src/main/presenter/threadPresenter/index.ts (5)

57-72: Adaptive buffering is only partially integrated; comments not in English; one typo

- The new adaptiveBuffer/flush pipeline isn’t wired into the live path: content still flows via processContentDirectly -> DB updates, while flushAdaptiveBuffer/processBufferedContent/processLargeContentAsynchronously remain unused.
- Several comments are in Chinese, and one contains a typo: “每次处琅5个” (presumably “每次处理5个”, i.e. “process 5 per batch”).
Actions:
- Either wire adaptive buffering into the hot path or remove the unused code to reduce maintenance overhead.
- Translate comments to English and fix the typo.
```diff
-  // 统一的自适应内容处理
+  // Unified adaptive content buffering
@@
-  // 精确追踪已发送内容的位置
-  sentPosition: number // 已发送到渲染器的内容位置
+  // Track the exact position of content already sent to the renderer
+  sentPosition: number
@@
-  // 分块大内容 - 使用更小的分块避免UI阻塞
+  // Chunk large content - use smaller chunk size to avoid UI jank
@@
-    let maxChunkSize = 4096 // 默认4KB
+    let maxChunkSize = 4096 // Default 4KB
@@
-    // 对于图片base64内容,使用非常小的分块
+    // Use smaller chunks for base64-heavy content
@@
-    // 对于超长内容,进一步减小分块
+    // Further reduce chunk size for extremely long content
@@
-  // 智能判断是否需要分块处理 - 优化阈值判断
+  // Heuristic to decide if chunking is needed - tuned thresholds
@@
-  // 处理缓冲的内容 - 优化异步处理
+  // Process buffered content - optimized async path
@@
-  // 异步处理大内容 - 避免阻塞主进程
+  // Process large content asynchronously - avoid blocking main process
@@
-    // 批量处理分块,每次处琅5个
+    // Process chunks in batches of 5
@@
-  // 处理普通内容
+  // Handle normal content
@@
-  // 统一的数据库和渲染器更新 (当前未使用)
+  // Unified DB and renderer update (currently unused)
@@
-  // 简化的直接内容处理
+  // Direct content processing (buffer-aware)
@@
-  /**
-   * 直接处理内容的方法
-   */
+  /**
+   * Direct content processing
+   */
@@
-  /**
-   * 分块处理大内容
-   */
+  /**
+   * Chunk large content (append-only path right now)
+   */
```

Optional wiring (minimal risk): prefer buffering only for large payloads, but avoid double-sending by keeping renderer notifications centralized in handleLLMAgentResponse. Replace the processContentDirectly body with:
```diff
-    // 检查是否需要分块处理
-    if (this.shouldSplitContent(content)) {
-      await this.processLargeContentInChunks(eventId, content, currentTime)
-    } else {
-      await this.processNormalContent(eventId, content, currentTime)
-    }
+    // Buffer large payloads; small payloads go direct
+    if (this.shouldSplitContent(content)) {
+      const state = this.generatingMessages.get(eventId)
+      if (!state) return
+      const now = Date.now()
+      state.adaptiveBuffer ??= {
+        content: '',
+        lastUpdateTime: now,
+        updateCount: 0,
+        totalSize: 0,
+        isLargeContent: true,
+        sentPosition: 0
+      }
+      state.adaptiveBuffer.content += content
+      state.adaptiveBuffer.totalSize += content.length
+      state.adaptiveBuffer.lastUpdateTime = now
+      // Throttle flushes
+      if (!state.throttleTimeout) {
+        state.throttleTimeout = setTimeout(async () => {
+          state.throttleTimeout = undefined
+          await this.flushAdaptiveBuffer(eventId)
+        }, 80)
+      }
+      if (!state.flushTimeout) {
+        state.flushTimeout = setTimeout(async () => {
+          await this.flushAdaptiveBuffer(eventId)
+        }, 300)
+      }
+      return
+    }
+    await this.processNormalContent(eventId, content, currentTime)
```

And keep renderer emits only in handleLLMAgentResponse (no extra sends from the buffer path).
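To make the two-timer intent concrete, here is a self-contained sketch of the pattern; the AdaptiveFlushBuffer class is illustrative, not code from this PR, and the 80 ms/300 ms values mirror the diff above:

```ts
// Generic throttle-plus-deadline buffer: flush at most once per throttle
// window, but never let content sit longer than the max-latency deadline.
class AdaptiveFlushBuffer {
  private content = ''
  private throttleTimer?: ReturnType<typeof setTimeout>
  private deadlineTimer?: ReturnType<typeof setTimeout>

  constructor(
    private flush: (content: string) => void,
    private throttleMs = 80,
    private maxLatencyMs = 300
  ) {}

  append(chunk: string): void {
    this.content += chunk
    // Throttle: coalesce rapid appends into one flush per window.
    if (!this.throttleTimer) {
      this.throttleTimer = setTimeout(() => this.doFlush(), this.throttleMs)
    }
    // Deadline: guarantee an upper bound on buffering latency.
    if (!this.deadlineTimer) {
      this.deadlineTimer = setTimeout(() => this.doFlush(), this.maxLatencyMs)
    }
  }

  private doFlush(): void {
    // Clear both timers so a pending one cannot trigger a second flush.
    if (this.throttleTimer) clearTimeout(this.throttleTimer)
    if (this.deadlineTimer) clearTimeout(this.deadlineTimer)
    this.throttleTimer = this.deadlineTimer = undefined
    if (this.content.length > 0) {
      const out = this.content
      this.content = ''
      this.flush(out)
    }
  }
}

// Usage: chunks arriving faster than 80 ms apart are coalesced into one flush.
const buf = new AdaptiveFlushBuffer((c) => console.log(`flushed ${c.length} chars`))
buf.append('![img](data:image/png;base64,')
buf.append('AAAA')
```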
Additionally, to avoid duplicate renderer sends if you later switch to the async chunk sender, change processBufferedContent to use the DB-only path:

```diff
-    if (buffer?.isLargeContent) {
-      await this.processLargeContentAsynchronously(eventId, content, currentTime)
-      return
-    }
+    if (buffer?.isLargeContent) {
+      await this.processLargeContentInChunks(eventId, content, currentTime)
+      return
+    }
```

Also applies to: 368-387, 389-397, 399-417, 419-500, 502-527, 548-551, 868-870, 4185-4192, 4207-4238
529-546: Deduplicate finalizeLastBlock logic

There’s a near-identical finalizeLastBlock in handleLLMAgentResponse. Prefer calling this helper to avoid drift.

```diff
-        // 处理工具调用...
-        if (tool_call) {
+        // 处理工具调用...
+        if (tool_call) {
+          // Use class-level helper instead of local finalize
+          this.finalizeLastBlock(state)
```
4207-4238: Function name vs behavior

processLargeContentInChunks doesn’t actually chunk; it appends and updates the DB. Either:

- implement real chunking using splitLargeContent and (optionally) batched renderer updates, or
- rename to processLargeContentAppendOnly to reflect the behavior.

```diff
-  private async processLargeContentInChunks(
+  private async processLargeContentAppendOnly(
     eventId: string,
     content: string,
     currentTime: number
   ): Promise<void> {
```

And update the single call site accordingly.
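If real chunking is the chosen direction, a minimal sketch of what a splitLargeContent-style helper could look like; the 4096/1024 sizes and the ';base64,' heuristic are assumptions mirroring the comments quoted earlier, not confirmed constants:

```ts
// Split large content into chunks, shrinking the chunk size for
// base64-heavy payloads so each renderer update stays cheap to parse.
function splitLargeContent(content: string): string[] {
  let maxChunkSize = 4096 // Assumed default (4KB)
  if (content.includes(';base64,')) {
    maxChunkSize = 1024 // Assumed smaller chunks for inline images
  }
  const chunks: string[] = []
  for (let i = 0; i < content.length; i += maxChunkSize) {
    chunks.push(content.slice(i, i + maxChunkSize))
  }
  return chunks
}

// Usage: a ~10KB base64 image markdown tag becomes ten 1KB appends.
const parts = splitLargeContent('![img](data:image/png;base64,' + 'A'.repeat(10_000) + ')')
console.log(parts.length) // 10
```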
399-417: Guard against redundant DB writes

If content is empty or only whitespace after upstream filtering, skip editMessage to reduce write amplification.

```diff
-    // 正常内容处理
-    await this.processNormalContent(eventId, content, currentTime)
+    // Normal content path
+    if (content && content.trim().length > 0) {
+      await this.processNormalContent(eventId, content, currentTime)
+    }
```
332-361: flushAdaptiveBuffer: clear the throttle timeout too

You clear flushTimeout but may leave a pending throttleTimeout racing another flush. Consider clearing both.

```diff
     // 清理超时
     if (state.flushTimeout) {
       clearTimeout(state.flushTimeout)
       state.flushTimeout = undefined
     }
+    if (state.throttleTimeout) {
+      clearTimeout(state.throttleTimeout)
+      state.throttleTimeout = undefined
+    }
```
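A small helper would consolidate this cleanup so every flush/cancel path stays race-free; the state shape below is assumed from the diffs above, not taken from the actual file:

```ts
// Hypothetical subset of the per-message generating state.
interface GenerationTimers {
  flushTimeout?: ReturnType<typeof setTimeout>
  throttleTimeout?: ReturnType<typeof setTimeout>
}

// Clear both timers in one place; call from flush, cancel, and error paths.
function clearBufferTimers(state: GenerationTimers): void {
  if (state.flushTimeout) {
    clearTimeout(state.flushTimeout)
    state.flushTimeout = undefined
  }
  if (state.throttleTimeout) {
    clearTimeout(state.throttleTimeout)
    state.throttleTimeout = undefined
  }
}
```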
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)

src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts (2 hunks)
src/main/presenter/threadPresenter/index.ts (8 hunks)
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{js,jsx,ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/development-setup.mdc)
**/*.{js,jsx,ts,tsx}: Use OxLint for code linting
Write logs and comments in English
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
src/{main,renderer}/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)
src/{main,renderer}/**/*.ts
: Use context isolation for improved security
Implement proper inter-process communication (IPC) patterns
Optimize application startup time with lazy loading
Implement proper error handling and logging for debugging
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
src/main/**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/electron-best-practices.mdc)
Use Electron's built-in APIs for file system and native dialogs
From main to renderer, broadcast events via EventBus using mainWindow.webContents.send()
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.cursor/rules/error-logging.mdc)
**/*.{ts,tsx}: Always use try-catch to handle potential errors
Provide meaningful error messages
Log errors in detail
Degrade gracefully
Logs should include timestamp, log level, error code, error description, stack trace (where applicable), and relevant context
Log levels should include ERROR, WARN, INFO, DEBUG
Do not swallow errors
Provide user-friendly error messages
Implement error retry mechanisms
Avoid logging sensitive information
Use structured logging
Set appropriate log levels
Enable and adhere to strict TypeScript type checking
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
src/main/presenter/llmProviderPresenter/providers/*.ts
📄 CodeRabbit inference engine (.cursor/rules/llm-agent-loop.mdc)
src/main/presenter/llmProviderPresenter/providers/*.ts: Each file in src/main/presenter/llmProviderPresenter/providers/*.ts should handle interaction with a specific LLM API, including request/response formatting, tool definition conversion, native/non-native tool call management, and standardizing output streams to a common event format.
Provider implementations must use a coreStream method that yields standardized stream events to decouple the main loop from provider-specific details.
The coreStream method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Provider files should implement helper methods such as formatMessages, convertToProviderTools, parseFunctionCalls, and prepareFunctionCallPrompt as needed for provider-specific logic.
All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
When a provider does not support native function calling, it must prepare messages using prompt wrapping (e.g., prepareFunctionCallPrompt) before making the API call.
When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using convertToProviderTools) and included in the API request.
Provider implementations should aggregate and yield usage events as part of the standardized stream.
Provider implementations should yield image data events in the standardized format when applicable.
Provider implementations should yield reasoning events in the standardized format when applicable.
Provider implementations should yield tool call events (tool_call_start, tool_call_chunk, tool_call_end) in the standardized format.
Provider implementations should yield stop events with appropriate stop_reason in the standardized format.
Provider implementations should yield error events in the standardized format...
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/**/*.{ts,js,tsx,jsx}
📄 CodeRabbit inference engine (.cursor/rules/project-structure.mdc)
Main process code goes in src/main
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
src/**/*.{ts,tsx,vue}
📄 CodeRabbit inference engine (CLAUDE.md)
Use English for all logs and comments
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
src/main/presenter/**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Maintain one presenter per functional domain in src/main/presenter/
Files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
src/main/presenter/threadPresenter/index.ts
🧠 Learnings (12)
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : All provider implementations must parse provider-specific data chunks and yield standardized events for text, reasoning, tool calls, usage, errors, stop reasons, and image data.
Applied to files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider implementations should yield image data events in the standardized format when applicable.
Applied to files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Provider files should implement helper methods such as `formatMessages`, `convertToProviderTools`, `parseFunctionCalls`, and `prepareFunctionCallPrompt` as needed for provider-specific logic.
Applied to files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : When a provider supports native function calling, MCP tools must be converted to the provider's format (e.g., using `convertToProviderTools`) and included in the API request.
Applied to files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-08-26T14:13:46.601Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.601Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : Each LLM provider must implement provider-specific API interactions, convert MCP tools, and normalize streaming responses
Applied to files:
src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/index.ts : The main Agent loop should buffer text content, handle tool call events, format tool results for the next LLM call, and manage conversation continuation logic.
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-08-26T14:13:46.601Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-26T14:13:46.601Z
Learning: Applies to src/main/presenter/llmProviderPresenter/index.ts : Agent Loop layer must manage conversation flow, execute tools via McpPresenter, and standardize events to the frontend
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/index.ts : `src/main/presenter/llmProviderPresenter/index.ts` should manage the overall Agent loop, conversation history, tool execution via `McpPresenter`, and frontend communication via `eventBus`.
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/index.ts : The main Agent loop in `llmProviderPresenter/index.ts` should handle multi-round LLM calls and tool usage, maintaining conversation state and controlling the loop with `needContinueConversation` and `toolCallCount`.
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/index.ts : The main Agent loop should send standardized `STREAM_EVENTS` (`RESPONSE`, `END`, `ERROR`) to the frontend via `eventBus`.
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-07-21T01:46:52.880Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/llm-agent-loop.mdc:0-0
Timestamp: 2025-07-21T01:46:52.880Z
Learning: Applies to src/main/presenter/llmProviderPresenter/providers/*.ts : The `coreStream` method in each Provider must perform a single streaming API request per conversation round and must not contain multi-round tool call loop logic.
Applied to files:
src/main/presenter/threadPresenter/index.ts
📚 Learning: 2025-07-21T01:45:54.229Z
Learnt from: CR
PR: ThinkInAIXYZ/deepchat#0
File: .cursor/rules/electron-best-practices.mdc:0-0
Timestamp: 2025-07-21T01:45:54.229Z
Learning: Applies to src/renderer/src/composables/usePresenter.ts : The IPC in the renderer process is implemented in usePresenter.ts, allowing direct calls to the presenter-related interfaces exposed by the main process
Applied to files:
src/main/presenter/threadPresenter/index.ts
🧬 Code graph analysis (2)

src/main/presenter/llmProviderPresenter/providers/openAICompatibleProvider.ts (1)
src/main/presenter/index.ts (1): presenter (188-188)

src/main/presenter/threadPresenter/index.ts (3)
src/main/eventbus.ts (1): eventBus (151-151)
src/main/events.ts (1): STREAM_EVENTS (51-55)
src/renderer/src/events.ts (1): STREAM_EVENTS (38-42)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build-check (x64)
🔇 Additional comments (3)
src/main/presenter/threadPresenter/index.ts (3)
136-143: Good: buffer flush and cleanup on error/end/cancel paths

Consistent cleanup reduces dangling timers and memory. No changes requested.

Also applies to: 187-200, 274-281, 2649-2655
389-397: Thresholds: consider telemetry-driven tuning

8KB/5KB are reasonable defaults. If available, log observed content sizes and adjust thresholds per device to minimize UI jank without over-buffering.

Would you like me to add lightweight metrics (percentiles) to tune these thresholds at runtime?
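If that offer is taken up, the metrics could be as light as a bounded sample with on-demand percentiles; this sketch is an illustration of the idea, not proposed production code:

```ts
// Tiny percentile tracker for observed content sizes; keeps the most recent
// N samples and computes percentiles on demand. Coarse, but enough for tuning.
class SizeStats {
  private samples: number[] = []
  constructor(private capacity = 1000) {}

  record(size: number): void {
    this.samples.push(size)
    if (this.samples.length > this.capacity) this.samples.shift()
  }

  percentile(p: number): number {
    if (this.samples.length === 0) return 0
    const sorted = [...this.samples].sort((a, b) => a - b)
    const idx = Math.min(sorted.length - 1, Math.floor((p / 100) * sorted.length))
    return sorted[idx]
  }
}

// Usage: decide whether the 8KB/5KB thresholds match real traffic.
const stats = new SizeStats()
for (const size of [120, 480, 9000, 300, 12000]) stats.record(size)
console.log(`p50=${stats.percentile(50)} p95=${stats.percentile(95)}`) // p50=480 p95=12000
```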
57-72: No action needed: adaptiveBuffer is actively used

We’ve confirmed that the adaptiveBuffer field is referenced and utilized throughout threadPresenter/index.ts, so it should not be removed:

- Checks for state.adaptiveBuffer and calls to await this.flushAdaptiveBuffer(...) appear around lines 137–139, 275–277, and 2649–2651.
- Inside flushAdaptiveBuffer (lines 337–357), the buffer is read, processed, and cleared.
- The processLargeContentAsynchronously method (invoked at line 411 and defined at lines 420–427) also relies on buffer.isLargeContent.

Please disregard the previous suggestion to drop or re-wire adaptiveBuffer.

Likely an incorrect or invalid review comment.
add buffer for base64 image markdown tag
Summary by CodeRabbit
New Features
Bug Fixes
Refactor