
Conversation

@muyouzhi6 commented Dec 27, 2025

Fix the bug where cli2api abnormally outputs chain-of-thought for Gemini 3 series models

Modifications

Modified the `_query_stream` method in `gemini_source.py` to iterate over `chunk.candidates[0].content.parts` and filter out the parts with `thought=True`, instead of using `chunk.text` directly (a minimal sketch follows below).
  • This is NOT a breaking change.
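
A minimal sketch of the filtering described above, assuming the google-genai streaming types; `chunk` stands for one streamed `GenerateContentResponse`, and the buffer names are illustrative rather than taken verbatim from the patch:

```python
visible_buf: list[str] = []
reasoning_buf: list[str] = []
for part in chunk.candidates[0].content.parts:
    if getattr(part, "thought", False):
        # Chain-of-thought part: keep it out of the user-visible stream.
        reasoning_buf.append(part.text or "")
    elif getattr(part, "text", None):
        visible_buf.append(part.text)

visible_text = "".join(visible_buf)       # streamed to the user
reasoning_text = "".join(reasoning_buf)   # surfaced only as reasoning_content
```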

Screenshots or Test Results

Before:
PixPin_2025-12-27_22-15-51 (screenshot)
After:
PixPin_2025-12-27_22-14-00 (screenshot)


Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues/emails, etc.
  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Handle Gemini streaming responses by separating internal reasoning content from user-visible output in cli2api.

Bug Fixes:

  • Prevent Gemini 3 series models from leaking thought/reasoning content into the streamed user-visible output in cli2api.

Enhancements:

  • Normalize processing of Gemini content parts in streaming responses by aggregating reasoning and final answer text separately.

Summary by Sourcery

Prevent Gemini streaming responses from exposing internal reasoning content in cli2api outputs while preserving normal user-visible text.

Bug Fixes:

  • Ignore Gemini content parts marked as thought in both content processing and streaming responses to avoid leaking chain-of-thought into user-visible output.

Enhancements:

  • Improve Gemini streaming handling by separately aggregating reasoning and final answer text from content parts, with a fallback to chunk.text when parts are absent.

Summary by Sourcery

Handle Gemini streaming responses in cli2api to avoid leaking internal reasoning content while preserving user-visible text.

Bug Fixes:

  • Prevent Gemini 3 series streaming responses from including chain-of-thought reasoning content in user-visible output by ignoring parts marked as thought in both streaming and final responses.

Enhancements:

  • Introduce a structured ChunkView helper and split_chunk_content utility to defensively parse Gemini streaming chunks and separate reasoning text from visible text, with fallback behavior when parts are absent.

@auto-assign auto-assign bot requested review from Fridemn and anka-afk December 27, 2025 16:59
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) labels Dec 27, 2025
Contributor
@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 3 issues, and left some high-level feedback:

  • The reasoning extraction logic is now implemented both in _split_chunk_content and _process_content_parts; consider reusing ChunkView.reasoning_text (or a shared helper) in _process_content_parts to avoid divergence between streaming and non‑streaming paths.
  • In _query_stream, chunks with chunk_view.parts is None are skipped entirely even though _split_chunk_content can still derive visible_text from chunk.text; consider basing the early‑return check on visible_text/has_function_call instead of parts is None to avoid dropping valid text-only chunks.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The reasoning extraction logic is now implemented both in `_split_chunk_content` and `_process_content_parts`; consider reusing `ChunkView.reasoning_text` (or a shared helper) in `_process_content_parts` to avoid divergence between streaming and non‑streaming paths.
- In `_query_stream`, chunks with `chunk_view.parts is None` are skipped entirely even though `_split_chunk_content` can still derive `visible_text` from `chunk.text`; consider basing the early‑return check on `visible_text`/`has_function_call` instead of `parts is None` to avoid dropping valid text-only chunks.

## Individual Comments

### Comment 1
<location> `astrbot/core/provider/sources/gemini_source.py:389-398` </location>
<code_context>
+    def _split_chunk_content(
</code_context>

<issue_to_address>
**suggestion:** Reuse ChunkView in more places to avoid duplicating reasoning/text extraction logic.

`_split_chunk_content` already centralizes the logic for separating `reasoning_text` and `visible_text`, including the defensive handling of `thought`, `text`, and `function_call`. In `_process_content_parts` you reimplement part of this against `result_parts`. If `_process_content_parts` accepted a `ChunkView` (or used a small helper that takes `parts` and returns `(reasoning_text, visible_text)`), this logic could live in one place and keep streaming and non-streaming behavior consistent.
</issue_to_address>

### Comment 2
<location> `astrbot/core/provider/sources/gemini_source.py:386-395` </location>
<code_context>
+                finish_reason=None,
+            )

-        thought_buf: list[str] = [
-            (p.text or "") for p in candidate.content.parts if p.thought
-        ]
</code_context>

<issue_to_address>
**suggestion:** Extract reasoning text using a shared helper instead of inlining list comprehension logic here.

This reasoning extraction duplicates the logic in `_split_chunk_content` (collecting `p.text` when `thought` is true). To keep behavior consistent and easier to maintain, either reuse `_split_chunk_content`/`ChunkView` here or introduce a helper like `_extract_reasoning_from_parts(parts)` and call it both in the streaming path and `_process_content_parts`.
</issue_to_address>

### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:36` </location>
<code_context>
 logging.getLogger("google_genai.types").addFilter(SuppressNonTextPartsWarning())


+@dataclass
+class ChunkView:
+    """流式响应 chunk 的结构化视图对象
</code_context>

<issue_to_address>
**issue (complexity):** Consider narrowing the new chunk helper to just structural data and centralizing reasoning extraction to reduce optional state and keep the text/thought logic local to where it’s used.

You can keep all the new behavior while flattening the flow and removing most of the optional fields by:

1. Narrowing the “view” helper to pure structure (no reasoning/visible text).
2. Reusing a single “reasoning extraction” helper in both places.
3. Keeping the text/thought split in `_query_stream`, close to where it is used.

### 1. Replace `ChunkView` with a narrower structural helper

Instead of a broad `ChunkView` carrying multiple optional fields and mixing concerns, keep a small helper that only does minimal defensive checks and structure extraction:

```python
@dataclass
class ChunkBasics:
    candidate: types.Candidate
    parts: list[types.Part]
    finish_reason: types.FinishReason | None
    has_function_call: bool


def _get_chunk_basics(
    self, chunk: types.GenerateContentResponse
) -> ChunkBasics | None:
    if not chunk.candidates:
        logger.warning(f"收到的 chunk 中 candidates 为空: {chunk}")
        return None

    candidate = chunk.candidates[0]
    content = candidate.content
    if not content or not content.parts:
        logger.warning(f"收到的 chunk 中 content 为空: {chunk}")
        return None

    parts = content.parts
    has_function_call = any(
        part.function_call for part in parts if hasattr(part, "function_call")
    )

    return ChunkBasics(
        candidate=candidate,
        parts=parts,
        finish_reason=getattr(candidate, "finish_reason", None),
        has_function_call=has_function_call,
    )
```

This drops `reasoning_text`, `visible_text` and most `getattr` usage from the “view” layer and keeps the number of possible states small (`None` vs a fully usable object).

### 2. Centralize reasoning extraction on parts

You currently compute reasoning twice with slightly different logic. You can centralize this into a helper that both `_query_stream` and `_process_content_parts` use:

```python
def _extract_reasoning_from_parts(self, parts: list[types.Part]) -> str:
    thought_buf: list[str] = [
        (p.text or "") for p in parts if getattr(p, "thought", False)
    ]
    return "".join(thought_buf).strip()
```

Then in `_process_content_parts`:

```python
reasoning = self._extract_reasoning_from_parts(result_parts)
if reasoning:
    llm_response.reasoning_content = reasoning

for part in result_parts:
    if getattr(part, "thought", False):
        continue
    ...
```

### 3. Keep reasoning/visible text split local in `_query_stream`

With `ChunkBasics` and `_extract_reasoning_from_parts`, `_query_stream` can be simplified and made more explicit, without `ChunkView.reasoning_text` / `.visible_text`:

```python
async for chunk in result:
    llm_response = LLMResponse("assistant", is_chunk=True)

    basics = self._get_chunk_basics(chunk)
    if basics is None:
        continue

    if basics.has_function_call:
        llm_response = LLMResponse("assistant", is_chunk=False)
        llm_response.raw_completion = chunk
        llm_response.result_chain = self._process_content_parts(
            basics.candidate, llm_response
        )
        llm_response.id = chunk.response_id
        if chunk.usage_metadata:
            llm_response.usage = self._extract_usage(chunk.usage_metadata)
        yield llm_response
        return

    has_content = False

    # reasoning
    reasoning = self._extract_reasoning_from_parts(basics.parts)
    if reasoning:
        has_content = True
        accumulated_reasoning += reasoning
        llm_response.reasoning_content = reasoning

    # visible text (parts without thought=True)
    visible_text_parts = [
        p.text or ""
        for p in basics.parts
        if not getattr(p, "thought", False) and getattr(p, "text", None)
    ]
    visible_text = "".join(visible_text_parts).strip()
    if not visible_text and getattr(chunk, "text", None):
        visible_text = chunk.text  # keep the fallback to chunk.text

    if visible_text:
        has_content = True
        accumulated_text += visible_text
        llm_response.result_chain = MessageChain(
            chain=[Comp.Plain(visible_text)]
        )

    if has_content:
        yield llm_response

    if basics.finish_reason:
        if basics.parts:
            final_response = LLMResponse("assistant", is_chunk=False)
            final_response.raw_completion = chunk
            final_response.result_chain = self._process_content_parts(
                basics.candidate, final_response
            )
            ...
        break
```

This keeps:

- The same function-call detection behavior.
- The reasoning/visible-text separation (including `chunk.text` fallback).
- A single source of truth for “thought vs visible” semantics.

But it:

- Removes the over-general `ChunkView` (less optional state to track).
- Localizes presentation decisions to `_query_stream` / `_process_content_parts`.
- Reduces `getattr` noise and brings code closer to the underlying Gemini types.
</issue_to_address>


@Soulter (Member) commented Dec 29, 2025

I think we could put the thought=true content into reasoning_content on the llm_response?

@muyouzhi6 (Author) commented

> I think we could put the thought=true content into reasoning_content on the llm_response?

Yes, that's exactly how it's already implemented.

@Soulter (Member) commented Dec 29, 2025

My understanding is that it should be enough to simply skip, inside the for loop of _process_content_parts, any part whose part.text is non-empty and whose part.thought is true; it doesn't feel like it needs to be this complicated. A sketch of that simpler approach follows below.
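
A minimal sketch of that simpler approach, assuming the existing loop over `result_parts` in `_process_content_parts` (the rest of the loop body is elided):

```python
for part in result_parts:
    # Skip chain-of-thought parts so they never enter the user-visible chain.
    if getattr(part, "thought", False) and part.text:
        continue
    # ... existing handling of text / inline data / function-call parts ...
```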
