Fix bug where gemini3-series models abnormally output their chain of thought in cli2api #4230
base: master
Conversation
Hey - I've found 3 issues, and left some high level feedback:
- The reasoning extraction logic is now implemented both in `_split_chunk_content` and `_process_content_parts`; consider reusing `ChunkView.reasoning_text` (or a shared helper) in `_process_content_parts` to avoid divergence between streaming and non-streaming paths.
- In `_query_stream`, chunks with `chunk_view.parts is None` are skipped entirely even though `_split_chunk_content` can still derive `visible_text` from `chunk.text`; consider basing the early-return check on `visible_text`/`has_function_call` instead of `parts is None` to avoid dropping valid text-only chunks (a minimal sketch follows this list).
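A minimal sketch of that suggested check, assuming the `visible_text` and `has_function_call` values derived by this PR's `_split_chunk_content`; the helper name `should_skip_chunk` is hypothetical, not code from the PR:
```python
def should_skip_chunk(visible_text: str, has_function_call: bool) -> bool:
    """Return True when a streaming chunk carries nothing user-visible.

    Hypothetical sketch: base the early return on derived content rather
    than on `parts is None`, so valid text-only chunks are not dropped.
    """
    return not visible_text and not has_function_call
```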
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The reasoning extraction logic is now implemented both in `_split_chunk_content` and `_process_content_parts`; consider reusing `ChunkView.reasoning_text` (or a shared helper) in `_process_content_parts` to avoid divergence between streaming and non‑streaming paths.
- In `_query_stream`, chunks with `chunk_view.parts is None` are skipped entirely even though `_split_chunk_content` can still derive `visible_text` from `chunk.text`; consider basing the early‑return check on `visible_text`/`has_function_call` instead of `parts is None` to avoid dropping valid text-only chunks.
## Individual Comments
### Comment 1
<location> `astrbot/core/provider/sources/gemini_source.py:389-398` </location>
<code_context>
+ def _split_chunk_content(
</code_context>
<issue_to_address>
**suggestion:** Reuse ChunkView in more places to avoid duplicating reasoning/text extraction logic.
`_split_chunk_content` already centralizes the logic for separating `reasoning_text` and `visible_text`, including the defensive handling of `thought`, `text`, and `function_call`. In `_process_content_parts` you reimplement part of this against `result_parts`. If `_process_content_parts` accepted a `ChunkView` (or used a small helper that takes `parts` and returns `(reasoning_text, visible_text)`), this logic could live in one place and keep streaming and non-streaming behavior consistent.
</issue_to_address>
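A minimal sketch of such a shared helper, assuming the `google.genai` types used in the snippets below; the name `split_parts` is hypothetical:
```python
from google.genai import types  # assumed import, matching this review's snippets


def split_parts(parts: list[types.Part]) -> tuple[str, str]:
    """Split parts into (reasoning_text, visible_text).

    Hypothetical helper: thought parts feed the reasoning text, the
    remaining text parts feed the user-visible text.
    """
    reasoning = "".join(
        (p.text or "") for p in parts if getattr(p, "thought", False)
    )
    visible = "".join(
        (p.text or "")
        for p in parts
        if not getattr(p, "thought", False) and getattr(p, "text", None)
    )
    return reasoning.strip(), visible.strip()
```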
### Comment 2
<location> `astrbot/core/provider/sources/gemini_source.py:386-395` </location>
<code_context>
+ finish_reason=None,
+ )
- thought_buf: list[str] = [
- (p.text or "") for p in candidate.content.parts if p.thought
- ]
</code_context>
<issue_to_address>
**suggestion:** Extract reasoning text using a shared helper instead of inlining list comprehension logic here.
This reasoning extraction duplicates the logic in `_split_chunk_content` (collecting `p.text` when `thought` is true). To keep behavior consistent and easier to maintain, either reuse `_split_chunk_content`/`ChunkView` here or introduce a helper like `_extract_reasoning_from_parts(parts)` and call it both in the streaming path and `_process_content_parts`.
</issue_to_address>
### Comment 3
<location> `astrbot/core/provider/sources/gemini_source.py:36` </location>
<code_context>
logging.getLogger("google_genai.types").addFilter(SuppressNonTextPartsWarning())
+@dataclass
+class ChunkView:
+ """流式响应 chunk 的结构化视图对象
</code_context>
<issue_to_address>
**issue (complexity):** Consider narrowing the new chunk helper to just structural data and centralizing reasoning extraction to reduce optional state and keep the text/thought logic local to where it’s used.
You can keep all the new behavior while flattening the flow and removing most of the optional fields by:
1. Narrowing the “view” helper to pure structure (no reasoning/visible text).
2. Reusing a single “reasoning extraction” helper in both places.
3. Keeping the text/thought split in `_query_stream`, close to where it is used.
### 1. Replace `ChunkView` with a narrower structural helper
Instead of a broad `ChunkView` carrying multiple optional fields and mixing concerns, keep a small helper that only does minimal defensive checks and structure extraction:
```python
@dataclass
class ChunkBasics:
    candidate: types.Candidate
    parts: list[types.Part]
    finish_reason: types.FinishReason | None
    has_function_call: bool


def _get_chunk_basics(
    self, chunk: types.GenerateContentResponse
) -> ChunkBasics | None:
    if not chunk.candidates:
        logger.warning(f"收到的 chunk 中 candidates 为空: {chunk}")
        return None
    candidate = chunk.candidates[0]
    content = candidate.content
    if not content or not content.parts:
        logger.warning(f"收到的 chunk 中 content 为空: {chunk}")
        return None
    parts = content.parts
    has_function_call = any(
        part.function_call for part in parts if hasattr(part, "function_call")
    )
    return ChunkBasics(
        candidate=candidate,
        parts=parts,
        finish_reason=getattr(candidate, "finish_reason", None),
        has_function_call=has_function_call,
    )
```
This drops `reasoning_text`, `visible_text` and most `getattr` usage from the “view” layer and keeps the number of possible states small (`None` vs a fully usable object).
### 2. Centralize reasoning extraction on parts
You currently compute reasoning twice with slightly different logic. You can centralize this into a helper that both `_query_stream` and `_process_content_parts` use:
```python
def _extract_reasoning_from_parts(self, parts: list[types.Part]) -> str:
    thought_buf: list[str] = [
        (p.text or "") for p in parts if getattr(p, "thought", False)
    ]
    return "".join(thought_buf).strip()
```
Then in `_process_content_parts`:
```python
reasoning = self._extract_reasoning_from_parts(result_parts)
if reasoning:
    llm_response.reasoning_content = reasoning
for part in result_parts:
    if getattr(part, "thought", False):
        continue
    ...
```
### 3. Keep reasoning/visible text split local in `_query_stream`
With `ChunkBasics` and `_extract_reasoning_from_parts`, `_query_stream` can be simplified and made more explicit, without `ChunkView.reasoning_text` / `.visible_text`:
```python
async for chunk in result:
    llm_response = LLMResponse("assistant", is_chunk=True)
    basics = self._get_chunk_basics(chunk)
    if basics is None:
        continue

    if basics.has_function_call:
        llm_response = LLMResponse("assistant", is_chunk=False)
        llm_response.raw_completion = chunk
        llm_response.result_chain = self._process_content_parts(
            basics.candidate, llm_response
        )
        llm_response.id = chunk.response_id
        if chunk.usage_metadata:
            llm_response.usage = self._extract_usage(chunk.usage_metadata)
        yield llm_response
        return

    has_content = False

    # reasoning
    reasoning = self._extract_reasoning_from_parts(basics.parts)
    if reasoning:
        has_content = True
        accumulated_reasoning += reasoning
        llm_response.reasoning_content = reasoning

    # visible text (text from non-thought parts)
    visible_text_parts = [
        p.text or ""
        for p in basics.parts
        if not getattr(p, "thought", False) and getattr(p, "text", None)
    ]
    visible_text = "".join(visible_text_parts).strip()
    if not visible_text and getattr(chunk, "text", None):
        visible_text = chunk.text  # keep the fallback to chunk.text

    if visible_text:
        has_content = True
        accumulated_text += visible_text
        llm_response.result_chain = MessageChain(
            chain=[Comp.Plain(visible_text)]
        )

    if has_content:
        yield llm_response

    if basics.finish_reason:
        if basics.parts:
            final_response = LLMResponse("assistant", is_chunk=False)
            final_response.raw_completion = chunk
            final_response.result_chain = self._process_content_parts(
                basics.candidate, final_response
            )
        ...
        break
```
This keeps:
- The same function-call detection behavior.
- The reasoning/visible-text separation (including `chunk.text` fallback).
- A single source of truth for “thought vs visible” semantics.
But it:
- Removes the over-general `ChunkView` (less optional state to track).
- Localizes presentation decisions to `_query_stream` / `_process_content_parts`.
- Reduces `getattr` noise and brings code closer to the underlying Gemini types.
</issue_to_address>
I think the content with thought=true could be put into reasoning_content on the llm_response?
Yes, that is already how it is implemented.
My understanding is that it should be enough to simply skip, inside the for loop of _process_content_parts, any part whose part.text is non-empty and whose part.thought is true; it doesn't seem necessary to make it this complicated.
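A minimal sketch of that simpler inline skip, assuming the `result_parts` loop shape shown in the review above; the helper name `filter_visible_parts` is hypothetical:
```python
def filter_visible_parts(result_parts: list) -> list:
    """Drop internal-reasoning parts (thought=True with non-empty text).

    Hypothetical sketch of the simpler approach: filter these parts before
    (or inside) the existing per-part loop in _process_content_parts.
    """
    return [
        part
        for part in result_parts
        if not (getattr(part, "thought", False) and getattr(part, "text", None))
    ]
```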
Modifications
Modified the `_query_stream` method in `gemini_source.py` to iterate over `chunk.candidates[0].content.parts` and filter out parts with `thought=True`, instead of using `chunk.text` directly (sketched below).
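A minimal standalone sketch of that change, assuming the `google.genai` streaming types; the function name `visible_text_from_chunk` is hypothetical, while the real change lives inside `_query_stream`:
```python
from google.genai import types  # assumed import, matching the review snippets


def visible_text_from_chunk(chunk: types.GenerateContentResponse) -> str:
    """Build streamed text from non-thought parts instead of chunk.text.

    Hypothetical sketch: per this PR, chunk.text also carried thought
    parts, which is what leaked the chain of thought to users.
    """
    if not chunk.candidates or not chunk.candidates[0].content:
        return ""
    parts = chunk.candidates[0].content.parts or []
    return "".join(
        (p.text or "") for p in parts if not getattr(p, "thought", False)
    )
```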
Screenshots or Test Results
Before the fix: (screenshots: PixPin_2025-12-27_22-15-51, PixPin_2025-12-27_22-14-00)
After the fix:
Checklist
- I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in `requirements.txt` and `pyproject.toml`.
Summary by Sourcery
Handle Gemini streaming responses by separating internal reasoning content from user-visible output in cli2api.
Bug Fixes:
- Thought/reasoning content was leaking into cli2api's streaming user-visible output.
Enhancements:
Summary by Sourcery
Prevent Gemini streaming responses from exposing internal reasoning content in cli2api outputs while preserving normal user-visible text.
Bug Fixes:
Enhancements:
- Filter thought parts when building the streamed output instead of relying on `chunk.text` directly.
Summary by Sourcery
Handle Gemini streaming responses in cli2api to avoid leaking internal reasoning content while preserving user-visible text.
Bug Fixes:
Enhancements:
- Introduce a ChunkView helper and a split_chunk_content utility that defensively parse Gemini streaming chunks, separating reasoning text from visible text, with fallback behavior when the relevant parts are missing.