[misc] feat: trace rollout generation and tool calls using weave #2345
@chenhaiq Haiquan, could you help check whether this is also supported in SGLang? I am also checking with our team. Thanks!
Sure, I will add SGLang support.
It now supports SGLang async mode. Please review the code.
    rollout_n += 1
else:
    rollout_n = 0
trajectory_info.append({"step": step, "sample_index": index[i], "rollout_n": rollout_n})
What is the meaning of rollout_n?
rollout_n associates a trace entry with actor_rollout_ref.rollout.n.
For example, when actor_rollout_ref.rollout.n=2, the _run_agent_loop method is executed 2 times for each sample.
Each of the 2 _run_agent_loop calls has its own trace entry, and rollout_n tells which of those calls a given entry was traced from.
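The numbering described above can be sketched as follows. This is a minimal illustration, not the PR's code: the `if` condition is elided in the quoted diff, so the check used here (the same sample index appearing on consecutive rows) is an assumption, and `build_trajectory_info` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def build_trajectory_info(step: int, index: List[Any]) -> List[Dict[str, Any]]:
    """Assign a rollout_n to every row of a batch.

    Assumption: rows that belong to the same sample are adjacent, so a
    repeated sample index means "another rollout of the same sample".
    """
    trajectory_info = []
    rollout_n = 0
    for i in range(len(index)):
        if i > 0 and index[i - 1] == index[i]:
            rollout_n += 1  # same sample as the previous row: next rollout
        else:
            rollout_n = 0  # new sample: restart the counter
        trajectory_info.append(
            {"step": step, "sample_index": index[i], "rollout_n": rollout_n}
        )
    return trajectory_info


# Two samples, the first rolled out twice and the second three times.
info = build_trajectory_info(step=3, index=["a", "a", "b", "b", "b"])
rollout_numbers = [entry["rollout_n"] for entry in info]
```

With this numbering, the pair (sample_index, rollout_n) is unique within a step, which is what lets a trace entry be matched to one specific rollout.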
yield

def rollout_trace_op(func):
It seems like this function can trace any function, including tooling, agent loop runs, ...?
It can trace any method of a class instance.
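To make the "any method of a class instance" constraint concrete, here is a minimal sketch of a decorator of this shape. It is not verl's `rollout_trace_op`: the names `trace_op`, `TRACE_LOG`, and `CalculatorTool` are hypothetical, and an in-memory list stands in for a real tracing backend such as weave.

```python
import asyncio
import functools

TRACE_LOG = []  # hypothetical in-memory sink standing in for a tracing backend


def trace_op(func):
    """Record the inputs and output of an async instance method."""

    @functools.wraps(func)
    async def wrapper(self, *args, **kwargs):
        # The wrapper takes `self` explicitly, which is why this style of
        # decorator only fits methods bound to an instance.
        entry = {
            "op": f"{type(self).__name__}.{func.__name__}",
            "args": args,
            "kwargs": kwargs,
        }
        try:
            result = await func(self, *args, **kwargs)
            entry["output"] = result
            return result
        except Exception as exc:
            entry["error"] = repr(exc)
            raise
        finally:
            TRACE_LOG.append(entry)  # logged on success and on failure

    return wrapper


class CalculatorTool:
    @trace_op
    async def execute(self, a, b):
        return a + b


result = asyncio.run(CalculatorTool().execute(2, b=3))
```

Because the wrapper only relies on `self` and the call arguments, the same decorator could be applied to a tool's execute method or to an agent loop's run method without changes.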
.. code:: bash

   +trainer.rollout_trace.token2text=True # default to False for better performance
@chenhaiq could you add the visualization result from this PR to the documentation page as well? Please upload the image to https://github.com/eric-haibin-lin/verl-community/tree/main/docs
…cengine#2345)

### What does this PR do?

Provide rollout generation and tool call details in wandb weave to help debug agentic RL.

2 new interfaces:

1. `rollout_trace_attr` context manager: marks the sample_index, step, rollout_n, and experience name for a trajectory.
2. `rollout_trace_op` decorator: marks the method to trace. It must be a method of an instance.

Related issue: volcengine#2188

### Checklist Before Starting

- [X] Search for similar PRs. Paste at least one query link here: ...
- [X] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

<img width="1910" alt="Screenshot 2025-07-03 4:09:58 PM" src="https://github.com/user-attachments/assets/ff30bbca-f9c8-434f-a3c2-0e333d16fa68" />
<img width="1895" alt="Screenshot 2025-07-03 4:11:27 PM" src="https://github.com/user-attachments/assets/0b9ed8db-58a7-4769-88fb-bda204dc9fc8" />

### API and Usage Example

Options:

- `+trainer.rollout_trace.backend=weave`: only wandb weave is supported in this PR. Support for other tracing tools is left to the community.
- `+trainer.rollout_trace.token2text=False`: whether to append the decoded text to the result of the run method.

### High-Level Design

n/a

### Specific Changes

Only works for async rollout from agent loop. No effect for sync rollout.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [x] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [x] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).


What does this PR do?
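Provide rollout generation and tool call details in wandb weave to help debug agentic RL.
2 new interfaces:
1. rollout_trace_attr context manager: marks the sample_index, step, rollout_n, and experience name for a trajectory.
2. rollout_trace_op decorator: marks the method to trace. It must be a method of an instance.
Related issue: #2188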
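The PR describes `rollout_trace_attr` as a context manager that attaches sample_index, step, and rollout_n to a trajectory. One way such a context manager could work in async code is with `contextvars`, so that concurrently running agent loops do not overwrite each other's attributes. The sketch below is an illustration under that assumption, not the PR's implementation; `trace_attr`, `current_attrs`, and `run_agent_loop` are hypothetical names.

```python
import asyncio
import contextlib
import contextvars

# Each asyncio task gets its own copy of the context, so values set inside
# one agent loop are invisible to loops running concurrently.
_trace_attrs = contextvars.ContextVar("trace_attrs", default=None)


@contextlib.contextmanager
def trace_attr(sample_index, step, rollout_n, name="rollout_trace"):
    """Attach trajectory attributes to everything traced inside the block."""
    token = _trace_attrs.set(
        {"sample_index": sample_index, "step": step, "rollout_n": rollout_n, "name": name}
    )
    try:
        yield
    finally:
        _trace_attrs.reset(token)  # restore whatever was set before


def current_attrs():
    return _trace_attrs.get()


async def run_agent_loop(sample_index, rollout_n):
    with trace_attr(sample_index=sample_index, step=1, rollout_n=rollout_n):
        await asyncio.sleep(0)  # yield to the other loop mid-trajectory
        return current_attrs()


async def main():
    # Two rollouts of the same sample, as with actor_rollout_ref.rollout.n=2.
    return await asyncio.gather(run_agent_loop("q0", 0), run_agent_loop("q0", 1))


seen = asyncio.run(main())
```

Each loop reads back its own rollout_n even though both ran interleaved, which is the property a tracer needs to label entries correctly.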
Checklist Before Starting
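- [X] Search for similar PRs. Paste at least one query link here: ...
- [X] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`
Test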
API and Usage Example
options:
+trainer.rollout_trace.backend=weave: only wandb weave is supported in this PR. Support for other tracing tools is left to the community.
+trainer.rollout_trace.token2text=False: whether to append the decoded text to the result of the run method.
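The trade-off behind `token2text` can be illustrated with a small sketch: decoding token IDs back to text costs extra work per traced call, so it is only done when the flag is on. Everything below is hypothetical (the function `build_trace_output`, the field names `response_ids` and `response_text`, and the toy decoder) and is not taken from the PR's code.

```python
from typing import Callable, Dict, List

# Toy vocabulary standing in for a real tokenizer.
VOCAB = {0: "Hello", 1: ",", 2: " world"}


def toy_decode(token_ids: List[int]) -> str:
    return "".join(VOCAB[i] for i in token_ids)


def build_trace_output(
    token_ids: List[int],
    decode: Callable[[List[int]], str],
    token2text: bool = False,
) -> Dict[str, object]:
    """Return what would be logged for one generation call."""
    output: Dict[str, object] = {"response_ids": token_ids}
    if token2text:
        # Decoding is skipped by default to keep tracing overhead low.
        output["response_text"] = decode(token_ids)
    return output


fast = build_trace_output([0, 1, 2], toy_decode)
readable = build_trace_output([0, 1, 2], toy_decode, token2text=True)
```

With the flag off, the trace holds only token IDs; with it on, the trace is directly readable in the tracing UI at the price of one decode per call.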
High-Level Design
n/a
Specific Changes
Only works for async rollout from agent loop. No effect for sync rollout.
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
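- [x] Read the Contribute Guide.
- [x] Apply pre-commit checks: `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [x] Add / Update the documentation.
- [x] Add unit or end-to-end test(s) to the CI workflow to cover all the code. If not feasible, explain why: ...
- [x] Once your PR is ready for CI, send a message in the `ci-request` channel in the `verl` Slack workspace.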