@chenhaiq (Collaborator) commented Jul 3, 2025

What does this PR do?

Provides rollout generation and tool-call details in wandb Weave to help debug agentic RL.

2 new interfaces:

  1. rollout_trace_attr context manager: marks the sample_index, step, rollout_n, and experiment name of a trajectory.
  2. rollout_trace_op decorator: marks a method to be traced. It must be a method of an instance.

related issue #2188

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: ...
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

[Screenshot 2025-07-03 4:09:58 PM] [Screenshot 2025-07-03 4:11:27 PM]

API and Usage Example

options:
+trainer.rollout_trace.backend=weave: only wandb Weave is supported in this PR. Support for other tracing tools is left to the community.
+trainer.rollout_trace.token2text=False: whether to append the decoded text to the result of the run method.
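
As a rough illustration of what a token2text switch could do, the sketch below adds decoded text to a result only when the flag is on. The function name `finalize_output`, the result field names, and the toy vocabulary are assumptions for this example and are not taken from the PR.

```python
def finalize_output(prompt_ids, response_ids, decode, token2text=False):
    """Build a result dict; add decoded text only when token2text is enabled."""
    result = {"prompt_ids": prompt_ids, "response_ids": response_ids}
    if token2text:
        # Decoding is extra work per trajectory, so it is skipped unless requested.
        result["prompt_text"] = decode(prompt_ids)
        result["response_text"] = decode(response_ids)
    return result


# Toy vocabulary standing in for a real tokenizer's decode method.
VOCAB = {1: "2+3", 2: "=", 3: "5"}


def toy_decode(ids):
    return " ".join(VOCAB[i] for i in ids)


plain = finalize_output([1, 2], [3], toy_decode)
with_text = finalize_output([1, 2], [3], toy_decode, token2text=True)
```

With the flag off, the trace would carry only token IDs; with it on, the same entry also carries human-readable text.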

High-Level Design

n/a

Specific Changes

Only works for async rollout from agent loop. No effect for sync rollout.

Checklist Before Submitting

Important

Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

@zhaochenyang20 (Collaborator)

@chenhaiq Haiquan, could you help check whether this is also supported in SGLang? I am also checking with our team. Thanks!

@chenhaiq (Collaborator, Author) commented Jul 4, 2025

> @chenhaiq Haiquan, could you help check whether this is also supported in SGLang? I am also checking with our team. Thanks!

Sure, I will add SGLang support.

@chenhaiq (Collaborator, Author) commented Jul 7, 2025

> @chenhaiq Haiquan, could you help check whether this is also supported in SGLang? I am also checking with our team. Thanks!

It now supports SGLang async mode. Please review the code.

@chenhaiq chenhaiq requested a review from eric-haibin-lin July 7, 2025 06:20
    rollout_n += 1
else:
    rollout_n = 0
trajectory_info.append({"step": step, "sample_index": index[i], "rollout_n": rollout_n})
Collaborator

What is the meaning of rollout_n?

Collaborator Author

rollout_n associates a trace entry with actor_rollout_ref.rollout.n.
For example, when actor_rollout_ref.rollout.n=2, the _run_agent_loop method is executed twice.
Each of the two _run_agent_loop calls has its own trace entry, and rollout_n indicates which of the calls the entry was traced from.
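
The counting described here can be sketched as a standalone function. The snippet under review does not show its `if` condition, so the condition below (the same sample index as the previous entry means another rollout of that sample) is an assumption, as are the function name `build_trajectory_info` and the premise that repeated samples are adjacent in `index`.

```python
def build_trajectory_info(step, index):
    """Assign a rollout_n counter to each entry, assuming repeats of a sample are adjacent."""
    trajectory_info = []
    rollout_n = 0
    for i in range(len(index)):
        # Assumed condition: an unchanged sample index means another rollout of that sample.
        if i > 0 and index[i - 1] == index[i]:
            rollout_n += 1
        else:
            rollout_n = 0
        trajectory_info.append({"step": step, "sample_index": index[i], "rollout_n": rollout_n})
    return trajectory_info


# Two samples (7 and 8), each rolled out twice, as with actor_rollout_ref.rollout.n=2.
info = build_trajectory_info(step=3, index=[7, 7, 8, 8])
```

Under these assumptions, the two entries for each sample receive rollout_n values 0 and 1, which is what lets a trace entry be matched to a specific rollout.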

yield


def rollout_trace_op(func):
Collaborator

It seems this decorator can trace any function, including tool calls, agent loop runs, ...?

Collaborator Author

It can trace any method of a class instance.

@wuxibin89 wuxibin89 merged commit ec4433c into volcengine:main Jul 8, 2025
54 of 57 checks passed
.. code:: bash

    +trainer.rollout_trace.token2text=True # defaults to False for better performance
Collaborator

@chenhaiq could you add the visualization results from this PR to the documentation page as well? Please upload the images to https://github.com/eric-haibin-lin/verl-community/tree/main/docs

@chenhaiq (Collaborator, Author) commented Jul 9, 2025

Corresponding feature in MLflow:

[Screenshot 2025-07-09 6:19:58 PM] [Screenshot 2025-07-09 6:11:27 PM]

lkc233 pushed a commit to lkc233/verl that referenced this pull request Jul 10, 2025
…cengine#2345)

### What does this PR do?

Provides rollout generation and tool-call details in wandb Weave to help
debug agentic RL.

2 new interfaces:
1. rollout_trace_attr context manager: marks the
sample_index, step, rollout_n, and experiment name of a trajectory.
2. rollout_trace_op decorator: marks a method to be traced. It must be a
method of an instance.


related issue volcengine#2188

### Checklist Before Starting

- [X] Search for similar PRs. Paste at least one query link here: ...
- [X] Format the PR title as `[{modules}] {type}: {description}` (This
  will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `sglang`, `vllm`, `rollout`,
    `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`,
    `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`,
    `env`, `tool`, `ckpt`, `doc`, `data`
  - If this PR involves multiple modules, separate them with `,` like
    `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature,
    etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

<img width="1910" alt="Screenshot 2025-07-03 4:09:58 PM"
src="https://github.com/user-attachments/assets/ff30bbca-f9c8-434f-a3c2-0e333d16fa68"
/>


<img width="1895" alt="Screenshot 2025-07-03 4:11:27 PM"
src="https://github.com/user-attachments/assets/0b9ed8db-58a7-4769-88fb-bda204dc9fc8"
/>



### API and Usage Example
options:
+trainer.rollout_trace.backend=weave: only wandb Weave is supported in
this PR. Support for other tracing tools is left to the community.
+trainer.rollout_trace.token2text=False: whether to append the decoded text
to the result of the run method.

### High-Level Design

n/a

### Specific Changes

Only works for async rollout from agent loop. No effect for sync
rollout.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review,
otherwise the reviewer might deprioritize this PR for review.

- [x] Read the [Contribute
Guide](https://github.com/volcengine/verl?tab=readme-ov-file#contribution-guide).
- [x] Apply [pre-commit
checks](https://github.com/volcengine/verl?tab=readme-ov-file#code-linting-and-formatting):
`pre-commit install && pre-commit run --all-files --show-diff-on-failure
--color=always`
- [x] Add / Update [the
documentation](https://github.com/volcengine/verl/tree/main/docs).
- [x] Add unit or end-to-end test(s) to [the CI
workflow](https://github.com/volcengine/verl/tree/main/.github/workflows)
to cover all the code. If not feasible, explain why: ...
- [x] Once your PR is ready for CI, send a message in [the `ci-request`
channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the
`verl` Slack
workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ).
ArronHZG pushed a commit to imh966/verl that referenced this pull request Jul 10, 2025
…cengine#2345)

oseyosey pushed a commit to oseyosey/verl that referenced this pull request Jul 28, 2025
…cengine#2345)

Juniper1021 pushed a commit to Juniper1021/verl that referenced this pull request Aug 7, 2025
…cengine#2345)

whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
…cengine#2345)

chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
…cengine#2345)
