The large language model (LLM) should produce its output in the following format:
Reasoning process + Search planning result (DAG) + [Search results] + Final generated result
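The search planning step yields a DAG, so the order in which sub-queries can be executed follows from a topological sort. The sketch below is only an illustration of that idea, not this project's actual plan format: the `PLAN` dictionary, the node names `q1` to `q3`, and the `execution_levels` helper are all hypothetical. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical search plan (assumption, not the project's real schema):
# each key is a sub-query, each value is the set of sub-queries it depends on.
PLAN = {
    "q1": set(),
    "q2": set(),
    "q3": {"q1", "q2"},  # q3 needs the results of q1 and q2
}


def execution_levels(plan):
    """Group sub-queries into levels; queries in one level have no unmet dependencies."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()  # raises graphlib.CycleError if the plan is not a DAG
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        levels.append(ready)
        sorter.done(*ready)  # mark this level as finished so dependents become ready
    return levels


if __name__ == "__main__":
    for depth, queries in enumerate(execution_levels(PLAN)):
        print(depth, queries)
```

Queries within one level do not depend on each other, so in principle they could be searched in parallel before the next level starts.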
- PyTorch: Install PyTorch version 2.6.
- trl: Install from GitHub:
pip install git+https://github.com/huggingface/trl.git
- lagent: Install in editable mode from the lagent directory:
cd lagent
pip install -e .
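After installing, it can help to confirm that the three packages are visible to the interpreter. The following is a minimal sketch using the standard-library `importlib.metadata`; the `REQUIRED` table and the helper names `matches` and `report` are hypothetical, and the distribution names `torch`, `trl`, and `lagent` are assumed to be the names the packages register under.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed distribution names; only PyTorch has a version requirement (2.6) in the list above.
REQUIRED = {"torch": "2.6", "trl": None, "lagent": None}


def matches(installed, required_prefix):
    """Return True if no version is required or `installed` is within the required release series."""
    if required_prefix is None:
        return True
    return installed == required_prefix or installed.startswith(required_prefix + ".")


def report(required=REQUIRED):
    """Map each package name to 'ok', 'missing', or a version-mismatch message."""
    status = {}
    for name, prefix in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            status[name] = "missing"
            continue
        if matches(installed, prefix):
            status[name] = "ok"
        else:
            status[name] = f"version mismatch (found {installed})"
    return status


if __name__ == "__main__":
    for name, state in report().items():
        print(f"{name}: {state}")
```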
Data clustering uses data/clusters.py, and dataset generation and filtering use data/qa_gen.py.
To train the model, run the following script:
sh train.sh