reinforcement-finetuning

Here are 5 public repositories matching this topic...

ucla-mobility / AutoVLA

Official implementation of the paper "AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning"

autonomous-driving vision-language-action reinforcement-finetuning grpo

Updated Jun 18, 2025

CJReinforce / PURE

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

reinforcement-learning mathematics rl reasoning r1 o1 llm reinforcement-finetuning

Updated Jul 17, 2025
Python

deepmodeling / CrystalFormer

Space Group Informed Transformer for Crystalline Materials Generation

crystal transformer generative-model autoregressive reinforcement-finetuning

Updated Jul 19, 2025
Jupyter Notebook

jwliao-ai / MARFT

MARFT stands for Multi-Agent Reinforcement Fine-Tuning. This repository implements an LLM-based multi-agent reinforcement fine-tuning framework for general agentic tasks, providing a foundational MARFT framework.

multiagent multi-agent-systems multi-agent-reinforcement-learning large-language-models reinforcement-finetuning multi-agent-reinforcement-fine-tuning

Updated Jul 14, 2025
Python

vishivishvish / dlai-rft-llms-grpo

Reinforcement Fine-tuning LLMs with GRPO | Deeplearning.ai

reinforcement-learning llm reinforcement-finetuning grpo

Updated Jun 6, 2025
