Solving the LunarLander-v3 Environment: A 60-hour approach for the final term report in Foundation of Artificial Intelligence, VNU-UET (May 2025).
Updated Jun 8, 2025 - Jupyter Notebook
Open-source implementation and adaptation of DeepSeek's GRPO, applied to classic reinforcement learning control problems, with an example on LunarLander-v3.
Implementation of the PPO algorithm (https://arxiv.org/pdf/1707.06347).
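
For readers comparing the PPO and GRPO entries above, the following is a minimal sketch of the two ideas they build on: the clipped surrogate objective from the PPO paper and group-relative advantage normalization as used in GRPO. It is not taken from any of the listed repositories. The function names `clipped_surrogate` and `group_relative_advantages`, the default `clip_eps=0.2`, and the use of whole-episode returns as the group signal are all assumptions made for illustration; GRPO's KL penalty term is omitted.

```python
import math
from statistics import mean, pstdev


def clipped_surrogate(new_logps, old_logps, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logps / old_logps: log-probabilities of the taken actions under the
    current and the data-collecting policy. Names are hypothetical.
    """
    terms = []
    for new_lp, old_lp, adv in zip(new_logps, old_logps, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the unclipped and clipped terms.
        terms.append(min(ratio * adv, clipped * adv))
    return mean(terms)


def group_relative_advantages(returns, eps=1e-8):
    """Normalize a group of returns to zero mean and unit scale.

    Assumption: each entry is the total return of one rollout in a group
    of rollouts collected under the same policy.
    """
    mu = mean(returns)
    sigma = pstdev(returns)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in returns]
```

In this sketch, a GRPO-style update would feed the output of `group_relative_advantages` into `clipped_surrogate` in place of advantages estimated by a learned value function, which is what a standard PPO implementation would typically use.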