EmbodiedGen is a generative engine for creating diverse, interactive 3D worlds composed of high-quality 3D assets (mesh & 3DGS) with plausible physics, leveraging generative AI to address the generalization challenges of embodied-intelligence research. It is composed of six key modules: Image-to-3D, Text-to-3D, Texture Generation, Articulated Object Generation, Scene Generation, and Layout Generation.
- Image-to-3D
- Text-to-3D
- Texture Generation
- 3D Scene Generation
- Articulated Object Generation
- Layout (Interactive 3D Worlds) Generation
```bash
git clone https://github.com/HorizonRobotics/EmbodiedGen.git
cd EmbodiedGen
git checkout v0.1.1
git submodule update --init --recursive --progress

conda create -n embodiedgen python=3.10.13 -y
conda activate embodiedgen
bash install.sh
```
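As a quick check that the environment was created correctly, the minimal sketch below can be run; it assumes the package is importable as `embodied_gen` (matching the repository layout) and that `install.sh` installed a CUDA-enabled PyTorch build.

```python
# Minimal sanity check for the freshly created "embodiedgen" environment.
# Assumes the package is importable as `embodied_gen` and that install.sh
# installed a CUDA-enabled PyTorch build.
import embodied_gen
import torch

print("embodied_gen location:", embodied_gen.__file__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
```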
Update the API key in the file `embodied_gen/utils/gpt_config.yaml`. A quick way to sanity-check the config is sketched after the backend list below.
You can choose between two backends for the GPT agent:
- `gpt-4o` (recommended): use this if you have access to Azure OpenAI.
- `qwen2.5-vl`: an alternative with free usage via OpenRouter; apply for a free key and update `api_key` in `embodied_gen/utils/gpt_config.yaml` (50 free requests per day).
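The exact layout of `gpt_config.yaml` may vary between releases; the sketch below only assumes that one or more `api_key` fields exist somewhere in the file, as mentioned above, and checks that they have been filled in.

```python
# Hedged sanity check: verify that gpt_config.yaml contains non-empty
# api_key entries. The field layout is an assumption; adjust the lookup
# to match the actual structure of the file in your checkout.
import yaml

with open("embodied_gen/utils/gpt_config.yaml") as f:
    cfg = yaml.safe_load(f)

def find_api_keys(node, found=None):
    """Recursively collect all values stored under an 'api_key' field."""
    found = [] if found is None else found
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "api_key":
                found.append(value)
            else:
                find_api_keys(value, found)
    elif isinstance(node, list):
        for item in node:
            find_api_keys(item, found)
    return found

keys = find_api_keys(cfg)
print("api_key entries found:", len(keys))
print("all filled in:", all(k not in (None, "", "YOUR_API_KEY") for k in keys))
```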
Generate a physically plausible 3D asset with a URDF file from a single input image, offering high-quality support for digital twin systems.
Run the image-to-3D generation service locally. Models are downloaded automatically on the first run; please be patient.
```bash
# Run in the foreground
python apps/image_to_3d.py

# Or run in the background
CUDA_VISIBLE_DEVICES=0 nohup python apps/image_to_3d.py > /dev/null 2>&1 &
```
Generate physically plausible 3D assets from image inputs via the command-line API.
```bash
img3d-cli --image_path apps/assets/example_image/sample_04.jpg apps/assets/example_image/sample_19.jpg \
    --n_retry 2 --output_root outputs/imageto3d

# See results (.urdf / mesh.obj / mesh.glb / gs.ply) in ${output_root}/sample_xx/result
```
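Once generation finishes, the outputs can be inspected programmatically. The snippet below is an illustrative sketch, not part of the EmbodiedGen API: it assumes a result directory named like `outputs/imageto3d/sample_04/result` (following the layout noted above) and that `trimesh` is available in the environment.

```python
# Illustrative post-processing: load one generated result and print basic
# statistics. Paths follow the layout noted above (${output_root}/sample_xx/result);
# `trimesh` is assumed to be installed.
from pathlib import Path
import xml.etree.ElementTree as ET

import trimesh

result_dir = Path("outputs/imageto3d/sample_04/result")  # hypothetical sample dir

# Inspect the textured mesh.
mesh = trimesh.load(result_dir / "mesh.glb", force="mesh")
print(f"vertices={len(mesh.vertices)}, faces={len(mesh.faces)}")
print("watertight:", mesh.is_watertight)

# Read physical properties (e.g., mass) from the generated URDF, if present.
urdf_files = list(result_dir.glob("*.urdf"))
if urdf_files:
    root = ET.parse(urdf_files[0]).getroot()
    mass = root.find(".//inertial/mass")
    print("URDF mass:", mass.get("value") if mass is not None else "not specified")
```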
Create 3D assets from text descriptions, covering a wide range of geometries and styles.
Deploy the text-to-3D generation service locally.
The text-to-image stage is based on the Kolors model and supports Chinese and English prompts. Models are downloaded automatically on the first run; please be patient.
```bash
python apps/text_to_3d.py
```
Text-to-image model based on SD3.5 Medium; English prompts only.
Usage requires agreeing to the model license (click accept); models are downloaded automatically. (Note: models with more permissive licenses can be found in `embodied_gen/models/image_comm_model.py`.)
For large-scale 3D asset generation, set `--n_pipe_retry=2` to ensure high end-to-end asset usability through automatic quality checks and retries. For more diverse results, do not set `--seed_img`.
```bash
text3d-cli --prompts "small bronze figurine of a lion" "A globe with wooden base" "wooden table with embroidery" \
    --n_image_retry 2 --n_asset_retry 2 --n_pipe_retry 1 --seed_img 0 \
    --output_root outputs/textto3d
```
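For larger batches it can be convenient to drive `text3d-cli` from a small wrapper script. The sketch below is a hypothetical example (not part of EmbodiedGen) that reads prompts from a `prompts.txt` file and reuses the flags shown above.

```python
# Hypothetical batch driver around text3d-cli: read prompts from a text file
# (one prompt per line) and invoke the CLI in chunks. Flags mirror the example
# command above; adjust retries for your usability/throughput trade-off.
import subprocess
from pathlib import Path

prompts = [p.strip() for p in Path("prompts.txt").read_text().splitlines() if p.strip()]
chunk_size = 8  # prompts per CLI invocation

for i in range(0, len(prompts), chunk_size):
    chunk = prompts[i : i + chunk_size]
    cmd = [
        "text3d-cli",
        "--prompts", *chunk,
        "--n_image_retry", "2",
        "--n_asset_retry", "2",
        "--n_pipe_retry", "2",  # higher retry count for large-scale runs, as noted above
        "--output_root", f"outputs/textto3d_batch/chunk_{i // chunk_size:03d}",
    ]
    subprocess.run(cmd, check=True)
```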
Text-to-image model based on the Kolors model.
```bash
bash embodied_gen/scripts/textto3d.sh \
    --prompts "small bronze figurine of a lion" "A globe with wooden base and latitude and longitude lines" "An orange electric hand drill with finely polished details" \
    --output_root outputs/textto3d_k
```
Generate visually rich textures for 3D meshes.
Run the texture generation service locally. Models are downloaded automatically on the first run; see `download_kolors_weights` and `geo_cond_mv`.
```bash
python apps/texture_edit.py
```
Supports Chinese and English prompts.
```bash
bash embodied_gen/scripts/texture_gen.sh \
    --mesh_path "apps/assets/example_texture/meshes/robot_text.obj" \
    --prompt "A realistic-style robot with big eyes holding a sign that reads 'Hello'" \
    --output_root "outputs/texture_gen/robot_text"
```
```bash
bash embodied_gen/scripts/texture_gen.sh \
    --mesh_path "apps/assets/example_texture/meshes/horse.obj" \
    --prompt "A gray horse head with flying mane and brown eyes" \
    --output_root "outputs/texture_gen/gray_horse"
```
Coming soon: Articulated Object Generation, 3D Scene Generation, and Layout (Interactive 3D Worlds) Generation.
```bash
pip install .[dev] && pre-commit install
python -m pytest  # Passing all unit tests is required.
```
If you use EmbodiedGen in your research or projects, please cite:
```bibtex
@misc{wang2025embodiedgengenerative3dworld,
      title={EmbodiedGen: Towards a Generative 3D World Engine for Embodied Intelligence},
      author={Xinjie Wang and Liu Liu and Yu Cao and Ruiqi Wu and Wenkang Qin and Dehui Wang and Wei Sui and Zhizhong Su},
      year={2025},
      eprint={2506.10600},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2506.10600},
}
```
EmbodiedGen builds upon the following amazing projects and models: Trellis | Hunyuan-Delight | Segment Anything | Rembg | RMBG-1.4 | Stable Diffusion x4 | Real-ESRGAN | Kolors | ChatGLM3 | Aesthetic Score | Pano2Room | Diffusion360 | Kaolin | diffusers | gsplat | QWEN-2.5VL | GPT4o | SD3.5
This project is licensed under the Apache License 2.0. See the LICENSE file for details.