This project provides Python scripts to systematically map and analyze Hugging Face model ecosystems by building complete family trees of models and their fine-tuned derivatives.
Using the Hugging Face Model Hub, we recursively discover parent-child relationships between models, collect comprehensive metadata, and export the data in structured formats for further analysis.
This repo links to the resulting datasets and hosts the scripts used for our analyses. The datasets enable researchers to trace model lineages, measure the influence of base models, and understand propagation patterns within the open-source AI model ecosystem.
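As a rough illustration of the parent-child discovery described above, the sketch below walks a family tree breadth-first. It is not the code from this repo's scripts. `build_family_tree`, `get_children`, `max_depth`, and the model IDs in `TOY_HUB` are all hypothetical names chosen for the example. The lookup is passed in as a function so that a real Hub query could be swapped in, and the traversal uses a queue rather than literal recursion.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def build_family_tree(
    root: str,
    get_children: Callable[[str], Iterable[str]],
    max_depth: int = 5,
) -> Dict[str, List[str]]:
    """Map each discovered model ID to the IDs of its direct derivatives.

    `get_children` is a hypothetical lookup supplied by the caller; in a real
    run it would query the Hub for models that declare `model_id` as their base.
    """
    tree: Dict[str, List[str]] = {}
    queue = deque([(root, 0)])
    while queue:
        model_id, depth = queue.popleft()
        if model_id in tree:
            continue  # already expanded; guards against cycles and duplicates
        # Stop expanding once the depth limit is reached.
        children = list(get_children(model_id)) if depth < max_depth else []
        tree[model_id] = children
        for child in children:
            queue.append((child, depth + 1))
    return tree


# Toy in-memory data with made-up model IDs, standing in for Hub lookups.
TOY_HUB = {
    "org/base": ["alice/base-ft", "bob/base-ft"],
    "alice/base-ft": ["carol/base-ft-v2"],
}

tree = build_family_tree("org/base", lambda m: TOY_HUB.get(m, []))
for parent, children in tree.items():
    print(parent, "->", children)
```

Keeping a record of already-expanded models matters because the same derivative can be reachable through more than one path, and the depth limit bounds the number of lookups for very large families.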
Before setting up the environment, ensure you have:
- Python 3.8 or higher installed
- pip package manager (usually included with Python)

Install the required dependencies:
pip install -r requirements.txt
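If it is unclear whether the active interpreter meets the Python 3.8 prerequisite, a quick check along these lines can help. This is a minimal sketch and not part of the repo's scripts. `MIN_PYTHON` and `python_is_supported` are hypothetical names introduced for illustration.

```python
import sys

# Minimum version taken from the prerequisites list (Python 3.8 or higher).
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version OK:", sys.version.split()[0])
else:
    print("Python 3.8 or higher is required; found", sys.version.split()[0])
```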
If you're adding this project to an existing environment or need to update the requirements:
- First activate your virtual environment:
# For venv:
source venv/bin/activate  # Linux/Mac
.\venv\Scripts\activate  # Windows
# For Conda:
conda activate your_env_name
- Then generate an updated requirements file:
pip freeze > requirements.txt