We propose FastVID, a novel training-free pruning framework that employs Dynamic Temporal Segmentation to partition videos into temporally ordered segments and Density Spatiotemporal Pruning to retain global segment information and key details. On LLaVA-OneVision-7B, FastVID effectively prunes 90.3% of video tokens, reduces FLOPs to 8.3%, and accelerates the prefilling stage by 7.1x, while maintaining 98.0% of the original accuracy.
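For intuition only, a rough sketch of what a similarity-based temporal segmentation could look like is given below. This is not the repository's actual implementation: the function name `segment_frames`, the use of pooled per-frame features, and the 0.9 similarity threshold are all illustrative assumptions.

```python
# Illustrative sketch (not the official FastVID code): start a new segment whenever the
# cosine similarity between consecutive per-frame features drops below a threshold.
import torch
import torch.nn.functional as F

def segment_frames(frame_feats: torch.Tensor, sim_threshold: float = 0.9) -> list[list[int]]:
    """frame_feats: [T, D] pooled per-frame features. Returns temporally ordered frame-index segments."""
    feats = F.normalize(frame_feats, dim=-1)        # unit-normalize for cosine similarity
    sims = (feats[1:] * feats[:-1]).sum(dim=-1)     # similarity of each frame to its predecessor
    segments, current = [], [0]
    for t in range(1, frame_feats.size(0)):
        if sims[t - 1] < sim_threshold:             # low similarity -> likely a scene change
            segments.append(current)
            current = []
        current.append(t)
    segments.append(current)
    return segments

# Example: 16 frames of 768-d features
# segs = segment_frames(torch.randn(16, 768))
```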
The current release implements a parallelized version of the density score computation, as described in the Efficiency Comparison section (page 8 of the main paper). We also plan to release a more readable reference implementation to make the code easier to understand and customize.
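As a hedged sketch of what a batched (parallel) density-score computation can look like, the example below follows a density-peaks-clustering-style kNN formulation. The function name `density_scores`, the kernel, and the tensor shapes are assumptions for illustration, not the paper's exact definition.

```python
# Illustrative sketch (not the paper's exact formula): compute a density score per token,
# batched over all segments at once so the work is parallelized on the GPU.
import torch

def density_scores(tokens: torch.Tensor, k: int = 5) -> torch.Tensor:
    """tokens: [B, N, D] token features per segment (batched). Returns [B, N] density scores."""
    dist = torch.cdist(tokens, tokens) ** 2               # pairwise squared distances, [B, N, N]
    knn_dist, _ = dist.topk(k + 1, dim=-1, largest=False) # k nearest neighbours plus self
    knn_dist = knn_dist[..., 1:]                           # drop the zero self-distance
    return torch.exp(-knn_dist.mean(dim=-1))               # closer neighbours -> higher density

# Example: 4 segments, 196 tokens each, 896-d features
# scores = density_scores(torch.randn(4, 196, 896))
```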
- FastVID on LLaVA-OneVision
- FastVID on LLaVA-Video
- FastVID on Qwen2-VL
To set up the environment:
```bash
cd scripts
bash create_env.sh
```
To evaluate FastVID on LLaVA-OneVision-7B:
```bash
cd scripts
bash eval.sh
```
This project builds upon the following open-source works: LLaVA-NeXT and lmms-eval.
```bibtex
@article{shen2025fastvid,
  title={FastVID: Dynamic Density Pruning for Fast Video Large Language Models},
  author={Shen, Leqi and Gong, Guoqiang and He, Tao and Zhang, Yifeng and Liu, Pengzhang and Zhao, Sicheng and Ding, Guiguang},
  journal={arXiv preprint arXiv:2503.11187},
  year={2025}
}
```