Welcome to the Vchitect homepage. Vchitect is developed mainly by Shanghai AI Laboratory. We work on video generation and open-source our models, benchmark suites, and efficient training tools.

🔥 Updates

Vchitect 2.0

  • [09/2024] We released Vchitect 2.0, including the model, the training system, and the evaluation tools.
    • Model:
      • Vchitect-2.0 is a high-quality video generative model with 2 billion parameters, supporting resolutions up to 720x480 and video durations of 10-20 seconds.
      • VEnhancer is a generative space-time enhancement framework. It integrates super-resolution, frame interpolation, and video refinement to elevate the video quality to 2K resolution at 24 FPS.
    • System:
      • LiteGen is a lightweight and highly efficient training framework for diffusion tasks. During training of the Vchitect-2.0 model, it supports sequence lengths of up to 1.63 million tokens on 8 NVIDIA A100 GPUs.
      • FasterCache is a training-free method for accelerating video sampling in diffusion transformers.
    • Evaluation:
      • VBench is a comprehensive benchmark suite for video generative models, covering 56 text-to-video generation models.
      • VBench++ further supports image-to-video evaluation, with an Image Suite of high-resolution images and adaptive aspect ratios. It also evaluates the trustworthiness of video generative models, covering fairness, bias, and safety.
      • Evaluation Agent is an efficient evaluation paradigm for visual generative models. It uses dynamic, multi-round evaluation, reducing evaluation time to 10% of that of traditional methods while supporting customization and explainability.

🎁 Model

  • 🎉 [new] Vchitect-2.0: A high-quality video generation model with resolutions up to 720x480 and video durations of 10-20 seconds.
  • 🎉 [new] VEnhancer: A generative space-time enhancement framework that can improve existing text-to-video (T2V) results.

🚀 System

  • 🎉 [new] FasterCache: A training-free method for accelerating video sampling in diffusion transformers.
  • 🎉 [new] LiteGen: A lightweight and highly efficient training framework for accelerating diffusion tasks.

🏔️ Evaluation

  • 🎉 [new] Evaluation Agent: An efficient and promptable evaluation paradigm for visual generative models.
  • 🎉 [new] VBench++: A comprehensive benchmark suite for video generative models with image-to-video and trustworthiness evaluation support.
  • 🎉 [new] VBench: A comprehensive benchmark suite for video generative models.

Latte

  • Latte: Latent Diffusion Transformer for Video Generation

Vchitect 1.0

  • LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
  • SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
  • VideoBooth: Diffusion-based Video Generation with Image Prompts
  • Vlogger: A generic AI system for generating a minute-level video blog (i.e., vlog) from user descriptions.
  • Optix: Memory Efficient Training Framework for Large Video Generation Model

Pinned

  1. LaVie

    [IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

    Python, 934 stars, 64 forks

  2. SEINE

    [ICLR 2024] SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

    Python, 941 stars, 65 forks

  3. Latte

    [TMLR 2025] Latte: Latent Diffusion Transformer for Video Generation

    Python, 1.8k stars, 188 forks

  4. Vchitect-2.0

    Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models

    Python, 916 stars, 22 forks

  5. VEnhancer

    Official code for VEnhancer: Generative Space-Time Enhancement for Video Generation

    Python, 546 stars, 29 forks

  6. VBench

    [CVPR2024 Highlight] VBench - We Evaluate Video Generation

    Python, 1.1k stars, 61 forks

Repositories

Showing 10 of 27 repositories
  • VBench

    [CVPR2024 Highlight] VBench - We Evaluate Video Generation

    Python, 1,081 stars, Apache-2.0, 61 forks, 33 issues, 2 pull requests, updated Jul 8, 2025

  • ShotBench

    ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

    Python, 36 stars, 1 fork, 4 issues, 0 pull requests, updated Jul 7, 2025

  • ShotBench-project

    HTML, 0 stars, 0 forks, 0 issues, 0 pull requests, updated Jun 30, 2025

  • Evaluation-Agent

    [ACL2025 Oral] Evaluate Image/Video Generation like Humans - Fast, Explainable, Flexible

    Python, 70 stars, 4 forks, 1 issue, 0 pull requests, updated Jun 27, 2025

  • RAPO

    [CVPR 2025] The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

    Python, 72 stars, 0 forks, 0 issues, 0 pull requests, updated Jun 23, 2025

  • TACA

    [ICCV25] TACA: Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

    Python, 27 stars, 3 forks, 2 issues, 0 pull requests, updated Jun 10, 2025

  • DCM

    DCM: Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

    Python, 175 stars, 10 forks, 13 issues, 0 pull requests, updated Jun 8, 2025

  • Vlogger (template)

    [CVPR2024] Make Your Dream A Vlog

    Python, 427 stars, Apache-2.0, 46 forks, 15 issues, 0 pull requests, updated May 19, 2025

  • VBench-project

    Project page of [CVPR2024 Highlight] VBench - We Evaluate Video Generation: https://vchitect.github.io/VBench-project/

    JavaScript, 0 stars, 0 forks, 0 issues, 0 pull requests, updated Apr 12, 2025

  • VBench-2.0-project

    JavaScript, 0 stars, 0 forks, 0 issues, 0 pull requests, updated Apr 8, 2025
