
Conversation

jeejeelee
Collaborator

@jeejeelee commented Sep 22, 2025

Purpose

Gemini's review below describes the change in detail; in short, this PR slightly reduces the loading time for LoRA weights.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Jee Jee Li <[email protected]>
Signed-off-by: Jee Jee Li <[email protected]>
gemini-code-assist
Contributor

@gemini-code-assist bot left a comment


Code Review

This pull request optimizes LoRA weight loading by changing the memory layout of LoRA tensors to avoid costly transpose operations. The changes are consistently applied across all relevant layers and utility functions. The new convention for lora_a is (rank, input_dim) and for lora_b is (output_dim, rank), which matches how they are often stored in checkpoints, thus removing the need for transposition during loading. The slicing and copying logic has been updated accordingly. The changes are correct and contribute to better performance. I have no high or critical severity comments.
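
As a rough illustration of the convention described above, here is a minimal, hypothetical PyTorch sketch (not code from this PR; shapes and variable names are made up) showing why matching the checkpoint layout removes the transpose during loading:

```python
import torch

# Illustrative shapes only.
rank, input_dim, output_dim = 16, 4096, 11008

# PEFT-style checkpoints commonly store the adapters as:
#   lora_A.weight -> (rank, input_dim)
#   lora_B.weight -> (output_dim, rank)
ckpt_lora_a = torch.randn(rank, input_dim)
ckpt_lora_b = torch.randn(output_dim, rank)

# Previous convention (sketch): in-memory tensors laid out as
# (input_dim, rank) / (rank, output_dim), so every load pays for a
# transpose plus a contiguous copy.
old_lora_a = ckpt_lora_a.T.contiguous()   # (input_dim, rank)
old_lora_b = ckpt_lora_b.T.contiguous()   # (rank, output_dim)

# New convention (as described in the review): the in-memory layout
# matches the checkpoint, so the weights can be copied straight into
# the preallocated LoRA buffers without transposing.
new_lora_a = ckpt_lora_a                  # (rank, input_dim)
new_lora_b = ckpt_lora_b                  # (output_dim, rank)
```

Since an adapter can contain many such matrices across layers, skipping the transpose and extra copy on each one is where the small load-time saving comes from.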

Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee force-pushed the optimize-lora-loading branch from 0b49a76 to 4fc3209 on September 22, 2025 16:16
@jeejeelee added the ready label (ONLY add when PR is ready to merge / full CI is needed) Sep 22, 2025
Signed-off-by: Jee Jee Li <[email protected]>
@jeejeelee force-pushed the optimize-lora-loading branch from 82bc9e6 to 8f1b7b7 on September 23, 2025 02:14
@jeejeelee force-pushed the optimize-lora-loading branch from 096aaa9 to af92bb2 on September 23, 2025 06:41
Signed-off-by: Jee Jee Li <[email protected]>
Isotr0py
Member

@Isotr0py left a comment


LGTM!

@Isotr0py merged commit 273690a into vllm-project:main Sep 23, 2025
45 checks passed
@jeejeelee deleted the optimize-lora-loading branch September 23, 2025 12:37
namanlalitnyu pushed a commit to namanlalitnyu/vllm that referenced this pull request Sep 24, 2025