Add support for bmm for fbgemm config #2337

jerryzh168 · 2025-06-08T01:02:41Z

Summary:
As titled: this PR adds support for running quantized bmm. The quantized bmm kernels for int4 and fp8 (with dynamic activation quantization) require the weights to be transposed before they can run, so this PR adds a transpose_input argument to the convert function to transpose the weights first.
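To illustrate why a weight transpose can be needed, here is a minimal sketch in plain PyTorch. It does not call the fbgemm kernels; the shapes, the `(B, N, K)` kernel-side layout, and the `einsum` stand-in for the kernel are assumptions made for illustration only.

```python
import torch

# Hypothetical sizes: B batches, M rows, K inner dim, N output columns.
B, M, K, N = 2, 3, 4, 5
x = torch.randn(B, M, K)
w = torch.randn(B, K, N)  # layout torch.bmm expects for its second operand

# Assumed kernel-side layout: last two dims swapped, stored contiguously.
w_t = w.transpose(-1, -2).contiguous()  # shape (B, N, K)

# Reference result with the original layout.
ref = torch.bmm(x, w)

# Stand-in for a kernel that consumes (B, N, K) weights: computes x @ w_t^T.
out = torch.einsum("bmk,bnk->bmn", x, w_t)
```

Both paths produce the same result, so transposing once at conversion time moves the layout change out of the forward pass.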

Test Plan:
python test/dtypes/test_fbgemm_fp8.py -k test_bmm
python test/dtypes/test_fbgemm_int4.py -k test_bmm

Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2025-06-08T01:02:44Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2337

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit bf1b017 with merge base 4235837:

NEW FAILURE - The following job has failed:

Run Regression Tests / test-nightly (CUDA Nightly, linux.g5.12xlarge.nvidia.gpu, --pre torch --index-url https://downloa... / linux-job (gh)
test/integration/test_integration.py::TestSubclass::test_int4_weight_only_quant_subclass_grouped_5_cuda

This comment was automatically generated by Dr. CI and updates every 15 minutes.

drisspg · 2025-06-08T03:25:47Z

torchao/dtypes/fbgemm_fp8_tensor.py

@@ -88,6 +89,12 @@ def from_float(
            dtype=torch.float,
            device=w.device,
        )
+        if transpose_input:
+            if w.ndim == 3:


nit: maybe just w.transpose(-1, -2)
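A quick sketch of the suggestion: `transpose(-1, -2)` swaps the last two dimensions regardless of rank, so a single call can cover both 2D and 3D weights without branching on `w.ndim`. The tensor names and shapes are illustrative only.

```python
import torch

w2d = torch.randn(4, 8)     # e.g. a linear weight
w3d = torch.randn(2, 4, 8)  # e.g. a batched (bmm) weight

# Negative indices address the last two dims for any rank >= 2.
t2d = w2d.transpose(-1, -2)  # shape (8, 4)
t3d = w3d.transpose(-1, -2)  # shape (2, 8, 4)
```

Note that `transpose` returns a non-contiguous view; if the downstream kernel needs contiguous memory, a `.contiguous()` call is still required.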

drisspg · 2025-06-08T03:27:15Z

torchao/dtypes/fbgemm_fp8_tensor.py

+
+    # not used
+    num_tokens = torch.empty([input_tensor.size(0)], device=input_tensor.device)
+    xq, x_scale = torch.ops.fbgemm.quantize_fp8_per_row(


This unused num_tokens feels weird. Maybe open an issue on fbgemm, or update the op so it isn't needed?

drisspg · 2025-06-08T03:27:40Z

torchao/dtypes/fbgemm_int4_tensor.py

    ):
        assert len(block_size) == w.ndim, (
            f"Expecting the length of block_size to be equal to the dimension of the weight, got {block_size=} and {w.ndim=}"
        )
        if int4_row_quantize_zp is None:
            raise ImportError("Requires fbgemm-gpu-genai >= 1.2.0")

+        if transpose_input:


drisspg · 2025-06-08T03:28:38Z

torchao/dtypes/fbgemm_int4_tensor.py

+        args[0],
+        args[1],
+    )
+    if not input_tensor.is_floating_point():


nit: Is this guard needed? Is this a common situation to run into?
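For context on what the guard checks, here is a small sketch of how `Tensor.is_floating_point()` behaves across dtypes. What the guard does when it triggers is not visible in this hunk, so the example only shows the predicate itself.

```python
import torch

# is_floating_point() is True for float dtypes and False for integer dtypes.
checks = {
    "float32": torch.zeros(2, dtype=torch.float32).is_floating_point(),
    "bfloat16": torch.zeros(2, dtype=torch.bfloat16).is_floating_point(),
    "int8": torch.zeros(2, dtype=torch.int8).is_floating_point(),
}
```

The guard would therefore only fire for inputs that are already integer typed, for example a tensor that was quantized earlier.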

facebook-github-bot added the CLA Signed label Jun 8, 2025

jerryzh168 requested a review from drisspg June 8, 2025 01:02

jerryzh168 added the topic: improvement label Jun 8, 2025

drisspg reviewed Jun 8, 2025
