forked from NVIDIA/cutlass
Open
Labels
bug: Something isn't working
Description
Describe the bug
FP8 scaled matrix multiplication (MM) is slow.
Steps/Code to reproduce bug
The code example is at https://github.com/codeplaysoftware/cutlass-sycl/blob/sycl-develop/examples/sycl/08_bmg_gemm_f8/08_bmg_gemm_f8_scaling.cpp
Expected behavior
The FP8 scaled MM example should exhibit better performance.
Environment details (please complete the following information):
Linux (BMG/PVC)
cc @aacostadiaz