Add Infrastructure for SHGEMV #5485

Mousius · 2025-10-07T13:59:34Z

This adds all the relevant bits and pieces to add a `shgemv` path as well as a future `hgemm`/`hgemv` path in a similar model to `sb` and `b` interfaces. I've also fixed a few bits and pieces around `shgemm` which didn't build in a few situations.

martin-frbg · 2025-10-07T15:08:19Z

Thanks (pity about the duplicate work though)

Mousius · 2025-10-07T15:10:26Z

> Thanks (pity about the duplicate work though)

Yeah, sorry about that, I was thinking about something else and got a bit carried away seeing what was missing here 🙀

Mousius · 2025-10-07T18:30:53Z

@martin-frbg I will leave hgemm/hgemm_batch/hgemm_strided and hgemv for you, to make up for my over-eagerness? 🙏

ChipKerchner · 2025-10-07T18:59:29Z

#5480

ChipKerchner · 2025-10-07T19:07:20Z

Getting these compiler warnings....

/proj_sw/user_dev/ckerchner/OpenBLAS/interface/gemm.c:593:76: warning: macro expansion producing 'defined' has undefined behavior [-Wexpansion-to-defined]
  593 | #if defined(GEMM_GEMV_FORWARD) && !defined(GEMM3M) && !defined(COMPLEX) && HFLOAT16_GEMM_GEMV_FORWARD && BFLOAT16_GEMM_GEMV_FORWARD
      |                                                                            ^
/proj_sw/user_dev/ckerchner/OpenBLAS/interface/gemm.c:591:38: note: expanded from macro 'HFLOAT16_GEMM_GEMV_FORWARD'

ChipKerchner · 2025-10-07T19:17:58Z

Should probably be done like this: https://stackoverflow.com/questions/42074035/how-to-deal-with-clangs-3-9-wexpansion-to-defined-warning

#if (!defined(BFLOAT16) || (!defined(BGEMM) && defined(SBGEMM_GEMV_FORWARD)) || (defined(BGEMM) && defined(BGEMM_GEMV_FORWARD)))
#define BFLOAT16_GEMM_GEMV_FORWARD 1
#else
#define BFLOAT16_GEMM_GEMV_FORWARD 0
#endif
#if (!defined(HFLOAT16) || (!defined(HGEMM) && defined(SHGEMM_GEMV_FORWARD)) || (defined(HGEMM) && defined(HGEMM_GEMV_FORWARD)))
#define HFLOAT16_GEMM_GEMV_FORWARD 1
#else
#define HFLOAT16_GEMM_GEMV_FORWARD 0
#endif
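For readers unfamiliar with the warning, here is a minimal sketch of the pattern suggested above. The two feature switches at the top are defined only for illustration (the real build sets them elsewhere), and `forward_enabled` and `self_check` are hypothetical helper names. The point is that `defined` is evaluated directly in the `#if` line and the result is stored as a plain 0/1 macro, so no later expansion produces the `defined` operator.

```c
#include <assert.h>

/* Illustration only: pretend the build enabled half-precision support
   and the SHGEMM forwarding switch. */
#define HFLOAT16
#define SHGEMM_GEMV_FORWARD

/* Problematic form (kept as a comment on purpose):
 *   #define FORWARD_OK (!defined(HFLOAT16) || defined(SHGEMM_GEMV_FORWARD))
 *   #if FORWARD_OK
 * Here `defined` only appears after macro expansion, which the C standard
 * leaves undefined and clang/gcc report as -Wexpansion-to-defined. */

/* Portable form: test `defined` in the #if itself, record a 0/1 value. */
#if !defined(HFLOAT16) || \
    (!defined(HGEMM) && defined(SHGEMM_GEMV_FORWARD)) || \
    (defined(HGEMM) && defined(HGEMM_GEMV_FORWARD))
#define HFLOAT16_GEMM_GEMV_FORWARD 1
#else
#define HFLOAT16_GEMM_GEMV_FORWARD 0
#endif

/* The 0/1 macro can now be used safely in later #if expressions. */
#if HFLOAT16_GEMM_GEMV_FORWARD
int forward_enabled(void) { return 1; }
#else
int forward_enabled(void) { return 0; }
#endif

/* With the switches above, the second clause of the condition holds. */
void self_check(void) { assert(forward_enabled() == 1); }
```

Calling `self_check()` from any `main` exercises the guard; removing the `SHGEMM_GEMV_FORWARD` definition flips the result to 0.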

ChipKerchner · 2025-10-09T12:53:18Z

I see a couple of places in test/Makefile where BUILD_BFLOAT16 has been added but I don't see the same for BUILD_HFLOAT16.

It looks like we have support for SBGEMM but not SHGEMM?

ChipKerchner · 2025-10-09T13:33:24Z

#5497

ChipKerchner · 2025-11-05T13:23:06Z

Your conversion of the outputs for SBGEMM/V seems wrong, since you are casting from BF16 to F32 with TO_F32

Mousius · 2025-11-06T10:01:45Z

> Your conversion of the outputs for SBGEMM/V seems wrong, since you are casting from BF16 to F32 with TO_F32

Can you point me to the line, @ChipKerchner? The block is:

#if defined(BFLOAT16) && defined(BFLOAT16CONVERSION)

#ifdef BGEMM
#define TO_F32(x) (bfloat16tof32(x))
#else
#define TO_F32(x) (bfloat16tof32(x))
#endif

#elif defined(HFLOAT16)

#ifdef HGEMM
#define TO_F32(x) ((float)(x))
#else
#define TO_F32(x) ((float)(x))
#endif

#else

#define TO_F32(x) x

#endif

martin-frbg · 2025-11-06T10:28:35Z

He added review notes to the code changes, but weirdly I can only see them if I click on the corresponding notification in the gh web interface. As far as I can tell, these are indeed cases where the macro does not perform any actual conversion.

ChipKerchner · 2025-11-06T13:21:56Z

I see the TO_F32 conversions here:

#ifdef BGEMM
#define ALPHA bfloat16tof32(alpha)
#define BETA bfloat16tof32(beta)
#define TO_F32(x) (bfloat16tof32(x))
#define TO_OUTPUT(x) (f32tobfloat16(x))
#else
#define ALPHA alpha
#define BETA beta
#define TO_F32(x) (bfloat16tof32(x))     /* <- Convert */
#define TO_OUTPUT(x) x
#endif

Bad SBGEMV conversion here (a 2nd one in gemv_t):

            y[iy] = TO_OUTPUT(ALPHA * temp + BETA * TO_F32(y[iy]));  /* <- Converts from BF16 -> F32? */

Bad SBGEMM conversion here:

             C0[0] = TO_OUTPUT(TO_F32(C0[0])+res0);  /* <- Converts from BF16 -> F32? */
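To make the concern concrete, here is a small sketch of what can happen when a BF16-to-F32 widening helper is applied to a value that is already a float. It assumes BF16 is stored in a 16-bit unsigned integer; `bf16_bits`, `bf16_bits_to_f32` and the other names are hypothetical stand-ins, not the actual OpenBLAS definitions, which may differ by target and compiler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: bf16 values are stored as raw 16-bit patterns. */
typedef uint16_t bf16_bits;

/* Widen: bf16 is the upper 16 bits of an IEEE-754 binary32 value. */
static float bf16_bits_to_f32(bf16_bits v) {
    uint32_t w = (uint32_t)v << 16;
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}

/* Narrow by truncation (no rounding); enough for this demonstration. */
static bf16_bits f32_to_bf16_bits(float f) {
    uint32_t w;
    memcpy(&w, &f, sizeof w);
    return (bf16_bits)(w >> 16);
}

/* Intended use: the argument really holds bf16 bits. */
float widen_bf16_input(float value) {
    return bf16_bits_to_f32(f32_to_bf16_bits(value));
}

/* Misuse on a float output: the float is first value-converted to the
   16-bit integer type (2.5f becomes 2), and that integer is then
   reinterpreted as a bf16 bit pattern, giving a tiny subnormal. */
float widen_float_output_by_mistake(float y) {
    return bf16_bits_to_f32((bf16_bits)y);
}

void self_check(void) {
    assert(widen_bf16_input(2.5f) == 2.5f);               /* exact in bf16 */
    assert(widen_float_output_by_mistake(2.5f) < 1e-30f); /* not 2.5 */
}
```

Under these assumptions the existing output value is effectively replaced by a number close to zero before BETA is applied, so the effect would only show up when BETA is nonzero and the output already holds data.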

Mousius · 2025-11-06T13:30:35Z

Argh, guessing we need to add a FROM_OUTPUT here 🤔
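A minimal sketch of what such a `FROM_OUTPUT` macro could look like for the SBGEMV-style case (BF16 inputs, F32 output). This is an illustration under stated assumptions, not the change that was eventually made: `bf16_bits`, `bf16_bits_to_f32` and `gemv_row_update` are hypothetical names, and BF16 is assumed to be stored as a raw 16-bit pattern.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint16_t bf16_bits;   /* assumption: raw 16-bit bf16 storage */

static float bf16_bits_to_f32(bf16_bits v) {
    uint32_t w = (uint32_t)v << 16;   /* bf16 = upper half of binary32 */
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}

/* Hypothetical macro layout for bf16 inputs with a float output.
   For a bf16-output variant, FROM_OUTPUT would widen like TO_F32 and
   TO_OUTPUT would narrow back to bf16. */
#define TO_F32(x)      (bf16_bits_to_f32(x))  /* inputs: bf16 -> float  */
#define FROM_OUTPUT(x) (x)                    /* output already float   */
#define TO_OUTPUT(x)   (x)                    /* result stays float     */

/* One output element of y = alpha * A * x + beta * y. */
float gemv_row_update(const bf16_bits *a, const bf16_bits *x, int n,
                      float alpha, float beta, float y_old) {
    float temp = 0.0f;
    for (int i = 0; i < n; i++)
        temp += TO_F32(a[i]) * TO_F32(x[i]);
    /* Read the old output through FROM_OUTPUT, not TO_F32. */
    return TO_OUTPUT(alpha * temp + beta * FROM_OUTPUT(y_old));
}

void self_check(void) {
    const bf16_bits a[2] = {0x3F80, 0x4000};  /* 1.0, 2.0 */
    const bf16_bits x[2] = {0x4000, 0x3F80};  /* 2.0, 1.0 */
    /* 1*2 + 2*1 = 4; 1.0 * 4 + 0.5 * 2.0 = 5 */
    assert(gemv_row_update(a, x, 2, 1.0f, 0.5f, 2.0f) == 5.0f);
}
```

Keeping the input conversion and the output read-back as separate macros lets each precision combination define them independently, which is the separation the comment above is hinting at.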

martin-frbg · 2025-11-06T13:35:21Z

Interestingly, this does not lead to failures in our tests - at least not on (emulated) RISCV, but AFAICT the code changes in question originate from your earlier "fix bfloat conversion for Neoverse" PR

Mousius · 2025-11-07T08:35:26Z

> Interestingly, this does not lead to failures in our tests - at least not on (emulated) RISCV, but AFAICT the code changes in question originate from your earlier "fix bfloat conversion for Neoverse" PR

I don't think so, that didn't touch the generic kernels: https://github.com/OpenMathLib/OpenBLAS/pull/5483/files

I'd imagine only the RISC-V CI is running the generic kernels? As the targets I was working on both have their own variants.

Mousius force-pushed the shgemv-infra branch from 000022d to 37aefaf Compare October 7, 2025 14:28

Add Infrastructure for SHGEMV

37fc3bb

This adds all the relevant bits and pieces to add a `shgemv` path as well as a future `hgemm`/`hgemv` path in a similar model to `sb` and `b` interfaces. I've also fixed a few bits and pieces around `shgemm` which didn't build in a few situations.

Mousius force-pushed the shgemv-infra branch from 37aefaf to 37fc3bb Compare October 7, 2025 15:04

martin-frbg added this to the 0.3.31 milestone Oct 7, 2025

martin-frbg merged commit de43ccc into OpenMathLib:develop Oct 7, 2025
83 of 88 checks passed

Mousius deleted the shgemv-infra branch October 7, 2025 18:49

martin-frbg mentioned this pull request Oct 8, 2025

Rework definitions of ?FLOAT16_GEMM_GEMV_FORWARD to avoid undefined behavior #5491

Merged

ChipKerchner mentioned this pull request Oct 8, 2025

No support for SHGEMV (FP16) #5480

Closed

Mousius mentioned this pull request Oct 10, 2025

Add test for SHGEMM #5499

Merged
