@tannergooding
Member

@tannergooding tannergooding commented Jun 24, 2025

This updates the xarch intrinsic logic to always import nodes as TYP_MASK where supported and to lower them back to the non-mask variants if no other optimizations were able to kick in. This allows better overall use of the hardware for existing intrinsic code paths.
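
As a rough illustration of why a mask op can be rewritten to its non-mask form, here is a minimal conceptual sketch in Python. It is not the JIT's code: lanes are modeled as plain integers, and every name in it (`compare_equal_mask`, `compare_equal_vector`, `mask_to_vector`, `vector_to_mask`, `blend_with_mask`, `blend_with_vector`) is hypothetical. It only shows that a one-bit-per-lane mask and an all-ones-or-zero vector carry the same information, so a consumer can be fed by either form.

```python
# Conceptual model only: lanes are Python ints, not real SIMD registers.
# All function names here are hypothetical and do not come from the JIT.
LANE_BITS = 32
ALL_ONES = (1 << LANE_BITS) - 1


def compare_equal_mask(a, b):
    """Mask form: one result bit per lane, as a k-mask compare would produce."""
    mask = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            mask |= 1 << i
    return mask


def compare_equal_vector(a, b):
    """Non-mask form: each result lane is all-ones (true) or zero (false)."""
    return [ALL_ONES if x == y else 0 for x, y in zip(a, b)]


def mask_to_vector(mask, lane_count):
    """Expand each mask bit into a full all-ones or zero lane."""
    return [ALL_ONES if (mask >> i) & 1 else 0 for i in range(lane_count)]


def vector_to_mask(vec):
    """Collapse each lane to one bit by taking its most significant bit."""
    mask = 0
    for i, lane in enumerate(vec):
        if lane >> (LANE_BITS - 1):
            mask |= 1 << i
    return mask


def blend_with_mask(mask, if_false, if_true):
    """Select per lane using a bit mask."""
    return [t if (mask >> i) & 1 else f
            for i, (f, t) in enumerate(zip(if_false, if_true))]


def blend_with_vector(vec, if_false, if_true):
    """Select per lane using bitwise and/andnot/or on a vector condition."""
    return [(t & m) | (f & ~m & ALL_ONES)
            for m, f, t in zip(vec, if_false, if_true)]
```

In this model, blending through the mask form and blending through the vector form give the same lanes, which is the property that makes falling back to the non-mask variant safe when no mask-specific optimization applies.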

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jun 24, 2025
@dotnet-policy-service
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@tannergooding tannergooding added the NO-REVIEW Experimental/testing PR, do NOT review it label Jun 24, 2025
@tannergooding tannergooding removed the NO-REVIEW Experimental/testing PR, do NOT review it label Jul 1, 2025
@tannergooding tannergooding marked this pull request as ready for review July 1, 2025 19:47
@tannergooding
Member Author

CC. @dotnet/jit-contrib. This should be ready for review. I could split this up into 2 PRs ("Allow rewriting of hwintrinsic mask ops to their non-mask forms" and "Default to using mask ops for V128/V256 on supporting hardware"), but I think it's worth doing these two together.

This is one of the last major milestones for the embedded masking support and helps ensure that all vector sizes are getting the expected implicit lightup.

@tannergooding tannergooding requested a review from EgorBo July 1, 2025 19:50
@tannergooding
Member Author

The size regressions that are showing are primarily from cases where we decide to fall back to the non-kmask variant and it is comparing against zero. Such cases now have to emit an extra vxorps since it isn't CSE'd.

In other words, the few regressions are mainly due to #70182. We might be able to mitigate some of that by finding an existing CSE with a good VN in range or doing some other backtrack searching tricks, like we've done as workarounds for the issue, but I don't think we should block this PR on that, particularly since many important cases are improved and the vxorps are elided by the register renamer since it's just producing 0.

@tannergooding
Member Author

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr jitstressregs

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@tannergooding
Member Author

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr jitstressregs

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@tannergooding
Member Author

/azp run runtime-coreclr jitstress-isas-x86, Fuzzlyn, Antigen, runtime-coreclr jitstress, runtime-coreclr jitstressregs

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

@tannergooding tannergooding merged commit 96977c9 into dotnet:main Jul 8, 2025
156 of 172 checks passed
@tannergooding tannergooding deleted the small-evex-mask branch July 8, 2025 16:51
@github-actions github-actions bot locked and limited conversation to collaborators Aug 8, 2025
Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

3 participants