Commit 02416c2
Enable TensorPrimitives to perform in-place operations (dotnet#92820)
Some operations would produce incorrect results if the same span was passed as both an input and an output. When vectorization was employed but the span's length wasn't a perfect multiple of a vector, we'd do the standard trick of performing one last operation on the last vector's worth of data; however, that relies on the operation being idempotent, and if a previous operation has overwritten input with a new value due to the same memory being used for input and output, some operations won't be idempotent.
This fixes that by masking off the already processed elements. It adds tests to validate in-place use works, and it updates the docs to carve out this valid overlapping.
1 parent 4088f05 commit 02416c2
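The failure mode described in the commit message can be reproduced with a small scalar model. The sketch below is an illustration only, not the TensorPrimitives implementation: `WIDTH`, `op`, `apply_naive`, and `apply_masked` are hypothetical names, plain Python lists stand in for spans, and a list slice stands in for a hardware vector. It assumes the input holds at least `WIDTH` elements.

```python
WIDTH = 4  # hypothetical vector width (elements per "vector")


def op(x):
    # Doubling: applying it twice to the same element gives a different result.
    return x * 2


def apply_naive(src, dst):
    """Process full vectors, then redo the last WIDTH elements as one vector."""
    n = len(src)
    assert n >= WIDTH
    i = 0
    while i + WIDTH <= n:
        dst[i:i + WIDTH] = [op(v) for v in src[i:i + WIDTH]]
        i += WIDTH
    if i < n:
        # The final vector overlaps elements the loop already handled.
        start = n - WIDTH
        dst[start:n] = [op(v) for v in src[start:n]]


def apply_masked(src, dst):
    """Same traversal, but the final vector only writes unprocessed lanes."""
    n = len(src)
    assert n >= WIDTH
    i = 0
    while i + WIDTH <= n:
        dst[i:i + WIDTH] = [op(v) for v in src[i:i + WIDTH]]
        i += WIDTH
    if i < n:
        start = n - WIDTH
        lanes = [op(v) for v in src[start:n]]
        for lane, value in enumerate(lanes):
            # Mask: skip lanes whose index the main loop already covered.
            if start + lane >= i:
                dst[start + lane] = value


separate = [0] * 6
apply_naive([1, 2, 3, 4, 5, 6], separate)
print(separate)   # [2, 4, 6, 8, 10, 12]: fine when input and output differ

in_place = [1, 2, 3, 4, 5, 6]
apply_naive(in_place, in_place)
print(in_place)   # [2, 4, 12, 16, 10, 12]: indices 2 and 3 doubled twice

fixed = [1, 2, 3, 4, 5, 6]
apply_masked(fixed, fixed)
print(fixed)      # [2, 4, 6, 8, 10, 12]
```

With six elements and a width of four, the final vector covers indices 2 through 5, so indices 2 and 3 are read after they have already been overwritten. Masking those two lanes leaves only indices 4 and 5 to be written, which is why the in-place result matches the out-of-place one.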
File tree
4 files changed: +740 -96 lines changed
- src/libraries/System.Numerics.Tensors
- src/System/Numerics/Tensors
- tests