Description
The umap package seems to reimplement the same distance metrics as the pynndescent package. However, the umap implementation is buggy in at least one case:
- `sparse.arr_union` can return one of its input arrays
- the return value of `sparse.arr_union` is used as a writable buffer by `sparse.sparse_sum`
- `sparse.sparse_diff` is a thin wrapper around `sparse.sparse_sum`
- `sparse.sparse_diff` is used in `sparse_euclidean`
This leads to two issues:
- The input array may be modified, which is not expected by the user
- If `sparse_euclidean`, `sparse_diff`, or `sparse_sum` is called by a custom distance metric that receives its sparse array via keyword arguments, the keyword arguments are stored in a closure. Numba then treats them as read-only, which leads to a type inference failure (see "Cannot call `neighbors`: Failed in nopython mode pipeline (step: nopython frontend)", scverse/muon#173).
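To make the aliasing pattern concrete, here is a minimal pure-NumPy sketch. `union_sketch` and `sum_sketch` are hypothetical stand-ins written for this illustration, not umap's actual code, and the real functions are numba-compiled and may differ in detail. The read-only NumPy array only approximates how numba types closure-captured arrays.

```python
import numpy as np


def union_sketch(a, b):
    # Hypothetical stand-in: hands back an input array itself when the
    # other one is empty, instead of returning a fresh copy.
    if len(a) == 0:
        return b
    if len(b) == 0:
        return a
    return np.unique(np.concatenate((a, b)))


def sum_sketch(ind1, data1, ind2, data2):
    # Hypothetical sparse sum that reuses the union result as its
    # output buffer, so it may write into ind1 or ind2.
    result_ind = union_sketch(ind1, ind2)
    result_data = np.zeros(result_ind.shape[0], dtype=np.float32)
    i1 = i2 = nnz = 0
    while i1 < len(ind1) or i2 < len(ind2):
        # Decide which side(s) contribute the next index.
        take1 = i2 >= len(ind2) or (i1 < len(ind1) and ind1[i1] <= ind2[i2])
        take2 = i1 >= len(ind1) or (i2 < len(ind2) and ind2[i2] <= ind1[i1])
        idx = ind1[i1] if take1 else ind2[i2]
        val = 0.0
        if take1:
            val += data1[i1]
            i1 += 1
        if take2:
            val += data2[i2]
            i2 += 1
        if val != 0:  # explicit zeros are dropped, which shifts entries
            result_ind[nnz] = idx
            result_data[nnz] = val
            nnz += 1
    return result_ind[:nnz], result_data[:nnz]


empty_ind = np.array([], dtype=np.int32)
empty_data = np.array([], dtype=np.float32)
ind1 = np.array([1, 4, 7], dtype=np.int32)
data1 = np.array([0.0, 2.0, 3.0], dtype=np.float32)

# Issue 1: the caller's index array is silently overwritten.
out_ind, out_data = sum_sketch(ind1, data1, empty_ind, empty_data)
# ind1 is now [4, 7, 7] instead of [1, 4, 7]

# Issue 2 (analogy): a read-only input makes the write fail outright.
frozen = np.array([1, 4, 7], dtype=np.int32)
frozen.flags.writeable = False
try:
    sum_sketch(frozen, data1, empty_ind, empty_data)
    readonly_failed = False
except ValueError:
    readonly_failed = True
```

In this sketch, returning `a.copy()` or `b.copy()` from the union helper would avoid both problems, at the cost of one extra allocation.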
The pynndescent implementation of `sparse_euclidean` works without any issues. Given that you maintain both packages, that umap depends on and uses pynndescent, and that the supported distance functions appear to be identical, I think it would make sense for umap to simply re-export the distance functions from pynndescent.