ENH: Add nsmallest/nlargest method support for extension array #42737

Closed
@mocquin

With a regular Series, one can do:

import numpy as np
import pandas as pd
s = pd.Series(np.arange(10))
s.nsmallest(1)  # returns a Series containing 0, as expected

When using an extension array (user-defined in my case), calling the .nsmallest method raises a TypeError (full message below):

import numpy as np
import physipandas
from physipy import m
import pandas as pd
sq = pd.Series(np.arange(10)*m, dtype='physipy[m]')
sq.nsmallest(1)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/5k/bf4syt7x1zjbhc6b28srzzym0000gn/T/ipykernel_72417/2048250381.py in <module>
      4 import pandas as pd
      5 sq = pd.Series(np.arange(10)*m, dtype='physipy[m]')
----> 6 sq.nsmallest(1)

/opt/anaconda3/lib/python3.8/site-packages/pandas/core/series.py in nsmallest(self, n, keep)
   3861         dtype: int64
   3862         """
-> 3863         return algorithms.SelectNSeries(self, n=n, keep=keep).nsmallest()
   3864 
   3865     @doc(

/opt/anaconda3/lib/python3.8/site-packages/pandas/core/algorithms.py in nsmallest(self)
   1220 
   1221     def nsmallest(self):
-> 1222         return self.compute("nsmallest")
   1223 
   1224     @staticmethod

/opt/anaconda3/lib/python3.8/site-packages/pandas/core/algorithms.py in compute(self, method)
   1253         dtype = self.obj.dtype
   1254         if not self.is_valid_dtype_n_method(dtype):
-> 1255             raise TypeError(f"Cannot use method '{method}' with dtype {dtype}")
   1256 
   1257         if n <= 0:

TypeError: Cannot use method 'nsmallest' with dtype physipy[m]

This seems to happen because the extension dtype is not accepted by is_valid_dtype_n_method, so compute raises the TypeError shown above.
Would it be feasible to support nsmallest/nlargest for extension arrays?
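In the meantime, a possible workaround is to fall back to sorting. This is only a sketch: the helper name nsmallest_fallback is hypothetical, it only mirrors keep="first", and it assumes the extension array supports sorting through Series.sort_values. An ordered Categorical stands in for the user-defined dtype so the example runs without physipandas.

```python
import pandas as pd


def nsmallest_fallback(s: pd.Series, n: int) -> pd.Series:
    """Return the n smallest values of s (hypothetical helper, keep='first' only)."""
    try:
        # Fast path for dtypes that nsmallest accepts.
        return s.nsmallest(n)
    except TypeError:
        # Fallback: drop missing values, then sort with a stable algorithm
        # so that ties keep their original order, and take the first n rows.
        return s.dropna().sort_values(kind="mergesort").head(n)


# Regular integer Series: the fast path is used.
ints = pd.Series(range(10))
print(nsmallest_fallback(ints, 1))

# Ordered categorical Series, used here as a stand-in for a custom extension dtype.
cats = pd.Series(
    pd.Categorical(["b", "a", "c"], categories=["a", "b", "c"], ordered=True)
)
print(nsmallest_fallback(cats, 1))
```

The fallback sorts the whole Series, so it is slower than nsmallest when n is small relative to the length of the Series.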

pandas version: 1.3.0


Labels: Enhancement; ExtensionArray (Extending pandas with custom dtypes or arrays); Needs Discussion (Requires discussion from core team before further action)
