BUG: DataFrame.loc silently drops non-existent elements when using MultiIndex #10549

Closed
@tgarc

So here's my setup (using pandas 0.16.2):

>>> import numpy as np
>>> import pandas as pd
>>> midx = pd.MultiIndex.from_product([['bar', 'baz', 'foo', 'qux'], ['one', 'two']], names=['first', 'second'])
>>> df = pd.DataFrame(np.random.randint(10, size=(8, 8)), index=midx)

>>> df 
              0  1  2  3  4  5  6  7
first second                        
bar   one     0  5  5  5  6  2  6  8
      two     2  6  9  0  3  6  7  9
baz   one     9  0  9  9  2  5  7  4
      two     4  8  1  2  9  2  8  1
foo   one     2  7  3  6  5  5  5  2
      two     3  4  6  2  7  7  1  2
qux   one     0  8  5  9  5  5  7  3
      two     7  4  0  7  3  6  8  6

I recently found that I can select multiple labels from a level by indexing with a tuple of tuples:

>>> df.loc[( ('bar','baz'),  ), :]
              0  1  2  3  4  5  6  7
first second                        
bar   one     0  5  5  5  6  2  6  8
      two     2  6  9  0  3  6  7  9
baz   one     9  0  9  9  2  5  7  4
      two     4  8  1  2  9  2  8  1

Or even select labels at more than one level at once:

>>> df.loc[( ('bar','baz'), ('one',) ), :]
              0  1  2  3  4  5  6  7
first second                        
bar   one     0  5  5  5  6  2  6  8
baz   one     9  0  9  9  2  5  7  4
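
For comparison, here is a minimal sketch of the same per-level selection written with lists, which is the form the pandas MultiIndex slicer documentation uses. It assumes a recent pandas rather than 0.16.2, and the variable names `by_lists`, `by_slice` and `all_ones` are only illustrative.

```python
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.random.randint(10, size=(8, 8)), index=midx)

# One list of labels per index level, wrapped in a tuple.
by_lists = df.loc[(['bar', 'baz'], ['one']), :]

# pd.IndexSlice builds the same tuple and also allows slices per level.
idx = pd.IndexSlice
by_slice = df.loc[idx[['bar', 'baz'], ['one']], :]

# Select 'one' in the second level for every label of the first level.
all_ones = df.loc[idx[:, ['one']], :]
```

Both `by_lists` and `by_slice` should contain the two rows `('bar', 'one')` and `('baz', 'one')`.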

The issue is this: if the index tuple contains any labels that don't exist in the dataframe, pandas silently drops them:

>>> df.loc[( ('bar','baz','xyz'), ('one',) ), :]
              0  1  2  3  4  5  6  7
first second                        
bar   one     0  5  5  5  6  2  6  8
baz   one     9  0  9  9  2  5  7  4

It seems to me that this should raise an exception, since:

  1. The shape of the returned dataframe is not what you'd expect from the indexer (three labels were requested at the first level, but only two appear in the result).
  2. There's no way to unambiguously fill the returned dataframe with NaNs where a label didn't exist (as is done when the index has only a single level).
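
Until the behavior is settled, a possible workaround is to validate the requested labels against each index level before calling `.loc`. The sketch below is only an illustration of that idea, not pandas API: `missing_labels` and `strict_loc` are hypothetical helpers, and the level names `'first'` and `'second'` come from the setup above.

```python
import numpy as np
import pandas as pd

midx = pd.MultiIndex.from_product(
    [['bar', 'baz', 'foo', 'qux'], ['one', 'two']],
    names=['first', 'second'],
)
df = pd.DataFrame(np.random.randint(10, size=(8, 8)), index=midx)


def missing_labels(index, level, labels):
    """Return the requested labels that do not occur in the given level.

    Hypothetical helper, not part of pandas.
    """
    present = index.get_level_values(level).unique()
    return [label for label in labels if label not in present]


def strict_loc(frame, first, second):
    """Select rows by per-level labels, raising if any label is absent.

    Hypothetical helper; assumes the index levels are named
    'first' and 'second' as in the example frame.
    """
    for level, labels in (('first', first), ('second', second)):
        missing = missing_labels(frame.index, level, labels)
        if missing:
            raise KeyError(f"labels {missing} not found in level {level!r}")
    return frame.loc[(list(first), list(second)), :]


# Every label exists, so this selects ('bar', 'one') and ('baz', 'one').
subset = strict_loc(df, ['bar', 'baz'], ['one'])

# 'xyz' is not in the first level, so strict_loc raises instead of dropping it.
try:
    strict_loc(df, ['bar', 'baz', 'xyz'], ['one'])
except KeyError as exc:
    print("rejected:", exc)
```

Checking up front keeps the decision explicit: the caller either gets exactly the requested labels or an error naming the ones that were not found.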

Labels: Indexing (related to indexing on series/frames, not to indexes themselves), MultiIndex, Needs Discussion (requires discussion from core team before further action)
