Closed
Description
So here's my setup (using pandas 0.16.2):
>>> midx = pd.MultiIndex.from_product([['bar', 'baz', 'foo', 'qux'], ['one', 'two']],names=['first','second'])
>>> df = pd.DataFrame(np.random.randint(10,size=(8,8)),index=midx)
>>> df
0 1 2 3 4 5 6 7
first second
bar one 0 5 5 5 6 2 6 8
two 2 6 9 0 3 6 7 9
baz one 9 0 9 9 2 5 7 4
two 4 8 1 2 9 2 8 1
foo one 2 7 3 6 5 5 5 2
two 3 4 6 2 7 7 1 2
qux one 0 8 5 9 5 5 7 3
two 7 4 0 7 3 6 8 6
I recently found that I can select multiple levels by indexing with a tuple of tuples
>>> df.loc[( ('bar','baz'), ), :]
0 1 2 3 4 5 6 7
first second
bar one 0 5 5 5 6 2 6 8
two 2 6 9 0 3 6 7 9
baz one 9 0 9 9 2 5 7 4
two 4 8 1 2 9 2 8 1
Or even select at multiple depths of levels
>>> df.loc[( ('bar','baz'), ('one',) ), :]
0 1 2 3 4 5 6 7
first second
bar one 0 5 5 5 6 2 6 8
baz one 9 0 9 9 2 5 7 4
The issue is this: if I add any levels to the index tuple that don't exist in the dataframe, pandas drops them silently
>>> df.loc[( ('bar','baz','xyz'), ('one',) ), :]
0 1 2 3 4 5 6 7
first second
bar one 0 5 5 5 6 2 6 8
baz one 9 0 9 9 2 5 7 4
It seems to me like this should raise an exception since
- The shape of the dataframe that is returned in this instance is not what you'd expect
- There's no way to unambiguously fill the returned dataframe with NaNs where a level didn't exist (as is done in the case where there is only a single level index)