BUG? merging on column of empty frame with index of right frame #15692

Open
@jorisvandenbossche

It is a rather specific corner case, but there has been a change in behaviour when merging an empty frame:

In [1]: pd.__version__
Out[1]: '0.19.2'

In [2]: left = pd.DataFrame(columns=['key', 'col_left'])

In [3]: left
Out[3]: 
Empty DataFrame
Columns: [key, col_left]
Index: []

In [4]: right = pd.DataFrame({'col_right': ['a', 'b', 'c']})

In [5]: right
Out[5]: 
  col_right
0         a
1         b
2         c

In [6]: left.merge(right, left_on='key', right_index=True, how="right")
Out[6]: 
   key col_left col_right
0    0      NaN         a
1    1      NaN         b
2    2      NaN         c

vs

In [10]: pd.__version__
Out[10]: u'0.18.1'

In [11]: left = pd.DataFrame(columns=['key', 'col_left'])

In [12]: left
Out[12]: 
Empty DataFrame
Columns: [key, col_left]
Index: []

In [13]: right = pd.DataFrame({'col_right': ['a', 'b', 'c']})

In [14]: right
Out[14]: 
  col_right
0         a
1         b
2         c

In [15]: left.merge(right, left_on='key', right_index=True, how="right")
Out[15]: 
   key col_left col_right
0  NaN      NaN         a
1  NaN      NaN         b
2  NaN      NaN         c

So in 0.19 the 'key' column has values, whereas in 0.18 it holds NaNs. The key column comes from the empty frame (it had no values, so how can it have values now?), but it is merged with the index of the right frame (which of course has values; should these end up in the 'key' column of the resulting frame?)
It is such a strange case that I am actually not sure which of the two is the expected behaviour (and also not sure whether this was an intentional change in behaviour).
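
To check which of the two behaviours a given installation shows, the sessions above can be condensed into a small script. This is only a sketch: `classify_key_column` is a hypothetical helper written for this report, not part of pandas, and the script deliberately does not assume which result is the correct one.

```python
import pandas as pd


def classify_key_column(result: pd.DataFrame) -> str:
    """Describe what ended up in the 'key' column of the merged frame.

    Hypothetical helper for this report; not a pandas API.
    """
    key = result["key"]
    if key.isna().all():
        return "all NaN (as seen in 0.18.1)"
    if key.tolist() == [0, 1, 2]:
        return "right index values (as seen in 0.19.2)"
    return "something else"


# Same frames as in the sessions above: an empty left frame and a
# right frame with a default integer index.
left = pd.DataFrame(columns=["key", "col_left"])
right = pd.DataFrame({"col_right": ["a", "b", "c"]})

# Merge the (empty) 'key' column of the left frame against the right index.
result = left.merge(right, left_on="key", right_index=True, how="right")

print(pd.__version__, "->", classify_key_column(result))
```

Running this under different pandas versions shows where the behaviour changed without having to compare the printed frames by eye.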

Encountered here: geopandas/geopandas#422

Labels: Bug; Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)