Setting values in DataFrame with timezone-aware index fails using .loc #12502

Closed
@kdebrab

In pandas 0.17.1, setting values in a DataFrame with .loc does not seem to work when the index is timezone-aware and the columns are selected with a list:

In [1]: import pandas as pd
   ...: start = pd.Timestamp('2015-7-12', tz='utc')
   ...: end = pd.Timestamp('2015-7-12 12:00', tz='utc')
   ...: timestamps = pd.date_range(start, end, freq='H')
   ...: df1 = pd.DataFrame(index=timestamps, columns=['var'])
   ...: df2 = pd.DataFrame(1.2, index=timestamps, columns=['var'])
   ...: df1.loc[:,['var']] = df2
   ...: df1
Out[1]:
                           var
2015-07-12 00:00:00+00:00  NaN
2015-07-12 01:00:00+00:00  NaN
2015-07-12 02:00:00+00:00  NaN
2015-07-12 03:00:00+00:00  NaN
2015-07-12 04:00:00+00:00  NaN
2015-07-12 05:00:00+00:00  NaN
2015-07-12 06:00:00+00:00  NaN
2015-07-12 07:00:00+00:00  NaN
2015-07-12 08:00:00+00:00  NaN
2015-07-12 09:00:00+00:00  NaN
2015-07-12 10:00:00+00:00  NaN
2015-07-12 11:00:00+00:00  NaN
2015-07-12 12:00:00+00:00  NaN

It does work, however, when ['var'] is replaced by 'var':

In [2]: start = pd.Timestamp('2015-7-12', tz='utc')
   ...: end = pd.Timestamp('2015-7-12 12:00', tz='utc')
   ...: timestamps = pd.date_range(start, end, freq='H')
   ...: df1 = pd.DataFrame(index=timestamps, columns=['var'])
   ...: df2 = pd.DataFrame(1.2, index=timestamps, columns=['var'])
   ...: df1.loc[:,'var'] = df2
   ...: df1
Out[2]: 
                           var
2015-07-12 00:00:00+00:00  1.2
2015-07-12 01:00:00+00:00  1.2
2015-07-12 02:00:00+00:00  1.2
2015-07-12 03:00:00+00:00  1.2
2015-07-12 04:00:00+00:00  1.2
2015-07-12 05:00:00+00:00  1.2
2015-07-12 06:00:00+00:00  1.2
2015-07-12 07:00:00+00:00  1.2
2015-07-12 08:00:00+00:00  1.2
2015-07-12 09:00:00+00:00  1.2
2015-07-12 10:00:00+00:00  1.2
2015-07-12 11:00:00+00:00  1.2
2015-07-12 12:00:00+00:00  1.2

Also, it works with a non-timezone-aware index:

In [3]: start = pd.Timestamp('2015-7-12')
   ...: end = pd.Timestamp('2015-7-12 12:00')
   ...: timestamps = pd.date_range(start, end, freq='H')
   ...: df1 = pd.DataFrame(index=timestamps, columns=['var'])
   ...: df2 = pd.DataFrame(1.2, index=timestamps, columns=['var'])
   ...: df1.loc[:,['var']] = df2
   ...: df1
Out[3]: 
                     var
2015-07-12 00:00:00  1.2
2015-07-12 01:00:00  1.2
2015-07-12 02:00:00  1.2
2015-07-12 03:00:00  1.2
2015-07-12 04:00:00  1.2
2015-07-12 05:00:00  1.2
2015-07-12 06:00:00  1.2
2015-07-12 07:00:00  1.2
2015-07-12 08:00:00  1.2
2015-07-12 09:00:00  1.2
2015-07-12 10:00:00  1.2
2015-07-12 11:00:00  1.2
2015-07-12 12:00:00  1.2
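
To check whether a given pandas installation still shows this behaviour, the list-of-columns case above can be wrapped in a small function. This is only a sketch: the helper name `loc_list_assignment_works` is made up for illustration, and a `Timedelta` step is used in place of `freq='H'` so the script does not depend on how the hourly alias is spelled in a particular pandas version.

```python
import pandas as pd


def loc_list_assignment_works(tz=None):
    """Return True if df1.loc[:, ['var']] = df2 fills every row of df1.

    Hypothetical helper, not part of pandas; it mirrors the snippets above.
    """
    start = pd.Timestamp('2015-07-12', tz=tz)
    end = pd.Timestamp('2015-07-12 12:00', tz=tz)
    # Hourly index built from a Timedelta step instead of the 'H' alias.
    timestamps = pd.date_range(start, end, freq=pd.Timedelta(hours=1))
    df1 = pd.DataFrame(index=timestamps, columns=['var'])
    df2 = pd.DataFrame(1.2, index=timestamps, columns=['var'])
    # The assignment reported in this issue: columns selected with a list.
    df1.loc[:, ['var']] = df2
    # If the bug is present, the column is still all NaN after assignment.
    return bool(df1['var'].notna().all())


print('pandas version:', pd.__version__)
print('naive index:', loc_list_assignment_works())
print('UTC index:  ', loc_list_assignment_works('utc'))
```

On an affected version (0.17.1 as reported here), the UTC line would be expected to print `False` while the naive line prints `True`.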
