Description
Code Sample
>>> import pandas as pd
>>> import pytz
>>> x = pd.DataFrame(data=[10,20,30], index=pd.date_range(start='2018-06-01T00:00:00Z', periods=3, freq='1h', tz=pytz.UTC))
>>> y = pd.DataFrame(data=[10,20,30], index=pd.date_range(start='2018-06-01T02:00:00', periods=3, freq='1h', tz=pytz.timezone('Europe/Brussels')))
>>> x.index[0]
Timestamp('2018-06-01 00:00:00+0000', tz='UTC', freq='H')
>>> y.index[0]
Timestamp('2018-06-01 02:00:00+0200', tz='Europe/Brussels', freq='H')
>>> x.index[0] == y.index[0]
True
>>> x.index == y.index
array([ True,  True,  True])
>>> x.equals(y)
False
>>> x.index.equals(y.index)
False
Problem description
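I had two DataFrames whose indexes contain the same instants in time but in different timezones. I was caught off guard that they are considered equal by some comparisons but not by others, as in the example above: the scalar and element-wise == comparisons return True, while DataFrame.equals and DatetimeIndex.equals both return False.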
Is it intentional or a bug that the two equals calls return False?
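A possible workaround, assuming the goal is to compare the instants regardless of timezone, is to convert both sides to a common timezone before calling equals. This is only a sketch of that idea, not a statement of the intended behavior. The column name "value" and the variable names are illustrative, and the timezones are passed as strings instead of pytz objects:

```python
import pandas as pd

# Same three instants, expressed in two different timezones.
x = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.date_range("2018-06-01 00:00", periods=3, freq="1h", tz="UTC"),
)
y = pd.DataFrame(
    {"value": [10, 20, 30]},
    index=pd.date_range("2018-06-01 02:00", periods=3, freq="1h", tz="Europe/Brussels"),
)

# Element-wise == compares the underlying instants.
elementwise_equal = bool((x.index == y.index).all())

# Assumption: normalizing both sides to UTC first makes equals()
# compare the instants only, since the dtypes then match.
index_equal = x.index.tz_convert("UTC").equals(y.index.tz_convert("UTC"))
frames_equal = x.tz_convert("UTC").equals(y.tz_convert("UTC"))

print(elementwise_equal, index_equal, frames_equal)
```

Any common timezone would do here; UTC is just the conventional choice.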
Expected Output
Output of pd.show_versions()
pandas: 0.23.1
pytest: None
pip: 10.0.1
setuptools: 39.2.0
Cython: None
numpy: 1.14.5
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None