
numpy.datetime64 casted to Timestamp when added to DataFrame #17183

Closed
@strazdinsg

Description

import numpy
import pandas as pd

# We create a list of strings. 
time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']
# Then we create a time array with 10 ms resolution (values are floored,
# not rounded). Everything looks fine here.
rounded_time = numpy.array(time_str_arr, dtype="datetime64[10ms]")
rounded_time 

# Then we create a Pandas DataFrame and assign the time array to it as a
# column. The datetime64 values are corrupted.
d = {'one' : pd.Series([1., 2., 3.], index=['a', 'b', 'c']),
  'two' : pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])}
df = pd.DataFrame(d)
df = df.assign(wrong_time=rounded_time)
df

# The output I get:
#    one two wrong_time
# a   1.0 1.0 1974-10-01 18:11:07.585
# b   2.0 2.0 1974-10-01 18:11:07.625
# c   3.0 3.0 1974-10-01 18:11:07.645
# d   NaN 4.0 1974-10-01 18:11:07.665

# (the timestamps are wrong)
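
The 1974 dates are consistent with the 10 ms multiplier being dropped, so that the raw integer tick counts are read as plain milliseconds. The numpy-only sketch below reproduces that arithmetic. It is an assumption about what happens inside pandas, not a confirmed diagnosis, and the names `ticks` and `misread` are illustrative only.

```python
import numpy as np

time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']
rounded_time = np.array(time_str_arr, dtype="datetime64[10ms]")

# Raw int64 storage: the number of 10 ms ticks since the Unix epoch.
ticks = rounded_time.view("int64")

# Assumption: reading those ticks as plain milliseconds (multiplier ignored)
# shrinks every offset from the epoch by a factor of 10.
misread = ticks.astype("datetime64[ms]")

print(misread[0])  # 1974-10-01T18:11:07.585
```

The first misread value matches the first row of the output above, which suggests the values themselves are intact and only the unit metadata is lost.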

Problem description

I have a list of timestamps with millisecond precision, encoded as strings. I floor them to 10 ms resolution with numpy, and that works as expected. The bug appears when I add the floored timestamps to a DataFrame as a new column: the datetime64 values are corrupted and come out as dates in 1974 instead of 2017. My suspicion is that the numpy.datetime64 array is converted to some other datatype in the DataFrame.assign() method. It should maintain the same type.
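
A possible workaround, sketched under the assumption that the trouble comes from the multiplied unit `datetime64[10ms]`: convert the array to a unit without a multiplier before handing it to pandas, or do the flooring in pandas itself. The column names `time_np` and `time_pd` and the simplified one-column frame are illustrative and not part of the original report.

```python
import numpy as np
import pandas as pd

time_str_arr = ['2017-06-30T13:51:15.854', '2017-06-30T13:51:16.250',
                '2017-06-30T13:51:16.452', '2017-06-30T13:51:16.659']
df = pd.DataFrame({"one": [1.0, 2.0, 3.0, 4.0]}, index=["a", "b", "c", "d"])

# Option 1: floor with numpy as in the report, then convert to a unit
# without a multiplier (nanoseconds) before assigning the column.
rounded_time = np.array(time_str_arr, dtype="datetime64[10ms]")
df = df.assign(time_np=rounded_time.astype("datetime64[ns]"))

# Option 2: parse and floor entirely in pandas.
df = df.assign(time_pd=pd.to_datetime(time_str_arr).floor("10ms"))

print(df)
```

Both columns should then hold the floored 2017 timestamps, for example 2017-06-30 13:51:15.850 in the first row.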

Output of pd.show_versions()

INSTALLED VERSIONS
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 78 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 5.3.0
sphinx: 1.5.6
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.4.7
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.7.3
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: 1.1.9
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None


    Labels

    Datetime (Datetime data dtype), Dtype Conversions (Unexpected or buggy dtype conversions), Duplicate Report (Duplicate issue or pull request)
