Series dtype changes when a new row is added #21501

Closed
@asdf8601

Description

The problem

If a Series is created with an explicit dtype and a new row is then added to it, the dtype changes. An example follows:

Example

import pandas as pd
import numpy as np

pd.__version__  # '0.23.1'

# with Series
s = pd.Series([1,2], dtype=np.float64)
print(s.dtype)  # -> float64
s[3] = None
print(s.dtype)  # -> object

# with DataFrames
d = pd.DataFrame([1,2], dtype=np.float64)
print(d.dtypes)  # -> float64
d.loc[3, 0] = None
print(d.dtypes)  # -> float64

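However, the dtype is preserved when None is assigned to a row that already exists. In the session below, enlarging the Series with None first produces object dtype; after casting back to float64, assigning None to the existing row keeps float64: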

In[12]: s = pd.Series([1,2]).astype(np.float64)
In[13]: s[3] = None
In[14]: s
Out[14]: 
0       1
1       2
3    None
dtype: object

In[15]: s = s.astype(np.float64)
In[16]: s
Out[16]: 
0    1.0
1    2.0
3    NaN
dtype: float64

# row 3 (position 2) is already present in s
In[18]: s.iloc[2] = None
In[19]: s
Out[19]: 
0    1.0
1    2.0
3    NaN
dtype: float64

In[20]: s.loc[3] = None
In[21]: s
Out[21]: 
0    1.0
1    2.0
3    NaN
dtype: float64
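
A possible workaround, sketched under the assumption that the goal is simply a float64 Series with a missing value at a new label: extend the index with reindex instead of assigning None to a missing key. The helper name add_missing_row is hypothetical and not part of pandas, and it assumes the label is not already in the index.

```python
import numpy as np
import pandas as pd


def add_missing_row(series, label):
    """Return a copy of `series` with `label` added as a missing value.

    Hypothetical helper for illustration only; assumes `label` is not
    already present in the index.
    """
    # reindex fills labels that are absent from the original index with NaN,
    # so a float64 Series keeps its float64 dtype.
    return series.reindex(series.index.tolist() + [label])


s = pd.Series([1, 2], dtype=np.float64)
s = add_missing_row(s, 3)
print(s.dtype)
```

This sidesteps the enlargement path entirely, so the result does not depend on how a given pandas version handles None during setting with enlargement.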
