Description
I recently got bitten again by the mixed-types DtypeWarning while processing a CSV file. I assume that at some point, when StringDtype is no longer experimental, read_csv() will use it instead of object dtype, and this potential source of problems will go away.
In the meantime, though, would it be possible to add an option to read_csv() to use StringDtype instead of object dtype? That would serve early adopters and people who want to try it out, and it would also be a nice migration path for when StringDtype is ready: at that point it would only be a matter of flipping the default for this switch. People who need to revert to object dtype for some reason would then have a way to do that too. Thoughts? Or is it still too soon for even experimental use of StringDtype in read_csv()?
I'd be willing to create a pull request for this (at least for the Python version of the CSV parser).
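
For reference, here is a minimal sketch of the per-column workaround that, as far as I can tell, already works (assuming pandas 1.0 or later): passing the "string" dtype alias explicitly through the existing dtype argument. The sample data and column names are made up for illustration.

```python
import io

import pandas as pd

# Made-up sample: the "code" column mixes text-like and number-like values.
csv_data = "id,code\n1,A1\n2,17\n3,B3\n"

# Opt in to StringDtype for one column via the existing dtype argument.
df = pd.read_csv(io.StringIO(csv_data), dtype={"code": "string"})

# The column is backed by StringDtype, and "17" stays a string.
print(df["code"].dtype)
print(df["code"].tolist())
```

This only helps when the text columns are known in advance and listed one by one. The option proposed above would cover the case where every text column should get StringDtype without naming each of them.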