Unable to write dataframe to csv via hdfs_client using pandas 1.0.1 #32745

Closed
@ghost

Description

I am trying to save a DataFrame to CSV with df.to_csv(writer), passing the writer returned by hdfs_client.write(), but it raises the following error:
"ValueError: Invalid file path or buffer object type: <class 'hdfs.util.AsyncWriter'>"
Code:

with hdfs_client.write('/some/existing/path/in/datalake/dummy.csv', encoding='utf-8') as writer:
    df.to_csv(writer, encoding='utf-8')

The issue was also mentioned in #21560
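
One possible workaround, sketched here under assumptions, is to let pandas render the CSV text itself and pass the resulting string to the writer, so that pandas never has to validate the writer object. `FakeWriter` below is a hypothetical stand-in for whatever `hdfs_client.write(...)` yields; the only thing assumed about the real writer is that it exposes `write()` and accepts text when opened with `encoding='utf-8'`.

```python
import pandas as pd


class FakeWriter:
    """Hypothetical stand-in for the writer yielded by hdfs_client.write(...)."""

    def __init__(self):
        self.chunks = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions

    def write(self, data):
        self.chunks.append(data)


df = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

with FakeWriter() as writer:
    # With no path or buffer argument, to_csv returns the CSV as a string.
    writer.write(df.to_csv(index=False))

csv_text = "".join(writer.chunks)
```

This holds the whole CSV in memory before writing, so it may not suit very large frames.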
Error:

ValueError                                Traceback (most recent call last)
<ipython-input-10-a363c2611af8> in <module>
      1 with hdfs_client.write('/shared/ml/data/sfd.csv', encoding='utf-8') as writer:
----> 2     df.to_csv(writer,encoding='utf-8')

/opt/conda/lib/python3.7/site-packages/pandas/core/generic.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, date_format, doublequote, escapechar, decimal)
   3200             doublequote=doublequote,
   3201             escapechar=escapechar,
-> 3202             decimal=decimal,
   3203         )
   3204         formatter.save()

/opt/conda/lib/python3.7/site-packages/pandas/io/formats/csvs.py in __init__(self, obj, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, encoding, compression, quoting, line_terminator, chunksize, quotechar, date_format, doublequote, escapechar, decimal)
     64 
     65         self.path_or_buf, _, _, _ = get_filepath_or_buffer(
---> 66             path_or_buf, encoding=encoding, compression=compression, mode=mode
     67         )
     68         self.sep = sep

/opt/conda/lib/python3.7/site-packages/pandas/io/common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode)
    198     if not is_file_like(filepath_or_buffer):
    199         msg = f"Invalid file path or buffer object type: {type(filepath_or_buffer)}"
--> 200         raise ValueError(msg)
    201 
    202     return filepath_or_buffer, None, compression, False

ValueError: Invalid file path or buffer object type: <class 'hdfs.util.AsyncWriter'>
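
The traceback shows the error is raised when `is_file_like(filepath_or_buffer)` returns False. As documented for `pandas.api.types.is_file_like`, an object counts as file-like only if it has a `read` or `write` method and is also iterable. A writer that exposes `write()` but not `__iter__` would therefore be rejected, which may be what happens with `AsyncWriter` (an assumption, since its attributes are not shown in this report). The sketch below is a rough pure-Python approximation of that check, not the pandas source; `looks_file_like` and `WriteOnlySink` are hypothetical names.

```python
import io


def looks_file_like(obj):
    """Approximate the documented rule: a read/write method AND iterability."""
    has_io_method = hasattr(obj, "read") or hasattr(obj, "write")
    return has_io_method and hasattr(obj, "__iter__")


class WriteOnlySink:
    """Hypothetical writer that has write() but does not define __iter__."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


sink_ok = looks_file_like(WriteOnlySink())   # has write(), lacks __iter__
buffer_ok = looks_file_like(io.StringIO())   # has write() and __iter__
```

Under this rule the write-only object fails the check while `io.StringIO` passes, which matches the shape of the error reported above.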

Metadata

    Labels

    IO HDF5 (read_hdf, HDFStore); Needs Info (Clarification about behavior needed to assess issue)
