Save DataFrame to a managed folder
galapah
Registered Posts: 2 ✭✭✭
I am trying to save a pandas DataFrame to a managed folder in Dataiku.
My code:
import dataiku
import pandas as pd
temp_folder = "reports_TEMP"
path_upload_file = "testfile.csv"
df = pd.DataFrame(range(0,10), columns=["test"])
handle = dataiku.Folder(temp_folder)
with handle.get_writer(path_upload_file) as w:
    df.to_csv(w)
and this is the error that I get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-135-e90a7150097c> in <module>
8 handle = dataiku.Folder(temp_folder)
9 with handle.get_writer(path_upload_file) as w:
---> 10 df.to_csv(w)
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/core/generic.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, date_format, doublequote, escapechar, decimal)
3200 doublequote=doublequote,
3201 escapechar=escapechar,
-> 3202 decimal=decimal,
3203 )
3204 formatter.save()
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/io/formats/csvs.py in __init__(self, obj, path_or_buf, sep, na_rep, float_format, cols, header, index, index_label, mode, encoding, compression, quoting, line_terminator, chunksize, quotechar, date_format, doublequote, escapechar, decimal)
64
65 self.path_or_buf, _, _, self.should_close = get_filepath_or_buffer(
---> 66 path_or_buf, encoding=encoding, compression=compression, mode=mode
67 )
68 self.sep = sep
/data/dataiku/dataiku-dss-6.0.1/dss_data/code-envs/python/Py_36_flight_risk/lib/python3.6/site-packages/pandas/io/common.py in get_filepath_or_buffer(filepath_or_buffer, encoding, compression, mode)
198 if not is_file_like(filepath_or_buffer):
199 msg = f"Invalid file path or buffer object type: {type(filepath_or_buffer)}"
--> 200 raise ValueError(msg)
201
202 return filepath_or_buffer, None, compression, False
ValueError: Invalid file path or buffer object type: <class 'dataiku.core.managed_folder.ManagedFolderWriter'>
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,384
Hi,
You can use the Export to Folder recipe to export a DSS dataset to a managed folder.

If you would rather do this in code, you can try Folder.upload_data:
https://doc.dataiku.com/dss/latest/python-api/managed_folders.html#dataiku.Folder.upload_data
The following sample worked fine for me:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
base_64 = dataiku.Dataset("base_64")
df = base_64.get_dataframe()

managed_folder_id = "output"
output_folder = dataiku.Folder(managed_folder_id)
filename = "my_file.csv"
output_folder.upload_data(filename, df.to_csv(index=False).encode("utf-8"))
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,384
Additionally, if you actually want to use get_writer(), you can use it like this:
import dataiku
import pandas as pd

temp_folder = "output"
path_upload_file = "chunck_written.csv"
input_dataset = dataiku.Dataset("dataset_name")
handle = dataiku.Folder(temp_folder)
df = input_dataset.get_dataframe()

with handle.get_writer(path_upload_file) as w:
    w.write(df.to_csv().encode('utf-8'))
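For anyone who wants to try the pattern outside DSS, here is a minimal sketch of why the original call fails and why the version above works. BytesSink is a hypothetical stand-in, not the Dataiku class: it assumes the managed-folder writer has write() and accepts bytes but is not iterable, which is what the traceback suggests. pandas only accepts a buffer that has read() or write() and also defines __iter__, so the sketch serializes to a string first and writes the encoded bytes itself.

```python
import pandas as pd
from pandas.api.types import is_file_like


class BytesSink:
    """Hypothetical stand-in for a managed-folder writer (assumption, not the Dataiku API).

    It exposes write() and works as a context manager, but it is not iterable.
    """

    def __init__(self):
        self.chunks = []

    def write(self, data):
        # Assumption: like a binary stream, the sink takes bytes, not str.
        if not isinstance(data, bytes):
            raise TypeError("this sink accepts bytes only")
        self.chunks.append(data)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


df = pd.DataFrame(range(0, 10), columns=["test"])

with BytesSink() as w:
    # pandas' own check: needs read()/write() AND __iter__, so this is False here.
    file_like = is_file_like(w)
    # With no path argument, to_csv() returns the CSV as a str; encode it before writing.
    w.write(df.to_csv(index=False).encode("utf-8"))

payload = b"".join(w.chunks)
```

In DSS you would replace BytesSink() with handle.get_writer(path_upload_file) as in the snippet above; passing index=False is optional and only drops the row-index column.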
Thank you, Alex!
I need the second solution - just to test writing into a managed folder for another task.
It works!