Unable to write to dataset - TypeError: 'ObjectBlock' object is not iterable

emher
Level 3
When I try to write a pandas dataframe to a dataset, i.e.

 

df = pd.DataFrame(...)
ds = dataiku.Dataset(...)
ds.write_with_schema(df, dropAndCreate=True)

 

I get the following error:

 

TypeError: 'ObjectBlock' object is not iterable

 

Has anyone tried something similar, or do you know what might be going wrong? Inspecting the dataset, I can see that the schema is written as intended, but no data is written.

EDIT: The error occurs only when I call the code from outside Dataiku. If I create a recipe inside Dataiku, it works as intended.

3 Replies
fchataigner2
Dataiker

Hi,

If it works in DSS, then the cause may be a discrepancy in package versions, notably of Pandas and/or Numpy. Can you post the `pip list` output of the Python environment where you get this error? Also, what is the full stack trace of the error (i.e. where is it raised from)?
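As a complement to `pip list`, here is a minimal sketch for comparing the two environments: run it once in the local environment and once in a DSS notebook or recipe, then diff the output. The helper name `collect_versions` is hypothetical and not part of the dataiku package; it only relies on the standard library and on the `__version__` attribute that pandas and numpy expose.

```python
import importlib
import platform


def collect_versions(modules=("pandas", "numpy")):
    """Return the Python version plus the version of each requested module.

    Hypothetical helper for illustration. A module that cannot be imported
    is reported as None instead of raising.
    """
    report = {"python": platform.python_version()}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = None
        else:
            # Not every module defines __version__, so fall back gracefully.
            report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"{name}=={version}")
```

Any line that differs between the two runs is a candidate for the discrepancy.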

emher
Level 3
Author

The full stack trace is:

 

Traceback (most recent call last):
  File "/home/emher/Projects/pipeline_test/PFC_python_library/tmp.py", line 35, in <module>
    pfc.delivery.deliver(df_ecmwf, df_pfc, row.to_dict())
  File "/home/emher/Projects/pipeline_test/PFC_python_library/pfc/delivery.py", line 330, in deliver
    log_handler(meta_data, delivery_blob, error)
  File "/home/emher/Projects/pipeline_test/PFC_python_library/tmp.py", line 31, in <lambda>
    pfc.delivery.log_handler = lambda x, y, z: log_to_dataset(x, y, z, dataset=dataiku.Dataset("delivery_logs"))
  File "/home/emher/Projects/pipeline_test/PFC_python_library/pfc/delivery.py", line 402, in log_to_dataset
    writer.write_dataframe(df_log)
  File "/home/emher/Projects/pipeline_test/PFC_python_library/venv/lib/python3.8/site-packages/dataiku/core/dataset_write.py", line 395, in write_dataframe
    dku_pandas_csv.DKUCSVFormatter(df, self.remote_writer,
  File "/home/emher/Projects/pipeline_test/PFC_python_library/venv/lib/python3.8/site-packages/dataiku/core/dku_pandas_csv.py", line 197, in save
    self._save()
  File "/home/emher/Projects/pipeline_test/PFC_python_library/venv/lib/python3.8/site-packages/dataiku/core/dku_pandas_csv.py", line 296, in _save
    self._save_chunk(start_i, end_i)
  File "/home/emher/Projects/pipeline_test/PFC_python_library/venv/lib/python3.8/site-packages/dataiku/core/dku_pandas_csv.py", line 318, in _save_chunk
    for col_loc, col in zip(b.mgr_locs, d):
TypeError: 'ObjectBlock' object is not iterable
0 rows successfully written (NBRAVMN4TD)
 
So the error is raised inside dku_pandas_csv. Locally, my numpy and pandas versions are:
 
numpy==1.20.0
pandas==1.2.1
 
I am not sure which versions Dataiku uses internally. Do you know where I can see this?
fchataigner2
Dataiker

Hi,

You're indeed using a very recent Pandas, and Python 3.8 (going by the venv path in your stack trace). DSS's Python code doesn't handle that Pandas version yet, so you should downgrade to pandas>=1.0,<1.1, and possibly switch to Python 3.6 if that doesn't solve the error.
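To catch this before a write fails halfway (schema written, no rows), a script could check the installed Pandas version against the range suggested above. This is only a sketch: the function name `pandas_in_suggested_range` is hypothetical, and the bounds are taken from this reply rather than from any official compatibility matrix.

```python
def pandas_in_suggested_range(version):
    """Return True if `version` satisfies pandas>=1.0,<1.1.

    Hypothetical helper: only the major and minor components are compared,
    and the bounds come from the suggestion in this thread.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (1, 0) <= (major, minor) < (1, 1)


# The version reported earlier in the thread falls outside the range.
print(pandas_in_suggested_range("1.2.1"))
```

In a real script you would pass `pd.__version__` and raise or warn before calling `write_with_schema` when the check fails.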
