Writing a DataFrame in chunks with built-in Dataiku functionality
Hi team,
I am trying to write a data frame with 1000 columns in chunks, because it does not fit in memory. I am writing it to a SQL database table, but I receive a schema error. The target table is empty since I just created it.
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
inp = dataiku.Dataset("dto_1")
out = dataiku.Dataset("dto_features_unswifted_1")

with out.get_writer() as writer:
    for df in inp.iter_dataframes(chunksize=10500):
        # Write the processed dataframe
        writer.write_dataframe(df)
Best Answer
Hi @PapaA!
Welcome to the community!
I think you did not attach the right file, but I guess the error said that the output schema had 0 columns while the input had 1000.
To fix this error, you need to proceed in two steps. First, replicate the schema on the output dataset, and then load your data.
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
inp = dataiku.Dataset("dto_1")
out = dataiku.Dataset("dto_features_unswifted_1")

# Step 1: replicate the input schema on the (empty) output dataset
out.write_schema_from_dataframe(inp.get_dataframe())

# Step 2: load the data chunk by chunk
with out.get_writer() as writer:
    for df in inp.iter_dataframes(chunksize=10500):
        # Write the processed dataframe
        writer.write_dataframe(df)
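One thing to keep in mind: `inp.get_dataframe()` in the snippet above loads the whole input into memory just to derive the schema, which works against the chunked write. Below is a minimal sketch of an alternative that copies the schema from the dataset metadata instead. It assumes the input dataset already has a schema defined and that `read_schema()` and `write_schema()` are available on `dataiku.Dataset` in your DSS version. The helper name `copy_in_chunks` is hypothetical and not part of the Dataiku API.

```python
def copy_in_chunks(inp, out, chunksize=10500):
    """Copy `inp` to `out` chunk by chunk and return the number of rows written.

    `inp` and `out` are expected to behave like dataiku.Dataset objects
    (assumption: read_schema / write_schema / iter_dataframes / get_writer).
    """
    # Copy the column definitions from the input metadata; no rows are loaded here.
    out.write_schema(inp.read_schema())

    rows_written = 0
    with out.get_writer() as writer:
        for df in inp.iter_dataframes(chunksize=chunksize):
            # Only one chunk is held in memory at a time.
            writer.write_dataframe(df)
            rows_written += len(df)
    return rows_written


# Usage inside a DSS Python recipe (dataset names taken from the question):
#   import dataiku
#   copy_in_chunks(dataiku.Dataset("dto_1"),
#                  dataiku.Dataset("dto_features_unswifted_1"))
```

If the chunks are transformed before writing, the output columns may no longer match the input schema. In that case the schema would have to be derived from a transformed sample instead.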
If this is not the error you were receiving, could you please check that you sent the right file?
Have a great day,
Henri
Answers
Hi Henri,
After implementing this solution we still run into memory issues, even though we are using really small chunks. The data frame that we are trying to write to SQL has 900 columns and 70K rows. The machine we are using for Dataiku has 128 GB of RAM and 16 cores.
Can this behaviour be attributed to something else?
Kr,
Al
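For a rough sense of scale, here is a back-of-the-envelope estimate of the in-memory size of the data described above. It assumes every column is a dense 8-byte numeric type such as `float64`, which is an assumption and not something stated in the thread. String or object columns can take considerably more. The function name `estimate_dense_bytes` is hypothetical.

```python
def estimate_dense_bytes(n_rows, n_cols, bytes_per_cell=8):
    """Approximate size of a dense numeric table (assumes 8-byte cells)."""
    return n_rows * n_cols * bytes_per_cell


full_table = estimate_dense_bytes(70_000, 900)   # whole data frame
one_chunk = estimate_dense_bytes(10_500, 900)    # one chunk of 10500 rows

print(f"full table: {full_table / 1e6:.1f} MB")  # full table: 504.0 MB
print(f"one chunk:  {one_chunk / 1e6:.1f} MB")   # one chunk:  75.6 MB
```

Under that assumption the whole table is around half a gigabyte, far below 128 GB, so the limit being hit may well be somewhere other than the Python process.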
Hey @PapaA,
What kind of SQL dataset are you using?
Searching the web for this error, I think this page could give you some information on how to solve it: https://mariadb.com/kb/en/troubleshooting-row-size-too-large-errors-with-innodb/
If you do not find the solution, I'd be happy to help.