Is there a way to split an output file into smaller chunks?
Witw
Registered Posts: 2 ✭
Hi, I have a Dataiku flow that formats data and writes CSV files to Azure Blob Storage.
However, the output files are over 100 MB each. We would like to control the output file size by splitting the data into smaller files of no more than 50 MB each.
Is this possible in a Dataiku flow?
Thanks
Operating system used: Windows
Answers
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron
This is certainly possible. Calculate your average row size, work out how many rows fit in 50 MB, and then split the data into files of that many rows each. It would need Python, but it's not that complicated to do.
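To make that concrete, here is a rough sketch of the row-count approach in plain Python, using only the standard library. The function names, the sample size, and the 10% safety margin are my own assumptions, not anything built into Dataiku. Reading the rows from your dataset and writing each chunk to the Azure-backed output are left out, since that depends on how your flow is set up.

```python
import csv
import io


def csv_size_bytes(header, rows):
    """Return the size in bytes of header + rows serialized as UTF-8 CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return len(buf.getvalue().encode("utf-8"))


def rows_per_file(header, rows, max_bytes, sample_size=1000, safety=0.9):
    """Estimate how many rows fit in max_bytes, based on the average row size.

    sample_size and safety are hypothetical tuning knobs: the average is taken
    over the first sample_size rows, and safety leaves headroom because real
    rows vary in length.
    """
    sample = rows[:sample_size]
    header_bytes = csv_size_bytes(header, [])
    sample_bytes = csv_size_bytes(header, sample) - header_bytes
    avg_row_bytes = max(sample_bytes / max(len(sample), 1), 1)
    budget = max_bytes * safety - header_bytes
    return max(1, int(budget // avg_row_bytes))


def split_rows(rows, n):
    """Yield consecutive slices of at most n rows."""
    for start in range(0, len(rows), n):
        yield rows[start:start + n]


# Small demo with made-up data and a 50 KB limit so it runs quickly.
# For the real case the limit would be 50 * 1024 * 1024.
header = ["id", "name"]
rows = [[i, f"customer-{i:06d}"] for i in range(10_000)]
limit = 50_000

n = rows_per_file(header, rows, limit)
chunks = list(split_rows(rows, n))
for index, chunk in enumerate(chunks):
    print(f"part_{index:04d}.csv", len(chunk), "rows", csv_size_bytes(header, chunk), "bytes")
```

Because the row count is only an estimate, a file can still come out over the limit if some rows are much longer than the sampled average. If 50 MB is a hard cap, lower the safety factor or check each chunk's actual size before writing it.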