I uploaded data to Tableau Server and it took about 8 hours. (The dataset is about 200 million rows and 40 columns.)
I ran it using the Tableau Hyper Export plugin (https://www.dataiku.com/product/plugins/tableau-hyper-export/), and the logs show that it reads 5,000 rows into a DataFrame at a time and executes writes of 2,000 rows each.
Is it possible to edit the plugin's parameters to increase upload speed by changing the amount of data written at a time?
Also, any other ideas on how to speed up the upload?
Thank you in advance.
Looking at the plugin code, I can confirm that the writes are indeed executed in batches of 2,000 rows; the batch size is hardcoded to 2000.
You could try editing this parameter in the plugin, but it may only slightly improve the upload speed.
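As a sketch of the pattern being discussed (this is not the plugin's actual code; the function and parameter names below are illustrative), a batched writer typically slices the rows it received and hands each slice to an insert call. Raising the batch size only reduces the number of insert calls per chunk, which is why the gain is often modest:

```python
def write_in_batches(rows, insert_rows, batch_size=2000):
    """Slice `rows` into fixed-size batches and pass each batch to
    `insert_rows` (e.g. a Hyper table inserter). `batch_size` mirrors
    the value hardcoded to 2000 in the plugin."""
    n_batches = 0
    for start in range(0, len(rows), batch_size):
        insert_rows(rows[start:start + batch_size])
        n_batches += 1
    return n_batches
```

With 5,000 rows and a batch size of 2,000, this issues three insert calls (2,000 + 2,000 + 1,000 rows), matching the write pattern visible in the logs.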
Thanks for the advice.
I actually changed the batch size to 10,000 and ran it, but did not see much improvement in speed.
Currently it reads only 5,000 rows into the DataFrame at a time; is it possible to improve throughput by reading more rows per chunk?
I wasn't able to work out which part of the code specifies how many rows are read into the DataFrame.
Any advice on how to increase the number of rows read at a time would be appreciated.
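For reference, the general shape of the read side (again a hypothetical sketch, not the plugin's code; in Dataiku the chunk size would be wherever the plugin iterates over the input dataset) is a loop that pulls a fixed number of rows per iteration. A larger `chunk_size` means fewer read calls for the same data:

```python
from itertools import islice

def iter_chunks(row_iter, chunk_size=5000):
    """Yield lists of up to `chunk_size` rows from any row iterator.
    `chunk_size` mirrors the 5,000-row reads seen in the logs;
    increasing it trades memory for fewer read iterations."""
    it = iter(row_iter)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk
```

Note that if the writes to Tableau Server are the bottleneck (as the limited gain from a 10,000-row batch size suggests), reading larger chunks alone is unlikely to change the total upload time much.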