
Reading dataset in Python recipe is very slow

I was using the following lines to read a dataset into a pandas DataFrame:

import dataiku
data = dataiku.Dataset('dataset name').get_dataframe()

It takes almost 3 minutes to read the table. Alternatively, if I export the dataset to CSV and read the CSV file, it only takes 12 s. Is there a more efficient way to read the dataset as a pandas DataFrame without creating the intermediate CSV file?
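To compare the two load paths described above on equal terms, a small timing helper can be useful. This is only a sketch: `time_call` is a hypothetical helper name, and the file name `export.csv` in the commented usage is a placeholder, not something taken from the post.

```python
import time


def time_call(label, fn):
    """Run fn() once, print the wall-clock time, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f} s")
    return result, elapsed


# Hypothetical usage inside a Dataiku Python recipe (names are placeholders):
#   import dataiku
#   import pandas as pd
#   df_a, t_a = time_call("get_dataframe",
#                         lambda: dataiku.Dataset("dataset name").get_dataframe())
#   df_b, t_b = time_call("read_csv", lambda: pd.read_csv("export.csv"))
```

Timing each path separately, including the CSV export step itself, shows whether the intermediate file is actually faster end to end.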

2 Replies

How long does it take to export the dataset into CSV?

Hi @Clément_Stenac,


I'm experiencing the same issue as the user above. It takes around 11 minutes to load a 2.2 GB dataset into a dataframe. Running it on my laptop takes around 1 min 15 s.

Exporting the dataset to CSV takes about as long as loading it into the dataframe.

Any tips on how to speed this up or what storage type to use for quicker loads?
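One thing that may be worth trying, offered as a sketch and not as a confirmed fix for this case: read only the columns the recipe needs and stream the dataset in chunks with `Dataset.iter_dataframes`. The helper name `load_dataset` and the default chunk size are assumptions chosen for illustration. Whether this helps depends on where the time is actually spent (storage backend, network, type inference).

```python
import pandas as pd


def load_dataset(dataset, columns=None, chunksize=100_000):
    """Read a dataset chunk by chunk and concatenate into one DataFrame.

    `dataset` is expected to behave like a dataiku.Dataset, i.e. to offer
    iter_dataframes(chunksize=..., columns=...). `columns=None` reads all columns.
    """
    chunks = list(dataset.iter_dataframes(chunksize=chunksize, columns=columns))
    return pd.concat(chunks, ignore_index=True)


# Hypothetical usage inside a Dataiku Python recipe (names are placeholders):
#   import dataiku
#   df = load_dataset(dataiku.Dataset("dataset name"), columns=["col_a", "col_b"])
```

Because the function takes the dataset object as a parameter, each chunk could also be filtered or aggregated before concatenation if the full table is not needed in memory.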


