
Visual bug in "prepare" view precluding sample size configuration

Level 3

I have trouble with the visual presentation in the "prepare" view. The name of the current dataset is displayed in very large characters, which hides other elements (and is itself unreadable). Most importantly, it makes it impossible to change the sample size, so some operations cannot be previewed because of insufficient memory, which in turn makes it impossible to root out errors. I attach a screen clip of how it looks; if I press "Configure sample", the only thing I get is the link to the dataset, which is not very useful.






Until this is resolved, is there any other way I can change the sample size for a dataset? I cannot seem to find any "property" or other setting anywhere.



/kjell

4 Replies
Dataiker

Your browser window is probably a bit too small for Dataiku DSS to feel at ease, so some elements wrap around 😕



As an alternative way to access the sample settings, you can use the left drawer:


To edit the sample size in a preparation script, you can click on "design sample"; hopefully this is not blocked in your view.



Level 3
Author
Thanks a lot, that was what I needed!
Level 3
Author
Actually, that was not the help I needed, because when I am editing a script, the sample size setting is not available in the drawer.

What I am trying to do: I have a dataset with one column of JSON data, and I want to expand the items in each JSON value into separate rows. However, when I apply the "fold array" operation, Dataiku runs out of memory before the sample is processed, and while editing the script I cannot change the sample size!

While I can run the script anyway, it then produces tons of errors. Without a way to change the sample size, I cannot find out what is wrong in the preview.

BTW, the same operation works fine on a different dataset from the same source. The only difference is that now (when it doesn't work) the dataset comes from a set of partitioned files, whereas when it worked the dataset came from a single file of that same set.
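
To make the intended result of the fold step concrete, here is a minimal Python sketch that runs outside DSS. It is not Dataiku's implementation: the function name `fold_array`, the column name `payload`, and the sample rows are all hypothetical. It assumes each cell holds a JSON array as text, emits one row per array element, and collects the rows that fail to parse so they can be inspected without relying on the preview.

```python
import json


def fold_array(rows, column):
    """Expand the JSON array stored in `column` into one row per element.

    Returns (folded_rows, errors). `errors` holds (row_index, reason)
    for every row whose value could not be read as a JSON array.
    """
    folded, errors = [], []
    for index, row in enumerate(rows):
        try:
            items = json.loads(row[column])
        except (TypeError, ValueError) as exc:
            # TypeError: value is not a string; ValueError: malformed JSON.
            errors.append((index, f"invalid JSON: {exc}"))
            continue
        if not isinstance(items, list):
            errors.append((index, "not a JSON array"))
            continue
        for item in items:
            new_row = dict(row)      # copy the other columns unchanged
            new_row[column] = item   # replace the array with one element
            folded.append(new_row)
    return folded, errors


# Hypothetical sample data, not taken from the actual dataset.
sample = [
    {"id": 1, "payload": '[{"k": "a"}, {"k": "b"}]'},
    {"id": 2, "payload": '{"k": "c"}'},  # an object, not an array
    {"id": 3, "payload": "[1, 2"},       # truncated JSON
]
folded, errors = fold_array(sample, "payload")
for index, reason in errors:
    print(f"row {index}: {reason}")
```

Running something like this over a small extract of the partitioned files and of the single file could show whether the two sources differ in the shape of the JSON column. Note that in this sketch a row with an empty array produces no output rows.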
Dataiker
I edited the answer above; hope it helps.