When I analyse or download my final dataset, the operation only runs on the sample (the first records) defined in the preparation steps, whereas I want it to run on all my data.
I read the documentation but could not find anything about this. Maybe it is my parameters, or maybe there is a size limit? (I use a professional licence.) Also, the prepared datasets were run on local stream; could that explain this?
Hi @leaw and welcome to the Dataiku Community! While you wait for a more complete response, I wanted to at least provide some helpful information.
By default, the first 10,000 records of your dataset are selected for the sample. While this sampling method does not provide the best sample quality, it allows you to get your sample very quickly, whatever the size of your dataset.
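To illustrate the trade-off, here is a minimal plain-Python sketch. It is not Dataiku code, and the data and the helper names `head_sample` and `reservoir_sample` are made up for the example. It compares taking the first 10,000 records with drawing a random sample of the same size from a dataset that happens to be sorted by year:

```python
import itertools
import random


def head_sample(records, n):
    """Take the first n records; stops reading as soon as n are collected."""
    return list(itertools.islice(records, n))


def reservoir_sample(records, n, seed=0):
    """Uniform random sample of n records; has to read every record once."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            sample.append(rec)
        else:
            # Keep the new record with probability n / (i + 1)
            j = rng.randint(0, i)
            if j < n:
                sample[j] = rec
    return sample


# Hypothetical dataset: 100,000 rows sorted by year (10,000 rows per year)
rows = [{"id": i, "year": 2000 + i // 10_000} for i in range(100_000)]

head = head_sample(iter(rows), 10_000)
rand = reservoir_sample(iter(rows), 10_000)

head_years = {r["year"] for r in head}
rand_years = {r["year"] for r in rand}

print(len(head), "rows, years seen:", sorted(head_years))
print(len(rand), "rows, distinct years seen:", len(rand_years))
```

In this made-up case the first-records sample only ever sees one year, while the random sample covers several, but it has to scan the whole dataset to get there. That is the speed versus quality trade-off described above.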
The sampling can be configured in the “Sampling” tab. More information about Sampling can be found here: