Only the sample dataset as a result

leaw Dataiku DSS Core Designer, Registered Posts: 1 ✭✭✭

Hello, I am new to Dataiku.

When I analyse or download my final dataset, the operation only runs on the sample (the first records) defined in the preparation steps, whereas I want it to run on all my data.

I read the documentation but could not find anything about this. Maybe it is my settings, or maybe there is a size limit? (I use a professional licence.) Also, the prepared datasets were run on local stream; maybe that explains it?

Thank you for your help

Answers

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Hi @leaw, and welcome to the Dataiku Community! While you wait for a more complete response, I wanted to at least provide some helpful information.

    By default, the first 10,000 records of your dataset are selected for the sample. While this sampling method does not provide the best sample quality, it allows you to get your sample very quickly, whatever the size of your dataset.
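    To illustrate why results computed on a "first records" sample can differ from results on the full data, here is a minimal plain-Python sketch. It does not use the Dataiku API; the `head_sample` helper and the 50,000-row toy dataset are assumptions made up for this example, and only the 10,000-row default comes from the explanation above.

```python
from itertools import islice


def head_sample(rows, limit=10_000):
    """Return the first `limit` rows, mimicking a 'first records' sample."""
    return list(islice(rows, limit))


# Hypothetical dataset: 50,000 rows whose value is simply the row index.
full = list(range(50_000))
sample = head_sample(full)

sample_count = len(sample)
full_count = len(full)
sample_max = max(sample)
full_max = max(full)

print(f"rows seen in sample: {sample_count} of {full_count}")
print(f"max in sample: {sample_max}, max in full data: {full_max}")
```

    Anything computed on `sample` only reflects the first rows, so counts and aggregates will not match the full dataset unless the sample covers every record.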

    The sampling can be configured in the “Sampling” tab. More information about Sampling can be found here:

    1) Sampling (Documentation)
    2) Concept: Sampling (Academy and Knowledge Base)

    I hope this helps!
