Scenario to "build all"

jack
jack Dataiku DSS Core Designer, Registered Posts: 13 ✭✭✭✭

Is there an option in a scenario to "build all" branches of a project, as there is in the Flow UI?
Dataiku_scenario_build_all.png

To explain my question with context :

I have a project with multiple "download" inputs, a common central part, and multiple output folders on different branches.

flow_explanation_1.png

The first step in my scenario builds only the common part with all the inputs. For that step I use the "force rebuild" option on the last dataset of the common part.

flow_explanation_2.png


Then I want to build only the branches after the common part (only the datasets and folders inside the green square).
The branches should be built one after another (step 2, then step 3, then step 4, ... then step N).
For these steps I use the "build required" option, choosing the last folder of each branch.

flow_explanation_3.png



The problem is that for steps 2, 3, 4, ... N, Dataiku rebuilds the datasets from the beginning of the flow.

So a simple solution would be to find a way to "build all" once and for all, but maybe there is a better one?

Thanks!

Answers

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi Jack,

    The Force Rebuild is smart, so if you have one single step listing all output datasets, DSS will actually determine the dependencies and build all required datasets only once.

    You can check this in the job, while it's running or in the log.
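To illustrate the idea of one step with several outputs, here is a minimal Python sketch. It is not DSS code and does not claim to reproduce how DSS computes its jobs: the `FLOW` dictionary, the dataset names, and the `build_plan` helper are all hypothetical. It only shows that when every output is given as a target at once, the shared upstream part appears a single time in the resulting build order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy flow: each dataset maps to the datasets it is built from.
FLOW = {
    "input_a": [],
    "input_b": [],
    "common": ["input_a", "input_b"],
    "branch_1_out": ["common"],
    "branch_2_out": ["common"],
    "branch_3_out": ["common"],
}


def build_plan(flow, targets):
    """Return every dataset needed for the targets, once each, upstream first."""
    needed = set()
    stack = list(targets)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(flow[name])  # walk towards the inputs
    # Keep only the needed part of the graph, then order it so that
    # every dataset comes after the datasets it depends on.
    subgraph = {name: flow[name] for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


plan = build_plan(FLOW, ["branch_1_out", "branch_2_out", "branch_3_out"])
print(plan)
```

Listing the three branch outputs as targets of one call is the analogue of putting all output datasets in a single scenario step: `common` and the inputs are planned once, not once per branch.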

    I hope this helps.

    Best regards

  • jack
    jack Dataiku DSS Core Designer, Registered Posts: 13 ✭✭✭✭

    That could be an alternative solution to my problem, thanks!

    If I really want a separate step for each branch, is there no solution?

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    If you still want to build your flow in two steps, there is something else you can do:

    • In the dataset settings you can control when a dataset is built (see attached)
    • In summary, you can make a force rebuild stop at a given dataset.
    • So you can have a step that builds downstream from a dataset and another that builds upstream from that dataset (always right to left)
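The effect described in the bullets above can be sketched in plain Python. This is only an illustration under assumptions, not DSS code: the `FLOW` dictionary, the `STOP_AT` set, and the `upstream_to_build` helper are hypothetical, and `STOP_AT` stands in for whatever dataset setting is shown in the attachment. The sketch walks from a target towards the inputs (right to left) and does not go past a flagged dataset unless that dataset is itself the target.

```python
# Hypothetical toy flow: each dataset maps to the datasets it is built from.
FLOW = {
    "input_a": [],
    "input_b": [],
    "common": ["input_a", "input_b"],
    "branch_1_mid": ["common"],
    "branch_1_out": ["branch_1_mid"],
    "branch_2_out": ["common"],
}

# Assumption: datasets flagged so that a recursive build stops at them.
STOP_AT = {"common"}


def upstream_to_build(flow, target, stop_at):
    """Collect the target and its ancestors without walking past flagged datasets."""
    to_build = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in to_build:
            continue
        if name in stop_at and name != target:
            # Treated as already up to date: not rebuilt, not traversed.
            continue
        to_build.add(name)
        stack.extend(flow[name])
    return to_build


# Step 1: target the flagged dataset itself, so the common part is built.
step_1 = upstream_to_build(FLOW, "common", STOP_AT)
# Steps 2..N: target each branch output; the walk stops at "common".
step_2 = upstream_to_build(FLOW, "branch_1_out", STOP_AT)
step_3 = upstream_to_build(FLOW, "branch_2_out", STOP_AT)
print(sorted(step_1), sorted(step_2), sorted(step_3))
```

With this split, the first step covers the inputs and the common part, and each later step covers only its own branch.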

    I hope this helps.

    Best regards
