Scenario to "build all"

jack
Level 3
Is there an option in a scenario to "build all" branches of a project, like the one that exists in the Flow UI?
Dataiku_scenario_build_all.png

To give some context:

I have a project with multiple "download" inputs, a common central part, and multiple output folders on different branches.

 

flow_explanation_1.png

 

The first step in my scenario builds only the common part with all the inputs. For this step I use the "force rebuild" option on the last dataset of the common part.

flow_explanation_2.png


Then I want to build only the branches after the common part (only the datasets and folders inside the green square).
The branches should be built one after another (step 2, then step 3, then step 4, ... then step N).
For these steps I use the "build required" option, choosing the last folder of each branch.

flow_explanation_3.png



The problem is that for steps 2, 3, 4, ... N, Dataiku rebuilds datasets from the beginning of the flow.

 

So a simple solution would be to find a way to "build all" once and for all, but maybe there is a better solution?

Thanks!

3 Replies
Manuel
Dataiker

Hi Jack, 

Force Rebuild is smart: if you have a single step that lists all the output datasets, DSS determines the dependencies and builds each required dataset only once.

You can check this in the job, while it's running or in the log.
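
To illustrate why a single step with all outputs avoids repeated work, here is a minimal sketch in plain Python. It is a toy model, not DSS code and not how DSS is actually implemented: the dataset names and the `FLOW`, `upstream` and `build_plan` helpers are all hypothetical. It only shows that resolving dependencies for all targets together yields each dataset once, while resolving them per target repeats the common part.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical flow: each dataset maps to the datasets it is built from.
FLOW = {
    "input_a": set(),
    "input_b": set(),
    "common": {"input_a", "input_b"},
    "branch_1_out": {"common"},
    "branch_2_out": {"common"},
}

def upstream(flow, targets):
    """Return the targets plus everything they depend on, each listed once."""
    needed, stack = set(), list(targets)
    while stack:
        name = stack.pop()
        if name not in needed:
            needed.add(name)
            stack.extend(flow[name])
    return needed

def build_plan(flow, targets):
    """Order the needed datasets so that dependencies come before dependents."""
    needed = upstream(flow, targets)
    sub_flow = {name: flow[name] & needed for name in needed}
    return list(TopologicalSorter(sub_flow).static_order())

# One step listing both outputs versus one step per output.
one_step = build_plan(FLOW, ["branch_1_out", "branch_2_out"])
two_steps = build_plan(FLOW, ["branch_1_out"]) + build_plan(FLOW, ["branch_2_out"])

print(one_step.count("common"), two_steps.count("common"))
```

In this toy model the common dataset appears once in the combined plan and once per branch in the separate plans, which mirrors the behaviour described in the question.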

I hope this helps.

Best regards

jack
Level 3
Author

That could be an alternative solution to my problem, thanks!

If I really want a separate step for each branch, is there no solution?

Manuel
Dataiker

If you still want to build your flow in separate steps, there is something else you can do:

  • In the dataset settings you can control when a dataset is built (see attached)
  • In short, you can make a forced rebuild stop at a given dataset.
  • So you can have one step that builds what is upstream of that dataset and another that builds what is downstream of it (dependencies are always resolved right to left, from the target back towards the inputs)
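
The idea of stopping a forced rebuild at a dataset can be sketched in plain Python as follows. This is only a toy model under stated assumptions, not DSS code: the `FLOW` dictionary, the dataset names, the `forced_build_set` helper and its `stop_at` parameter are all hypothetical stand-ins for the per-dataset setting mentioned above.

```python
# Hypothetical flow: each dataset maps to the datasets it is built from.
FLOW = {
    "input_a": [],
    "common": ["input_a"],
    "branch_1_mid": ["common"],
    "branch_1_out": ["branch_1_mid"],
}

def forced_build_set(flow, target, stop_at=frozenset()):
    """Collect the datasets a recursive forced build of `target` would rebuild.

    Datasets in `stop_at` are treated as already built: they are skipped and
    the walk does not continue upstream of them. The target itself is always
    built, even if it is in `stop_at`.
    """
    to_build, stack = set(), [target]
    while stack:
        name = stack.pop()
        if name in to_build or (name in stop_at and name != target):
            continue
        to_build.add(name)
        stack.extend(flow[name])  # walk right to left, towards the inputs
    return to_build

# Without a stop, the branch step reaches back to the inputs.
everything = forced_build_set(FLOW, "branch_1_out")
# With the walk stopped at "common", only the branch itself is rebuilt.
branch_only = forced_build_set(FLOW, "branch_1_out", stop_at={"common"})

print(sorted(everything))
print(sorted(branch_only))
```

A first scenario step would then target `common` explicitly, and each later step would target the last item of one branch.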

I hope this helps.

Best regards
