Scenario to "build all"
Is there an option in a scenario to "build all" branches of a project, as there is in the Flow UI?
To explain my question with some context:
I have a project with multiple "download" inputs, a common central part, and multiple output folders on different branches.
I have a first step in my scenario that builds only the common part with all the inputs. For that step I use the "force rebuild" option on the last dataset of the common part.
Then I want to build only the branches after the common part (only the datasets & folders inside the green square).
The branches should be built one after another (step 2, then step 3, then step 4, ... then step N).
For these steps I use the "build required" option, choosing the last folder of each branch.
The problem is that for steps 2, 3, 4, ..., N, Dataiku rebuilds datasets from the beginning of the flow.
So a simple solution would be to find a way to "build all" once and for all, but maybe there is a better solution?
Thanks!
Answers
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi Jack,
The Force Rebuild is smart, so if you have one single step listing all output datasets, DSS will actually determine the dependencies and build all required datasets only once.
You can check this in the job while it's running, or in the log.
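To illustrate why a single step avoids repeated work, here is a minimal sketch in plain Python. It does not call the DSS API and is not how DSS is implemented; the flow, the dataset names, and the `build_plan` helper are all hypothetical. It simply models a forced recursive build as "collect everything upstream of the listed targets, then build each item once in dependency order".

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical flow: each item maps to the items it is built from.
FLOW = {
    "input_a": [],
    "input_b": [],
    "common": ["input_a", "input_b"],
    "branch_1_out": ["common"],
    "branch_2_out": ["common"],
}


def build_plan(targets, flow=FLOW):
    """Return every item needed for `targets`, each listed once, parents first."""
    needed = {}
    stack = list(targets)
    while stack:
        item = stack.pop()
        if item not in needed:
            needed[item] = flow[item]
            stack.extend(flow[item])
    return list(TopologicalSorter(needed).static_order())


# One step listing both outputs: the shared upstream part is planned once.
single_step = build_plan(["branch_1_out", "branch_2_out"])

# One step per branch: each job plans the shared upstream part again.
separate_steps = build_plan(["branch_1_out"]) + build_plan(["branch_2_out"])

print(single_step.count("common"), separate_steps.count("common"))
```

In this toy model the single step plans "common" once, while the two separate steps plan it twice, which matches the behaviour described in the question.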
I hope this helps.
Best regards
-
That could be an alternative solution to my problem, thanks!
If I really want a separate step for each of my branches, is there no solution?
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
If you still want to build your flow in two steps, there is something else you can do:
- In the dataset settings you can control when a dataset is built (see attached)
- In summary, you can stop a force rebuild at a given dataset.
- So you can have a step that builds downstream from a dataset and another that builds upstream from that dataset (always right to left)
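The idea of stopping a rebuild at a dataset can be sketched in plain Python as follows. This is only a conceptual model under stated assumptions: the flow, the dataset names, the `upstream_to_rebuild` helper, and the `stop_at` parameter are hypothetical, and `stop_at` merely stands in for the dataset setting mentioned above rather than reproducing any DSS API.

```python
# Hypothetical flow: each item maps to the items it is built from.
FLOW = {
    "input_a": [],
    "input_b": [],
    "common": ["input_a", "input_b"],
    "branch_1_mid": ["common"],
    "branch_1_out": ["branch_1_mid"],
}


def upstream_to_rebuild(target, flow, stop_at=frozenset()):
    """Collect `target` and its ancestors, without going through items in `stop_at`."""
    to_build = set()
    stack = [target]
    while stack:
        item = stack.pop()
        if item in to_build or item in stop_at:
            continue  # already planned, or marked as a boundary: leave it alone
        to_build.add(item)
        stack.extend(flow[item])
    return to_build


# Without a boundary, the walk goes right to left back to the inputs.
without_boundary = upstream_to_rebuild("branch_1_out", FLOW)

# With "common" marked as a boundary, only the branch itself is rebuilt.
with_boundary = upstream_to_rebuild("branch_1_out", FLOW, stop_at={"common"})

print(sorted(with_boundary))
```

With the boundary set on the last dataset of the common part, a per-branch step in this model only plans the items of that branch, so the common part built in the first step is not rebuilt.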
I hope this helps.
Best regards