
Build flow between two datasets


Recursive builds are a great feature. "Build flow outputs reachable from here" is a great feature too. But in large projects, I often find myself wanting to constrain Dataiku to build one specific segment of my flow and nothing else: no upstream dependencies, no downstream outputs. Just start at one point, end at another, and execute in order with all the dependency-calculation goodness of the Dataiku DAG.

Screenshot 2021-07-28 104417.jpg

For a simple example, I'd love to be able to select the leftmost and rightmost datasets here and, with one click, "run everything between" them. My rightmost dataset would be built, but nothing upstream of the leftmost dataset would be rebuilt. Those upstream datasets take hours to finish, and Dataiku's dependency management often triggers them to build even when nothing has changed. The dataset at the bottom wouldn't be rebuilt in this scenario either.
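To make the "run everything between" idea concrete, here is a rough sketch in plain Python of how such a segment could be worked out. It does not use the Dataiku API. The flow is modelled as a simple dict of dataset names (recipes are left out), and every name in it (`FLOW`, `build_plan_between`, `raw`, `left`, `side_output`, and so on) is made up for illustration. The idea: a dataset belongs to the segment if it is downstream of the start dataset and upstream of the end dataset, and the segment is then built in dependency order.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from start by following graph's edges (start included)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def build_plan_between(downstream, start, end):
    """List the datasets to build, in order, to get from start to end.

    downstream maps each dataset to the datasets computed directly from it.
    start itself is treated as already built, so it is not in the plan.
    """
    # Invert the edges so ancestors of `end` can be found the same way.
    upstream = {}
    for src, targets in downstream.items():
        for dst in targets:
            upstream.setdefault(dst, []).append(src)

    # Keep only nodes that are both downstream of start and upstream of end.
    segment = reachable(downstream, start) & reachable(upstream, end)
    segment.discard(start)

    # Topological sort (Kahn's algorithm) restricted to the segment.
    # Inputs outside the segment are assumed to be up to date already.
    indegree = {
        node: sum(1 for parent in upstream.get(node, ()) if parent in segment)
        for node in segment
    }
    ready = deque(sorted(node for node, count in indegree.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for nxt in sorted(downstream.get(node, ())):
            if nxt in segment:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    return order


# Hypothetical flow: "raw" is the slow upstream part, "side_output" plays
# the role of the dataset at the bottom, "report" is further downstream.
FLOW = {
    "raw": ["left"],
    "left": ["prepared", "side_output"],
    "prepared": ["joined"],
    "joined": ["right"],
    "right": ["report"],
}

plan = build_plan_between(FLOW, "left", "right")
print(plan)  # ['prepared', 'joined', 'right']
```

In this toy flow, `raw` is never touched, and neither `side_output` nor `report` is built, which is the behaviour I'm asking for.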

For more general cases, it would also be really cool to get a preview of the sections of the flow that will be built before each run, with the option to add and remove items from that list. As an alternative workflow, whenever I select all of these datasets and build, build flow outputs reachable from here, or build recursively, I could then just deselect the sections I don't want rebuilt.