Build flow between two datasets

natejgardner

Recursive builds are a great feature, and so is "Build flow outputs reachable from here". But in large projects, I often just want to constrain Dataiku to build a certain segment of my flow and nothing else: no upstream dependencies, no downstream outputs. Just start at one point, end at another, and execute in order with all the dependency-calculation goodness of the Dataiku DAG.

[Image: Screenshot 2021-07-28 104417.jpg]

For a simple example, I'd love to be able to select the leftmost and rightmost datasets here and with one click "run everything between" these datasets, resulting in my rightmost dataset being built, but nothing upstream of the leftmost dataset being rebuilt (since those take hours to finish and Dataiku's dependency management often triggers them to build even when nothing has changed). The dataset at the bottom wouldn't be rebuilt in this scenario.
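To make the request concrete, here is a rough sketch of what "run everything between" could compute: the items that are both downstream of the start and upstream of (or equal to) the end, returned in dependency order. It is plain Python over a hypothetical edge list, not Dataiku's API; the names `build_between`, `reachable`, and the dataset names are made up for illustration, and a real flow would also have recipes sitting between the datasets.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def reachable(graph, start):
    """Return every node reachable from `start` by following `graph` edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def build_between(edges, start, end):
    """List the items to build between `start` and `end`, in build order.

    `start` itself is treated as already built, so it is not in the result.
    """
    downstream, upstream = {}, {}
    for src, dst in edges:
        downstream.setdefault(src, set()).add(dst)
        upstream.setdefault(dst, set()).add(src)

    after_start = reachable(downstream, start)    # everything fed by start
    before_end = reachable(upstream, end) | {end}  # everything end needs
    segment = after_start & before_end

    # Keep only dependencies inside the segment; anything outside it
    # (including start and its upstream) is used as-is, never rebuilt.
    deps = {node: upstream.get(node, set()) & segment for node in segment}
    return list(TopologicalSorter(deps).static_order())


# Hypothetical flow: slow_source -> left -> mid -> right, plus left -> bottom
edges = [
    ("slow_source", "left"),
    ("left", "mid"),
    ("mid", "right"),
    ("left", "bottom"),
]
plan = build_between(edges, "left", "right")
print(plan)
```

In this made-up flow the plan contains only `mid` and `right`: `slow_source` is upstream of the start, and `bottom` is not needed by `right`, so neither is rebuilt.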

For more general cases, it would also be really cool to get a preview of the sections of the flow that will be built before each run, with the option to add and remove items from it every time. That way, as an alternative workflow, whether I'm selecting all of these and building, building flow outputs reachable from here, or building recursively, I can just deselect the section I don't want rebuilt.

1 vote

In the Backlog
