Queue downstream execution if I already started running an upstream recipe

When developing pipelines, I often kick off an operation upstream, only to wish I had waited until I finished developing the downstream portion of the pipeline and then built recursively. It would be really convenient to be able to queue downstream operations to run after the upstream ones complete. That way, if I run a downstream recipe recursively while an upstream recipe is already running, the downstream recipe would start from the upstream dataset as soon as it has finished building.

This is especially useful when I have hours-long operations. I might be 2-3 hours into a 5-hour dataset build upstream, but want to run other time-consuming portions of the pipeline as soon as it is done. I don't want to lose those hours by aborting the upstream recipe, but I also don't want to wait another two hours to hit the build button for my downstream recipes.

With this feature, I could queue up my work, close the laptop for the night, and come back in the morning to a fully processed dataset. It's rare, but I sometimes have multi-day build times for my data flows, and in those cases this feature would be an enormous time saver.
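To make the requested behavior concrete, here is a minimal sketch in plain Python. It does not use any Dataiku API. The function names (`build`, `build_after`, `run_demo`), the dataset names, and the sleep durations are all hypothetical stand-ins for long-running recipes. The idea it illustrates is only this: a downstream build is submitted while the upstream build is still running, and it starts as soon as the upstream build completes.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def build(dataset, seconds, log):
    """Hypothetical stand-in for a long-running recipe.

    Sleeps to simulate work, then records the dataset name in `log`
    so the order of completion can be inspected.
    """
    time.sleep(seconds)
    log.append(dataset)
    return dataset


def build_after(upstream_job, dataset, seconds, log):
    """Wait for an already-running upstream job, then build `dataset`.

    `upstream_job.result()` blocks until the upstream build finishes and
    re-raises its exception if it failed, so the downstream build is
    skipped when the upstream build does not succeed.
    """
    upstream_job.result()
    return build(dataset, seconds, log)


def run_demo():
    log = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # The upstream build is already running...
        upstream = pool.submit(build, "upstream_dataset", 0.2, log)
        # ...and the downstream build is queued behind it right away,
        # without aborting or restarting the upstream build.
        downstream = pool.submit(
            build_after, upstream, "downstream_dataset", 0.1, log
        )
        downstream.result()
    return log


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the waiting downstream job occupies a worker thread while it waits, which is fine for an illustration but is not how a real scheduler would necessarily implement queuing.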