Creating Scenarios (Coding recipes, Sync recipes)
Hi,
I have a fairly complex sub-flow for a Data Science use case. It includes various validation and testing steps that I don't want to run every time. Here is the scenario:
My flow consists mainly of Python and PySpark recipes, along with some Sync and visual recipes, 22 components in total. Some of those components were only used for validation, so now I want to run just 14 of the components/recipes (Python, PySpark, Sync, etc.). I would like to add each recipe in sequential order and then execute them manually whenever I want, so that with one click I can run these 14 components in a specific order:
Hive recipe -> pyspark recipe -> python recipe -> sync recipe -> sync recipe etc.
I have gone through the documentation but couldn't find anything directly relevant to this.
Operating system used: Linux
Best Answer
Emma (Dataiker)
Hey @Dawood154,
What you're looking for is a Dataiku Scenario; you can find information in the Documentation, Knowledge Base articles, or the Academy.
You can set up your Scenario to run as described, either with a manual trigger or on a scheduled cadence. Populate the Steps tab with the datasets you would like to build. Remember, the Dataiku Flow is aware of each dataset's dependencies, so you can preview a build from the Flow to see what will automatically be included.
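If you prefer to define the order in code rather than with visual steps, a Scenario whose script type is "Custom Python script" can build each recipe's output dataset one after another. Here is a minimal sketch; the dataset names are placeholders for the outputs of your Hive, PySpark, Python, and Sync recipes:

```python
# Custom Python scenario script (Scenario > Script > Custom Python).
# Dataset names below are placeholders -- replace them with the actual
# output datasets of the 14 recipes you want to run.
from dataiku.scenario import Scenario

scenario = Scenario()

# Building a recipe's output dataset runs that recipe, so building these
# outputs in order executes the recipes in the sequence you want.
ordered_outputs = [
    "hive_recipe_output",
    "pyspark_recipe_output",
    "python_recipe_output",
    "sync_recipe_output_1",
    "sync_recipe_output_2",
    # ... remaining outputs, in the order you need
]

for dataset_name in ordered_outputs:
    scenario.build_dataset(dataset_name)
```

With visual steps, the equivalent is one "Build / Train" step per output dataset, added in the order you want them to run. Either way, a manual trigger (or simply clicking Run on the Scenario) gives you the one-click execution you described.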
I hope that helps,
Emma