Creating Scenarios (Coding recipes, Sync recipes)
Hi,
I have a fairly complex sub-flow for a Data Science use case. It includes various validation and testing steps that I don't want to run every time. Here is the scenario:
My flow consists mainly of Python and PySpark recipes, along with some Sync and visual recipes, 22 components in total. Some of those components were only used for validation, so now I want to run just 14 of the components/recipes (Python, PySpark, Sync, etc.). I would like to add each recipe in sequential order and then execute them manually whenever I want, so that with one click I can run these 14 components in a specific order:
Hive recipe -> pyspark recipe -> python recipe -> sync recipe -> sync recipe etc.
I have gone through the documentation but couldn't find anything directly relevant to this.
Operating system used: Linux
Best Answer
Emma (Dataiker)
Hey @Dawood154,
What you're looking for is a Dataiku Scenario; you can find information in the Documentation, Knowledge Base articles, or the Academy.
You can set up your Scenario to run as described, either with a manual trigger or on a scheduled cadence. Populate the Steps tab with the datasets you would like to build. Remember, the Dataiku Flow is aware of each dataset's dependencies, so you can preview a build from the Flow to see what will automatically be included.
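If you prefer to define the order in code rather than with visual steps, a Scenario whose script type is "Custom Python script" can build each recipe's output dataset one after another. Here is a minimal sketch; the dataset names are placeholders for the outputs of your Hive, PySpark, Python, and Sync recipes:

```python
# Custom Python scenario script (Scenario > Script > Custom Python).
# Dataset names below are placeholders -- replace them with the actual
# output datasets of the 14 recipes you want to run.
from dataiku.scenario import Scenario

scenario = Scenario()

# Building a recipe's output dataset runs that recipe, so building these
# outputs in order executes the recipes in the sequence you want.
ordered_outputs = [
    "hive_recipe_output",
    "pyspark_recipe_output",
    "python_recipe_output",
    "sync_recipe_output_1",
    "sync_recipe_output_2",
    # ... remaining outputs, in the order you need
]

for dataset_name in ordered_outputs:
    scenario.build_dataset(dataset_name)
```

With visual steps, the equivalent is one "Build / Train" step per output dataset, added in the order you want them to run. Either way, a manual trigger (or simply clicking Run on the Scenario) gives you the one-click execution you described.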
I hope that helps,
Emma