conditional flow
Can I control which flow runs on a given execution? For example, I have 3 flows and I want them to run based on their input: if a flow's input dataset is empty, that flow should be skipped, and only the flows whose input datasets contain data should run.
Answers
Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
Hi @xtest6,
There are a couple of ways that you can do this, but here is one.
If you want to check whether a dataset is empty, and only run a "build" step if it is not, you could do the following. First, create a custom Python step that performs "compute metrics" on the input dataset. Make sure that the record count metric is turned on for the relevant input datasets. This step computes metrics on the input dataset, including the record count. It then gets the COUNT_RECORDS metric and stores it in a project variable (in this case the variable is "DATASET_NAME_record_count"):
import dataiku

client = dataiku.api_client()
project = client.get_default_project()
dataset_name = 'crm_and_web_history_enriched_filtered'
mydataset = project.get_dataset(dataset_name)
metrics = mydataset.compute_metrics()
variables = project.get_variables()

for metric in metrics['result']['computed']:
    if metric['metric']['metricType'] == 'COUNT_RECORDS':
        # get record count metric
        variables['standard'][dataset_name + '_record_count'] = int(metric['value'])
        project.set_variables(variables)
        break
Then, you can have a subsequent build step that uses conditional logic and only runs if the relevant dataset has a record count > 0. The syntax here is:
variables["VARIABLE_NAME_IN_QUOTES"] > 0
You can use this logic for each of your build steps, so that each build step only runs when the relevant condition is met.
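For the three-flow case in the question, the same idea could be wrapped in small helpers, one call per input dataset. This is only a sketch: the helper names and the dataset names flow_a_input, flow_b_input and flow_c_input are hypothetical, the metrics structure is assumed to be the one used in the step above, and treating a missing record count as 0 (so the step is skipped) is my own choice rather than anything DSS requires.

```python
def extract_record_count(metrics):
    """Return the COUNT_RECORDS value from a compute_metrics() result.

    Assumption: if the metric is missing, the dataset is treated as empty (0).
    """
    for metric in metrics.get('result', {}).get('computed', []):
        if metric['metric']['metricType'] == 'COUNT_RECORDS':
            return int(metric['value'])
    return 0


def store_record_counts(project, dataset_names):
    """Compute metrics for each dataset and store '<name>_record_count'
    as a standard project variable, saving all variables in one call."""
    variables = project.get_variables()
    for name in dataset_names:
        metrics = project.get_dataset(name).compute_metrics()
        variables['standard'][name + '_record_count'] = extract_record_count(metrics)
    project.set_variables(variables)
    return variables['standard']


def run_condition(dataset_name):
    """Build the condition expression to paste into the matching build step."""
    return 'variables["%s_record_count"] > 0' % dataset_name


# Inside a custom Python scenario step (needs the dataiku package, so it is
# left commented out here; the dataset names are hypothetical):
#
#   import dataiku
#   project = dataiku.api_client().get_default_project()
#   store_record_counts(project, ['flow_a_input', 'flow_b_input', 'flow_c_input'])
```

Each build step would then use the expression returned by run_condition for its own input dataset, for example variables["flow_a_input_record_count"] > 0.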
Let me know if you have any questions about this!
Thank you,
Sarina
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,165 Neuron
Here is another approach using a metric as well but without having to use any Python code: