
conditional flow


Can I control which flows run on a given execution? I have 3 flows, and I want each one to run based on its input: if a flow's input dataset is empty, that flow should be skipped, and only the flows whose input datasets have data should run.


Hi @xtest6,

There are a couple of ways that you can do this, but here is one. 

To check whether a dataset is empty and run a "build" step only when it is not, you could do the following. First, create a custom Python step in your scenario that performs "compute metrics" on the input dataset. Make sure that the record count metric is turned on for the relevant input datasets:

Screenshot 2024-05-14 at 3.41.49 PM.png

This step computes metrics on the input dataset, including the record count. It then gets the COUNT_RECORDS metric and stores it in a project variable (in this case, the variable is "DATASET_NAME_record_count"):

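import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# name of the input dataset to check
dataset_name = 'crm_and_web_history_enriched_filtered'
mydataset = project.get_dataset(dataset_name)

# compute the dataset's metrics (the record count metric must be enabled)
metrics = mydataset.compute_metrics()

variables = project.get_variables()
for metric in metrics['result']['computed']:
    if metric['metric']['metricType'] == 'COUNT_RECORDS':
        # store the record count as a standard project variable
        variables['standard'][dataset_name + '_record_count'] = int(metric['value'])

# save the updated variables back to the project; without this the change is lost
project.set_variables(variables)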

Then, you can have a subsequent build step that uses conditional logic, and only runs if the relevant dataset has a record count > 0:

Screenshot 2024-05-14 at 3.41.39 PM.png

The syntax here is:

variables["VARIABLE_NAME_IN_QUOTES"] > 0

You can use this logic for each of your build steps, so that each build step only runs when the relevant condition is met. 
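Since you have 3 input datasets, here is a rough sketch of how a single custom Python step could store a record count variable for each of them. The function names `extract_record_count` and `store_record_counts` and the dataset names in the usage note are made up for illustration, and treating a missing metric as 0 records is my assumption, so adjust as needed:

```python
def extract_record_count(metrics_result):
    """Return the COUNT_RECORDS value from a compute_metrics() result, or None if absent."""
    computed = metrics_result.get('result', {}).get('computed', [])
    for metric in computed:
        if metric.get('metric', {}).get('metricType') == 'COUNT_RECORDS':
            return int(metric['value'])
    return None


def store_record_counts(dataset_names):
    """Compute metrics for each dataset and save the counts as project variables."""
    import dataiku  # imported here so the helper above can be used outside DSS

    client = dataiku.api_client()
    project = client.get_default_project()
    variables = project.get_variables()

    for name in dataset_names:
        metrics = project.get_dataset(name).compute_metrics()
        count = extract_record_count(metrics)
        # Assumption: if the record count metric is missing, treat the dataset as empty
        variables['standard'][name + '_record_count'] = count if count is not None else 0

    # persist all updated variables in one call
    project.set_variables(variables)
    return variables['standard']
```

In the scenario step you would call something like store_record_counts(['input_a', 'input_b', 'input_c']) with your own dataset names, and then give each build step its own condition, for example variables["input_a_record_count"] > 0.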

Let me know if you have any questions about this! 

Thank you,


Here is another approach using a metric as well but without having to use any Python code:

