I have created a scenario that runs the pipeline once a day. The scenario has 4 steps:
1) Step 1: check whether the latest data is available in the table and, if it is, pick the date_time of the latest date.
2) Step 2: set that date_time as a project variable.
3) Step 3: filter the table on date_time and store the result in the test dataset (for ML model prediction).
4) Step 4: run the ML model on the test data.
When the scenario runs, the latest data sometimes does not exist in the table yet. In that case the date_time project variable is populated with a null value and the scenario fails at step 3.
I want the scenario to stop cleanly whenever date_time is null, instead of failing at step 3 (steps 1 and 2 should still run).
This is easy to do by adding conditional logic that controls whether your remaining scenario steps execute.
See this post: https://community.dataiku.com/t5/Using-Dataiku/Conditional-execute-of-scenario-step-without-steps-fa...
More info about this technique here: https://community.dataiku.com/t5/What-s-New/Want-to-Control-the-Execution-of-Scenario-Steps-With-Con...
Personally I would use the row count of your dataset as the condition, since it is a built-in metric. Here is a sample of how to get the row count from a dataset metric:
toNumber(filter(parseJson(stepOutput_Compute_Metrics)['Project_Key.Dataset_Name_NP']['computed'], x, x["metricId"]=="records:COUNT_RECORDS").value)
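To make it easier to see what that formula does, here is a rough plain-Python equivalent. It is only a sketch, not Dataiku API code: the JSON layout is assumed from the formula itself (a dataset key containing a "computed" list of metrics), the function names are made up for illustration, and the sample payload is hypothetical.

```python
import json


def row_count_from_step_output(step_output_json, dataset_key):
    """Return the records:COUNT_RECORDS value for dataset_key, or None if absent.

    Assumes the step output looks like:
    {"<dataset_key>": {"computed": [{"metricId": ..., "value": ...}, ...]}}
    """
    output = json.loads(step_output_json)
    computed = output.get(dataset_key, {}).get("computed", [])
    for metric in computed:
        if metric.get("metricId") == "records:COUNT_RECORDS":
            # The value may arrive as a string, so convert it to a number.
            return int(float(metric["value"]))
    return None


def should_run_remaining_steps(row_count):
    """Run steps 3 and 4 only when the dataset actually has rows."""
    return row_count is not None and row_count > 0


# Hypothetical step output for an empty dataset (no new data today).
sample = json.dumps({
    "Project_Key.Dataset_Name_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "0"},
        ]
    }
})

count = row_count_from_step_output(sample, "Project_Key.Dataset_Name_NP")
print(count, should_run_remaining_steps(count))
```

In the scenario itself the same check would go into the run condition of steps 3 and 4, so they are skipped rather than failed when the row count is 0.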