Improved UX for Scenario Variables Setup

tgb417 · ‎11-08-2022

User Story

As a Citizen Data Scientist who is just starting to advance with Scenarios and who typically uses “shaker” Dataiku formula steps, I would find it helpful to have a more intuitive way to get dataset metrics into a form that scenario steps can use.

Example of Roadblocks:

Non-Partitioned datasets must have "_NP" added to what you think is the name of the dataset.

Having to write a formula like this in order to get a row count into a variable is not easily intuited.

filter(parseJson(stepOutput_Compute_Metrics)['PROJECT_NAME.DATASET_NAME_NP'].computed, x, x["metricId"] == "records:COUNT_RECORDS")[0].value
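For anyone who finds the formula hard to read, the same lookup can be sketched in plain Python. The payload here is a hypothetical stand-in for the step output, shaped only by what the formula implies (one top-level key per dataset, a `computed` list, and entries carrying `metricId` and `value`). The helper name `metric_value` and the sample values are made up for illustration, and the real output may contain more fields.

```python
import json

# Hypothetical stand-in for the text held in stepOutput_Compute_Metrics.
# The shape is inferred from the formula above, not from Dataiku documentation.
step_output = json.dumps({
    "PROJECT_NAME.DATASET_NAME_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "1250"},
            {"metricId": "basic:COUNT_COLUMNS", "value": "8"},
        ]
    }
})


def metric_value(raw, dataset_key, metric_id):
    """Mirror filter(parseJson(raw)[dataset_key].computed, x, ...)[0].value."""
    # parseJson(...)['PROJECT_NAME.DATASET_NAME_NP'].computed
    computed = json.loads(raw)[dataset_key]["computed"]
    # filter(..., x, x["metricId"] == metric_id)
    matches = [m for m in computed if m["metricId"] == metric_id]
    # [0].value
    return matches[0]["value"]


row_count = metric_value(
    step_output, "PROJECT_NAME.DATASET_NAME_NP", "records:COUNT_RECORDS"
)
print(row_count)
```

Reading it this way shows the three moves the formula makes: parse the JSON, pick the dataset entry (with the "_NP" suffix), then filter the list of computed metrics down to the one metric id you want.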

There is no obvious help, tutorial, training, knowledge base example, or code snippet to get a user through this, particularly for non-partitioned datasets. The existing examples are fairly limited and focus on Python code.
If a dataset in your flow that is used to create a metric is temporarily empty, a variable based on a metric such as a min or max date in a certain column will fail. (One can work around this with a SQL-based metric and something like a coalesce to deal with the null values of an empty dataset.)
When a flow is built automatically, the recursive flow build algorithm does not correctly recognize a dataset that is filtered by a variable. (One may be able to work around this issue by explicitly building each dataset in the flow in its own scenario step.)
There is no user interface that shows all of the possible values one can extract from the computed metrics, neither in tooltips nor in a dedicated interface. Logs don't help much either. A formula like this, which is needed to put these values into the project variables, is also hard to discover.
```
parseJson(stepOutput_Compute_Metrics)['PROJECT_NAME.DATASET_NAME_NP'].computed
```
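Until something like that interface exists, one way to explore what is available is to dump the metric ids from a copy of the step output. The following is only a sketch in plain Python: the sample payload, the dataset names, and the helper names `list_metric_ids` and `metric_or_default` are all hypothetical, and the payload shape is assumed from the formulas in this post. Whether an empty dataset yields a missing metric or a null value is also an assumption, so the second helper handles both.

```python
import json

# Hypothetical sample of the step output; the shape is assumed from the
# formulas in this post. The second dataset simulates an empty dataset
# that has no max-date metric.
step_output = json.dumps({
    "PROJECT_NAME.ORDERS_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "1250"},
            {"metricId": "col_stats:MAX:order_date", "value": "2022-11-07"},
        ]
    },
    "PROJECT_NAME.EMPTY_NP": {
        "computed": [
            {"metricId": "records:COUNT_RECORDS", "value": "0"},
        ]
    },
})


def list_metric_ids(raw):
    """Return {dataset_key: [metricId, ...]} for every dataset in the output."""
    parsed = json.loads(raw)
    return {
        key: [m["metricId"] for m in entry.get("computed", [])]
        for key, entry in parsed.items()
    }


def metric_or_default(raw, dataset_key, metric_id, default=None):
    """Return the metric value, or `default` when it is missing or null."""
    computed = json.loads(raw).get(dataset_key, {}).get("computed", [])
    for m in computed:
        if m["metricId"] == metric_id and m.get("value") is not None:
            return m["value"]
    return default


# Print every dataset key with the metric ids found under it.
for key, ids in list_metric_ids(step_output).items():
    print(key, ids)

# Fall back to a placeholder date instead of failing on the empty dataset.
print(metric_or_default(
    step_output, "PROJECT_NAME.EMPTY_NP", "col_stats:MAX:order_date", "1900-01-01"
))
```

The default value plays the same role as the coalesce in the SQL-based workaround mentioned above: the variable always receives something usable, even when the dataset is temporarily empty.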
Here is another example of similar challenges with Variables in Scenarios
The ability to set both Local and Global Project variables.

--Tom

Marlan · ‎11-08-2022

I wholeheartedly agree with this suggestion. Thanks for writing it up, Tom (@tgb417).

We had a similar need recently, but rather than accessing metrics we wanted to access checks. I wrote a Python script to access the results of a Run Checks script. I posted it here, asking if there was a better way, since I thought there had to be one. In fact, using a formula would probably have been better, but figuring out how to write it would have been no small task.

The use case of checking values of metrics (and checks) and taking appropriate action within a scenario seems like a reasonable and common one that should be easier.

Marlan

tgb417 · ‎11-08-2022

@Marlan ,

Thanks for joining the conversation. It helps to understand that I’m not alone with this challenge. Will take a look at your script also.

Re-reviewing the Academy course was helpful, but it did not cover the formula language “shaker” approach to these questions, which would be helpful to folks getting started.

https://academy.dataiku.com/path/advanced-designer/automation-course-1/675879

--Tom

ktgross15 · ‎11-21-2022

Thanks for the feedback @tgb417 and @Marlan , we hear you and will let you know if we have any updates here!

Katie

Turribeach · ‎01-27-2023

Upvoted, as this is indeed an area that needs improvement. Here is a post showing a use case for metric values being used in scenario logic. Note the complexity in extracting the metric value. It's doable but certainly needs more of a "coder" than a "clicker"...

https://community.dataiku.com/t5/What-s-New/Want-to-Control-the-Execution-of-Scenario-Steps-With-Con...

tgb417 · ‎01-27-2023

@Turribeach ,

Thanks for joining this conversation, and thank you for your wonderful post.

--Tom

me2 · ‎07-28-2023

Thank you for explaining. This saved me a lot of time @tgb417.

Labels

Designer Experience
