
Scenario - Defining a variable using a computed metric

Solved!
me2
Level 3
Hi, I am having trouble with the right syntax to extract a 'computed metric' from a dataset.

I found a great post that successfully got me to report (output to reporter) a variable with an arbitrary value.  

https://community.dataiku.com/t5/Using-Dataiku/Conditional-execute-of-scenario-step-without-steps-fa...

Now I'm trying to extract a 'computed metric', "records:COUNT_RECORDS" from a dataset.  Dataiku documents this but I can't successfully retrieve the value.

https://doc.dataiku.com/dss/latest/scenarios/variables.html

I modified the syntax for my case

${filter(parseJson(stepOutput_the_metrics)[‘projID.computed’].computed, x, x.metricId == ‘records:COUNT_RECORDS’)[0].value}

Where projID is the project ID I got from the URL of my project.

I get the following error when I run the scenario.

java.lang.IllegalArgumentException
Incorrect formula: '${filter(parseJson(stepOutput_the_metrics)[‘projID.computed’].computed, x, x.metricId == ‘records:COUNT_RECORDS’)[0].value}' : Missing number, string, identifier, regex, or parenthesized expression(Parsing error at offset 0), caused by: ParsingException: Missing number, string, identifier, regex, or parenthesized expression(Parsing error at offset 0)

1) How do I correct the formula?

2) In this case, I am only computing metrics for one dataset, but if this works I would want computed metrics from various datasets. How would I modify the syntax so I can retrieve the same metrics from different datasets?

Thank you for your time!

1 Solution
Turribeach

Hi, there are a few things that you may be doing wrong. Be aware that both the project ID and the dataset ID are case sensitive. I covered in more detail how to build the formula to extract a metric in this other post:

https://community.dataiku.com/t5/What-s-New/Want-to-Control-the-Execution-of-Scenario-Steps-With-Con...

If you scroll down you will see how to extract the whole JSON and then build the extraction expression step by step.


6 Replies
AlexT
Dataiker

Hi,

You can only retrieve the metrics of a dataset whose metrics were computed in the previous step.
If you computed metrics for multiple datasets in that step, you need to replace "CaMoYxZE_NP" in the formula below with the respective dataset ID.

toNumber(filter(parseJson(stepOutput_the_metrics)['SECFILLINGS.CaMoYxZE_NP']['computed'], x, x["metricId"]=="basic:COUNT_RECORDS")[0].value)
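
For anyone who finds the nesting hard to read, here is a minimal Python sketch of what that expression does, step by step. The sample payload is hypothetical: its shape (a key of the form PROJECT.DATASET_NP holding a 'computed' list of objects with metricId and value) is only inferred from how the formula indexes it, and the metric values are made up. It is not Dataiku code, just the same lookup in plain Python.

```python
import json

# Hypothetical step output, shaped the way the formula indexes it.
# The values are invented for illustration only.
step_output = json.dumps({
    "SECFILLINGS.CaMoYxZE_NP": {
        "computed": [
            {"metricId": "basic:COUNT_COLUMNS", "value": "12"},
            {"metricId": "basic:COUNT_RECORDS", "value": "3450"},
        ]
    }
})


def extract_metric(raw_json, dataset_key, metric_id):
    """Plain-Python mirror of the formula's parseJson / filter / toNumber chain."""
    # parseJson(stepOutput_...)['PROJECT.DATASET_NP']['computed']
    computed = json.loads(raw_json)[dataset_key]["computed"]
    # filter(..., x, x["metricId"] == "...")
    matches = [m for m in computed if m["metricId"] == metric_id]
    if not matches:
        raise LookupError(f"no metric {metric_id!r} under {dataset_key!r}")
    # toNumber(...[0].value)
    return float(matches[0]["value"])


record_count = extract_metric(
    step_output, "SECFILLINGS.CaMoYxZE_NP", "basic:COUNT_RECORDS"
)
```

In this sketch a wrong project ID or dataset ID fails at the key lookup, and a wrong metric ID fails at the filter, which is why each piece of the formula has to match the step output exactly.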


If you want to retrieve a metric from another dataset whose metrics were not computed in this particular scenario, you can use, for example: https://community.dataiku.com/t5/Using-Dataiku/Compute-Metrics-using-Python-API/m-p/24763

Thanks,

me2
Level 3
Author

Thank you @AlexT 

I tried what you recommended and I am getting the following error.

java.lang.Exception
parseJson failed: Missing value at 0 [character 1 line 1]

I made the following changes from your syntax so it can apply to my case:

'SECFILLINGS.CaMoYxZE_NP' changed to 'projectID.datasetname_NP'

Project ID I get from the URL.

As for the dataset name, I could not locate a "dataset ID", so I've been using the dataset name.  That is also what I see used in the scenario logs.

I noticed your syntax for the metric is using

basic:COUNT_RECORDS

and I had been using

records:COUNT_RECORDS

I tried both and get the same error.

Also, I have the "Evaluate Variable" toggle switched ON.

Anything else you recommend I try?

Thank you.

Turribeach

I believe the right formula for non-partitioned datasets will be:

toNumber(filter(parseJson(stepOutput_REPLACE_WITH_COMPUTE_METRICS_STEP_NAME_WITH_NO_SPACES)['REPLACE_WITH_PROJECT_ID.REPLACE_WITHDATASET_ID_NP']['computed'], x, x["metricId"]=="records:COUNT_RECORDS")[0].value)
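
If you need the same metric from several datasets, the placeholders can be filled in mechanically. Below is a small Python sketch that just builds the formula text for each dataset; `metric_formula` is a made-up helper name, not a Dataiku API, and the `_NP` suffix assumes non-partitioned datasets as above. It also assumes all the datasets have their metrics computed in the same Compute metrics step, since only that step's output is available to the formula. You would still paste each resulting string into its own "Define variable" step.

```python
def metric_formula(step_name, project_id, dataset_id,
                   metric_id="records:COUNT_RECORDS"):
    """Build the scenario formula text for one non-partitioned dataset.

    Hypothetical helper: it only assembles a string following the template
    above, it does not talk to Dataiku.
    """
    if any(ch.isspace() for ch in step_name):
        # The template asks for a step name with no spaces.
        raise ValueError("rename the Compute metrics step so its name has no spaces")
    key = f"{project_id}.{dataset_id}_NP"  # project ID and dataset ID are case sensitive
    return (
        f"toNumber(filter(parseJson(stepOutput_{step_name})"
        f"['{key}']['computed'], x, x[\"metricId\"]==\"{metric_id}\")[0].value)"
    )


# Example with placeholder names; replace them with your own IDs.
formulas = {
    ds: metric_formula("Compute_Metrics", "ProjID", ds)
    for ds in ("test", "other_dataset")
}
for dataset, formula in formulas.items():
    print(dataset, "->", formula)
```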

me2
Level 3
Author

It finally worked and it turns out I was doing 2 things wrong.

My final equation...

toNumber(filter(parseJson(stepOutput_Compute_Metrics)['ProjID.DatasetID_NP']['computed'], x, x["metricId"]=="records:COUNT_RECORDS")[0].value)

@AlexT mentioned I needed to add the dataset ID.  So I looked at the logs, and the dataset ID is the name of the dataset.  In my case it was simply 'test', which goes in the DatasetID position of 'ProjID.DatasetID_NP'.  Thank you Alex!

@Turribeach you kept stating that I should add the compute metrics step name, "stepOutput_REPLACE_WITH_COMPUTE_METRICS_STEP_NAME_WITH_NO_SPACES", and I made the mistake of leaving the default step name "Step #2".  When I looked at the logs, the step name was "the_metrics".  I tried variations of the two with no success... then it occurred to me to rename the Compute Metrics step from the default to a name with no spaces.  So I renamed it "Compute_Metrics".  I updated the variable formula in the "Define variable" step and SUCCESS!!!

I am now able to send that scenario variable over to Reporter.

Thank you!!

Turribeach

Glad you got there in the end. This is certainly an area Dataiku should improve, as suggested by this idea. On the positive side, you now have the tools and the knowledge to use this feature going forward.
