Scenario - Defining a variable using a computed metric

me2
me2 Registered Posts: 50 ✭✭✭✭✭

Hi, I am having trouble with the right syntax to extract a 'computed metric' from a dataset.

I found a great post that successfully got me to report (output to reporter) a variable with an arbitrary value.

https://community.dataiku.com/t5/Using-Dataiku/Conditional-execute-of-scenario-step-without-steps-failing-or/m-p/35969#M13316

Now I'm trying to extract a 'computed metric', "records:COUNT_RECORDS" from a dataset. Dataiku documents this but I can't successfully retrieve the value.

https://doc.dataiku.com/dss/latest/scenarios/variables.html

I modified the syntax for my case

${filter(parseJson(stepOutput_the_metrics)[‘projID.computed’].computed, x, x.metricId == ‘records:COUNT_RECORDS’)[0].value}

I took projID from the URL of my project.

I get the following error when I run the scenario.

java.lang.IllegalArgumentException
Incorrect formula: '${filter(parseJson(stepOutput_the_metrics)[‘projID.computed’].computed, x, x.metricId == ‘records:COUNT_RECORDS’)[0].value}' : Missing number, string, identifier, regex, or parenthesized expression(Parsing error at offset 0), caused by: ParsingException: Missing number, string, identifier, regex, or parenthesized expression(Parsing error at offset 0)

1) How do I correct the formula?

2) In this case, I am only running the computed metrics for one dataset but if this works I would want computed metrics from various data sets. How would I modify the syntax so I can retrieve the same metrics from different datasets?

Thank you for your time!

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,225 Dataiker
    edited July 17

    Hi,

    You can only retrieve the metrics from the dataset computed in the previous step.
    If you computed metrics for multiple datasets in that step, replace "CaMoYxZE_NP" with the respective dataset ID.

    toNumber(filter(parseJson(stepOutput_the_metrics)['SECFILLINGS.CaMoYxZE_NP']['computed'], x, x["metricId"]=="basic:COUNT_RECORDS")[0].value)


    If you want to retrieve a metric from another dataset, for example one whose metrics were not computed in this particular scenario, you can use the Python API: https://community.dataiku.com/t5/Using-Dataiku/Compute-Metrics-using-Python-API/m-p/24763

    Thanks,
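
To make the moving parts of the formula in this answer easier to follow, here is a rough Python equivalent. It is only a sketch: the sample payload is invented and mirrors just the shape the formula implies (a top-level key of the form PROJECT.dataset_NP holding a "computed" list whose entries have "metricId" and "value"), and the real step output may well contain more fields. The function name metric_from_step_output is hypothetical, and treating "value" as a string is an assumption based on the toNumber call.

```python
import json

def metric_from_step_output(step_output, dataset_key, metric_id):
    """Rough Python analogue of the scenario formula discussed in this thread."""
    parsed = json.loads(step_output)             # parseJson(stepOutput_...)
    computed = parsed[dataset_key]["computed"]   # ['PROJECT.dataset_NP']['computed']
    # filter(..., x, x["metricId"] == metric_id)
    matches = [m for m in computed if m["metricId"] == metric_id]
    if not matches:
        raise KeyError(f"metric {metric_id!r} not found for {dataset_key!r}")
    return float(matches[0]["value"])            # toNumber(...[0].value)

# Invented sample: two datasets whose metrics were computed in the same step.
sample = json.dumps({
    "PROJ.orders_NP": {
        "computed": [{"metricId": "records:COUNT_RECORDS", "value": "120"}]
    },
    "PROJ.customers_NP": {
        "computed": [{"metricId": "records:COUNT_RECORDS", "value": "45"}]
    },
})

# Only the dataset key changes between the two lookups.
orders_count = metric_from_step_output(sample, "PROJ.orders_NP", "records:COUNT_RECORDS")
customers_count = metric_from_step_output(sample, "PROJ.customers_NP", "records:COUNT_RECORDS")
```

Under these assumptions, reading the same metric for a different dataset computed in the same step means changing only the key, which matches the advice about swapping the dataset ID.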

  • me2
    me2 Registered Posts: 50 ✭✭✭✭✭

    Thank you @AlexT

    I tried what you recommended and I am getting the following error.

    java.lang.Exception
    parseJson failed: Missing value at 0 [character 1 line 1]

    I made the following changes from your syntax so it can apply to my case:

    'SECFILLINGS.CaMoYxZE_NP' changed to 'projectID.datasetname_NP'

    Project ID I get from the URL.

    As for the dataset name, I could not locate a "dataset ID", so I've been using the dataset name. That is also what I see used in the scenario logs.

    I noticed your syntax for the metric is using

    basic:COUNT_RECORDS

    and I had been using

    records:COUNT_RECORDS

    I tried both and get the same error.

    Also, I have the "Evaluate Variable" toggle set to ON.

    Anything else you recommend I try?

    Thank you.
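
The message "Missing value at 0" reads like a JSON parser being handed an empty string. One plausible explanation, consistent with the step-renaming fix reported later in the thread but not confirmed by any log shown here, is that the stepOutput_... variable did not match the step's actual name and therefore had no content. The Python sketch below only illustrates the analogous failure in Python's own parser; try_parse is a hypothetical helper, not a Dataiku function.

```python
import json

def try_parse(step_output):
    """Return the parsed JSON, or a short error description if parsing fails."""
    try:
        return json.loads(step_output)
    except json.JSONDecodeError as err:
        return f"parse failed: {err.msg} at position {err.pos}"

# An empty string fails at the very first character, like the error in this post.
empty_result = try_parse("")

# A non-empty JSON document parses normally.
ok_result = try_parse('{"a": 1}')
```

If that reading is right, the metric ID (basic: versus records:) would not matter yet, because parsing fails before the filter is ever evaluated.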

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,043 Neuron
    edited July 17

    I believe the right formula for non-partitioned datasets will be:

    toNumber(filter(parseJson(stepOutput_REPLACE_WITH_COMPUTE_METRICS_STEP_NAME_WITH_NO_SPACES)['REPLACE_WITH_PROJECT_ID.REPLACE_WITH_DATASET_ID_NP']['computed'], x, x["metricId"]=="records:COUNT_RECORDS")[0].value)

  • me2
    me2 Registered Posts: 50 ✭✭✭✭✭

    It finally worked and it turns out I was doing 2 things wrong.

    My final equation...

    toNumber(filter(parseJson(stepOutput_Compute_Metrics)['ProjID.DatasetID_NP']['computed'], x, x["metricId"]=="records:COUNT_RECORDS")[0].value)

    @AlexT mentioned I needed to use the dataset ID. I looked at the logs and found that the dataset ID is simply the name of the dataset, 'test' in my case, so the key takes the form 'ProjID.DatasetID_NP'. Thank you Alex!

    @Turribeach you said to put the Compute Metrics step name in "stepOutput_REPLACE_WITH_COMPUTE_METRICS_STEP_NAME_WITH_NO_SPACES", and I made the mistake of leaving the default step name, "Step #2". When I looked at the logs, the step name was "the_metrics". I tried variations of the two with no success. Then it occurred to me to rename the Compute Metrics step from the default to a name with no spaces, so I renamed it "Compute_Metrics". I updated the variable formula in the "Define variable" step and SUCCESS!!!

    I am now able to send that scenario variable over to Reporter.

    Thank you!!
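
For anyone assembling this formula for several datasets, the pieces that tripped things up in this thread (step name, project ID, dataset name) can be filled into a template. The Python sketch below is one way to do that. build_metric_formula is a hypothetical helper, and restricting step names to letters, digits, and underscores is a stricter assumption than the thread's "no spaces" advice. It only builds the formula text to paste into the "Define variable" step; it does not call Dataiku.

```python
import re

def build_metric_formula(step_name, project_id, dataset_name,
                         metric_id="records:COUNT_RECORDS"):
    """Assemble the formula text for a non-partitioned dataset (suffix _NP)."""
    # Assumption: accept only names made of letters, digits, and underscores,
    # which rules out the default "Step #2" that caused trouble in this thread.
    if not re.fullmatch(r"\w+", step_name):
        raise ValueError(
            f"step name {step_name!r} should contain only letters, digits, underscores"
        )
    key = f"{project_id}.{dataset_name}_NP"
    return (
        f"toNumber(filter(parseJson(stepOutput_{step_name})['{key}']['computed'], "
        f'x, x["metricId"]=="{metric_id}")[0].value)'
    )

# Values taken from this post: step "Compute_Metrics", dataset named "test".
formula = build_metric_formula("Compute_Metrics", "ProjID", "test")
print(formula)
```

Calling it once per dataset name gives one formula per dataset, provided all of those datasets had their metrics computed in the named step.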

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,043 Neuron

    Glad you got there in the end. This is certainly an area Dataiku should improve, as suggested by this idea. On the positive side, you now have the tools and the knowledge to use this feature going forward.
