Is there a way to achieve this excel formula using any of the Dataiku Recipes?

RanjithJose
RanjithJose Dataiku DSS Core Concepts, Registered Posts: 13 ✭✭✭✭

Hi,

Here is a scenario where in i need to have variable which can store count of number of 1's in column A.

Example: Lets say Column A has below values and i need to have variable which can store count of number of 1's from Column A.

Output: Variable C = 3

1

1

1

0

0

in Excel we use =COUNTIF(OFFSET(AR:AR,3,0,ROWS(AR:AR)-3),1)

Thanks!

Best Answer

  • Vitaliy
    Vitaliy Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17 Answer ✓

    Hi @RanjithJose
    ,

    If I understand your use case correctly, you don't need a recipe for this. You get what you need by using Metrics in a dataset. Open dataset (screenshot 1) > select "Status" tab (screenshot 2) and create a new python probe metric (screenshot 2) > enable metric you created (screenshot 3). Please refer to screenshots and sample code below:

    1.Screenshot 2021-06-17 at 08.44.30.png

    2.Screenshot 2021-06-17 at 08.40.53.png

    3.Screenshot 2021-06-17 at 08.42.08.png

    # Define here a function that returns the metric.
    offset = 1 # index start from 0
    import pandas as pd
    def process(dataset, partition_id):
        # dataset is a dataiku.Dataset object
        df = dataset.get_dataframe()
        df2 = df[(df.new_column == 1) & (df.new_column.index > offset)]
        countif = len(df2)
        return {'countif' : countif}

    If this not what you were looking for, please elaborate on your use case.

    Best Regards,

    Vitaliy

Answers

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Hi @RanjithJose
    while you wait for a more detailed response, here are some resources that I think you may find when working from Excel to Dataiku DSS:

    I hope this helps!

  • pvannies
    pvannies Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 18 Neuron
    edited July 17

    Hi @RanjithJose,

    you could use a metric on your dataset that calculates the number of 1s, or in your specific example, just calculates the sum. You can find this in the Status tab of your dataset, see image below. Once you have setup the metric, add it to the computed metrics in the Metrics tab.

    If you want to use this variable throughout your project, you can set this as a project variable using a simple python script. This python script can be part of a scenario where you first compute the metric and then perform the script. Learning about scenarios is very useful, as they also provide other ways to set the project variables: Scenario steps — Dataiku DSS 9.0 documentation

    Here is an example of the python script:

    import dataiku
    client = dataiku.api_client()
    project = client.get_default_project()
    # get the existing project variables
    variables = project.get_variables()
    
    # replace your_dataset_name 
    dataset = dataiku.Dataset("your_dataset_name")
    metrics = dataset.get_last_metric_values()
    
    # set the new project variable equal to the computed metric and save it
    variables['local']['Counts1ColumnA'] = metrics.get_metric_by_id('col_stats:SUM:A')['lastValues'][0].get('value')
    project.set_variables(variables)

    Good luck!

  • RanjithJose
    RanjithJose Dataiku DSS Core Concepts, Registered Posts: 13 ✭✭✭✭

    Thanks Vitaliy D!

Setup Info
    Tags
      Help me…