Is there a way to achieve this Excel formula using any of the Dataiku recipes?
Hi,
Here is a scenario where I need a variable that stores the count of 1's in column A.
Example: let's say column A has the values below, and I need a variable that stores the number of 1's in column A.
Output: Variable C = 3
1
1
1
0
0
In Excel we use =COUNTIF(OFFSET(AR:AR,3,0,ROWS(AR:AR)-3),1)
Thanks!
Best Answer
-
Hi @RanjithJose,
If I understand your use case correctly, you don't need a recipe for this. You can get what you need by using Metrics on the dataset: open the dataset (screenshot 1), select the "Status" tab and create a new Python probe metric (screenshot 2), then enable the metric you created (screenshot 3). Please refer to the screenshots and sample code below:
[Screenshots 1 to 3: the dataset, its "Status" tab with the new Python probe, and the enabled metric]
# Define here a function that returns the metric.
import pandas as pd

offset = 1  # index starts from 0; rows with index <= offset are skipped

def process(dataset, partition_id):
    # dataset is a dataiku.Dataset object
    df = dataset.get_dataframe()
    # keep rows after the offset where new_column equals 1
    df2 = df[(df.new_column == 1) & (df.new_column.index > offset)]
    countif = len(df2)
    return {'countif': countif}
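If it helps to check the counting logic outside DSS first, here is a minimal pandas sketch using the sample values from the question. The helper name `countif_after_offset` and the column name `A` are only illustrative. Note one assumption that differs from the probe above: this version skips exactly `offset` rows by position (closer to what Excel's OFFSET does), whereas `index > offset` skips `offset + 1` rows and relies on the default 0-based index.

```python
import pandas as pd


def countif_after_offset(df, column, value, offset):
    """Count rows where df[column] == value, skipping the first `offset` rows by position."""
    tail = df[column].iloc[offset:]
    return int((tail == value).sum())


# Sample data from the question
df = pd.DataFrame({"A": [1, 1, 1, 0, 0]})

# No rows skipped: all three 1's are counted
count_all = countif_after_offset(df, "A", 1, offset=0)

# First three rows skipped, as in OFFSET(AR:AR,3,0,...): only the two 0's remain
count_skip_3 = countif_after_offset(df, "A", 1, offset=3)

print(count_all, count_skip_3)
```

Whether you want to skip 0, 1 or 3 rows depends on whether your dataset still contains the header rows that the Excel formula was stepping over.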
If this is not what you were looking for, please elaborate on your use case.
Best Regards,
Vitaliy
Answers
-
CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭
Hi @RanjithJose
While you wait for a more detailed response, here are some resources that I think you may find helpful when moving from Excel to Dataiku DSS:
- From Excel to Dataiku DSS (Dataiku Academy)
- The Excel-to-Dataiku Playbook
- From Excel To Dataiku DSS (Knowledge Base)
I hope this helps!
-
pvannies Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 16 Neuron
Hi @RanjithJose,
You could use a metric on your dataset that calculates the number of 1s or, in your specific example, just calculates the sum (for a column that contains only 0s and 1s, the sum equals the count of 1s). You can find this in the Status tab of your dataset. Once you have set up the metric, add it to the computed metrics in the Metrics tab.
If you want to use this value throughout your project, you can set it as a project variable using a simple Python script. This script can be part of a scenario where you first compute the metric and then run the script. Learning about scenarios is very useful, as they also provide other ways to set project variables: Scenario steps (Dataiku DSS 9.0 documentation).
Here is an example of the Python script:
import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# get the existing project variables
variables = project.get_variables()

# replace your_dataset_name
dataset = dataiku.Dataset("your_dataset_name")
metrics = dataset.get_last_metric_values()

# set the new project variable equal to the computed metric and save it
variables['local']['Counts1ColumnA'] = metrics.get_metric_by_id('col_stats:SUM:A')['lastValues'][0].get('value')
project.set_variables(variables)
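One thing to watch for: if the metric has not been computed yet, the chained lookup in the script may fail instead of returning a value. Below is a small, hedged sketch of a guard in plain Python. The record shape (`{'lastValues': [{'value': ...}]}`) is only inferred from the script above, `last_metric_value` and `sample_record` are hypothetical names, and treating the value as a string is an assumption, so please check against what your instance actually returns.

```python
def last_metric_value(metric_record, default=None):
    """Return the most recent value from a record shaped like
    {'lastValues': [{'value': ...}]}, or `default` if nothing is there."""
    last_values = (metric_record or {}).get("lastValues") or []
    if not last_values:
        return default
    return last_values[0].get("value", default)


# Hypothetical record, mimicking what the script above reads (assumed shape)
sample_record = {"lastValues": [{"value": "3"}]}

raw = last_metric_value(sample_record)
# Assumption: the value may arrive as a string, so convert before storing it
count_of_ones = int(float(raw)) if raw is not None else 0

# A record with no computed values falls back to the default
missing = last_metric_value({"lastValues": []}, default=0)

print(count_of_ones, missing)
```

In the script you would pass the result of `metrics.get_metric_by_id('col_stats:SUM:A')` to the helper and store the converted number in `variables['local']['Counts1ColumnA']`.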
Good luck!
-
Thanks Vitaliy D!