Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi,
Here is a scenario where in i need to have variable which can store count of number of 1's in column A.
Example: Lets say Column A has below values and i need to have variable which can store count of number of 1's from Column A.
Output: Variable C = 3
1
1
1
0
0
in Excel we use =COUNTIF(OFFSET(AR:AR,3,0,ROWS(AR:AR)-3),1)
Thanks!
Hi @RanjithJose,
If I understand your use case correctly, you don't need a recipe for this. You get what you need by using Metrics in a dataset. Open dataset (screenshot 1) > select "Status" tab (screenshot 2) and create a new python probe metric (screenshot 2) > enable metric you created (screenshot 3). Please refer to screenshots and sample code below:
1.
2.
3.
# Define here a function that returns the metric.
offset = 1 # index start from 0
import pandas as pd
def process(dataset, partition_id):
# dataset is a dataiku.Dataset object
df = dataset.get_dataframe()
df2 = df[(df.new_column == 1) & (df.new_column.index > offset)]
countif = len(df2)
return {'countif' : countif}
If this not what you were looking for, please elaborate on your use case.
Best Regards,
Vitaliy
Hi @RanjithJose while you wait for a more detailed response, here are some resources that I think you may find when working from Excel to Dataiku DSS:
I hope this helps!
Hi @RanjithJose,
you could use a metric on your dataset that calculates the number of 1s, or in your specific example, just calculates the sum. You can find this in the Status tab of your dataset, see image below. Once you have setup the metric, add it to the computed metrics in the Metrics tab.
If you want to use this variable throughout your project, you can set this as a project variable using a simple python script. This python script can be part of a scenario where you first compute the metric and then perform the script. Learning about scenarios is very useful, as they also provide other ways to set the project variables: Scenario steps โ Dataiku DSS 9.0 documentation
Here is an example of the python script:
import dataiku
client = dataiku.api_client()
project = client.get_default_project()
# get the existing project variables
variables = project.get_variables()
# replace your_dataset_name
dataset = dataiku.Dataset("your_dataset_name")
metrics = dataset.get_last_metric_values()
# set the new project variable equal to the computed metric and save it
variables['local']['Counts1ColumnA'] = metrics.get_metric_by_id('col_stats:SUM:A')['lastValues'][0].get('value')
project.set_variables(variables)
Good luck!
Hi @RanjithJose,
If I understand your use case correctly, you don't need a recipe for this. You get what you need by using Metrics in a dataset. Open dataset (screenshot 1) > select "Status" tab (screenshot 2) and create a new python probe metric (screenshot 2) > enable metric you created (screenshot 3). Please refer to screenshots and sample code below:
1.
2.
3.
# Define here a function that returns the metric.
offset = 1 # index start from 0
import pandas as pd
def process(dataset, partition_id):
# dataset is a dataiku.Dataset object
df = dataset.get_dataframe()
df2 = df[(df.new_column == 1) & (df.new_column.index > offset)]
countif = len(df2)
return {'countif' : countif}
If this not what you were looking for, please elaborate on your use case.
Best Regards,
Vitaliy
Thanks Vitaliy D!