count rows with condition

Solved!
Richard_CDC
Level 2
Hi,

I want to count the number of rows that match a condition, and then store that count as a project variable.

Here is my dataset, followed by my code.

The dataset name is "Affaires_technique_groupe".

Agence    DIR    State
LYON      AURA   Fait
GRENOBLE  AURA   A faire
TOULON    PACAC  A faire
AIX       PACAC  A faire

 

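import dataiku
from dataiku.scenario import Scenario
df_data1 = dataiku.Dataset("Affaires_technique_groupe").get_dataframe()
count = df_data1.count()  # non-null count for each column, no filter yet
# Cast to a plain int so the value can be stored as a project variable
Scenario().set_project_variables(Nb_affaire=int(count.iloc[0]))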

 

I would like to count the number of "A faire" rows and store it in the project variables, but I don't know how to write the filter.

 

Do you have any ideas?

 

Thanks for the help


Operating system used: Windows 10


Operating system used: Windows 10

4 Replies
Manuel
Dataiker

Hi,

There are at least two ways to do the count without code:

  • In the dataset metrics, add an SQL probe that applies your condition. This produces a dataset metric holding your count (see attached images);
  • Add a Group recipe with no grouping keys and a pre-filter on your condition. This produces a dataset holding your count.

Why do you need the value as a variable?

If you only need to display it somewhere, for example in a dashboard, you can retrieve the value from either of the two suggestions above.

I hope this helps.

Richard_CDC
Level 2
Author

Hi,

I need it as a variable because I want to calculate a ratio (number of "A faire" rows / total number of rows) and put it in an email.

That's why I would like to use Python to do the count.
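
For reference, the count and ratio described here could be sketched in plain pandas as follows. This is only a minimal illustration: the helper name `count_a_faire` is hypothetical, the column name `State` and the value "A faire" are taken from the sample data in the question, and the small DataFrame stands in for what `get_dataframe()` would return inside DSS.

```python
import pandas as pd

def count_a_faire(df, column="State", value="A faire"):
    """Return (matching rows, total rows, ratio) for one column/value condition."""
    total = len(df)
    # A boolean mask sums to the number of True entries, i.e. the matching rows.
    matching = int((df[column] == value).sum())
    # Guard against an empty dataset to avoid dividing by zero.
    ratio = matching / total if total else 0.0
    return matching, total, ratio

# Stand-in for dataiku.Dataset("Affaires_technique_groupe").get_dataframe()
df = pd.DataFrame({
    "Agence": ["LYON", "GRENOBLE", "TOULON", "AIX"],
    "State": ["Fait", "A faire", "A faire", "A faire"],
})
nb_a_faire, nb_total, ratio = count_a_faire(df)
print(nb_a_faire, nb_total, ratio)  # 3 4 0.75
```

The results are plain Python `int` and `float` values, so they should be suitable to pass to `Scenario().set_project_variables(...)` in the same way as the original code does.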

Manuel
Dataiker

Hi,

You can also calculate that ratio with an SQL probe, which gives you a dataset metric.

You can then retrieve the value with our Python API: https://doc.dataiku.com/dss/latest/python-api/metrics.html.

I hope this helps.

Richard_CDC
Level 2
Author

Hi,

In the metrics I have this code. It gives me the number of rows, but I still don't know how to add a filter to it. Is that possible?

import dataiku

def process(dataset):
    df = dataset.get_dataframe()
    return {'num_rows' : df.shape[0]}
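
One possible way to add the filter to the probe above is to select the matching rows with a boolean mask before counting. The sketch below keeps the same `process(dataset)` entry point; the column name `State` and the value "A faire" are assumptions based on the sample data in the question, the `num_a_faire` key is an illustrative name, and `FakeDataset` is a hypothetical stand-in so the function can be tried outside DSS.

```python
import pandas as pd

def process(dataset):
    # Same entry point as the probe above; only the filter is new.
    df = dataset.get_dataframe()
    # Assumed column name and value, taken from the sample data in the question.
    a_faire = df[df["State"] == "A faire"]
    # Cast to plain ints so the metric values are simple Python numbers.
    return {"num_rows": int(df.shape[0]), "num_a_faire": int(a_faire.shape[0])}


class FakeDataset:
    """Hypothetical stand-in for a dataiku.Dataset, for running outside DSS."""
    def get_dataframe(self):
        return pd.DataFrame({"State": ["Fait", "A faire", "A faire", "A faire"]})


print(process(FakeDataset()))  # {'num_rows': 4, 'num_a_faire': 3}
```

Returning both counts from the same probe keeps the two numbers needed for the ratio side by side.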
