Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi Everyone,
I am in a process of moving my workflow from Alteryx to Dataiku. I am looking to resolve the percentile calculation for multiple columns. for example, As can be seen from the attached file, that I want to calculate the 10th percentile (as in columns H, I, J, and K) for four metrics (in columns D, E, F, and G).
Thank you
Operating system used: Windows
Hi @AZS1976,
You could use a Python recipe with Pandas library that contains this function to calculate any quantile https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.quantile.html#pandas-dataframe-quanti...
Below is an example Python code on how to calculate 10th percentile on more columns and write the results on an output dataset:
import dataiku
import pandas as pd
# Read recipe inputs
input = dataiku.Dataset("input_data")
input_df =input.get_dataframe()
new_row=input_df.quantile(0.1)
output_df=pd.DataFrame()
output_df=output_df.append(new_row, ignore_index=True)
# Write recipe outputs
output = dataiku.Dataset("output")
output.write_with_schema(output_df)
Hi @CatalinaS
Thank you for a quick and apt solution. I will try to run this python recipe.
Best regards
Hi @AZS1976,
You can also use the quantile processor in a Prepare recipe to get the 10th percentile.
Thanks!
Jordan
โ
Hi @CatalinaS
The Python recipe yields a single row with 10th percentile for every column. Can you please also help explain how to get to the 10th percentile for Hierarchy 2 level. e.g. if
hierarchy 1 is 'Continent'
Hierarchy 2 is 'Country'
Hierarchy 3 is 'City'
and I have to calculate and populate the data with Heirachy 2, level 10th percentile.
Thanks