
Compute median without code recipe

malalearning
Level 2

Hi everyone,

We have an issue computing the median in Dataiku on large datasets. In particular, we do not know how to compute the median of columns in a dataset without using Python or PySpark. Is there a method using recipes that can achieve this, efficiently if possible? Thanks, everybody.

AlexT
Dataiker

Hi,
Besides Python/Spark, if your datasets are SQL, you can use a SQL recipe.

For example:

SELECT 
    MEDIAN("tshirt_price") as "median_tshirt_price",
    SUM("tshirt_quantity") as "total_tshirt_quantity",
    COUNT(*) as "total_orders"
FROM "PUBLIC"."table_name"
WHERE "tshirt_price" IS NOT NULL

 
Result: (screenshot of the query output, "Screen Shot 2023-04-05 at 11.45.02 AM.png")
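
Not every SQL engine provides a MEDIAN() aggregate. As a rough sketch of a fallback, the median can be computed with window functions only: rank the non-null values, then average the middle one or the middle two. The block below wraps that query in a small in-memory SQLite harness purely so it can be run and checked. The table name `orders` and the helper `median_price` are hypothetical, and the query assumes that `/` on integers is integer division (true in SQLite and PostgreSQL, not in every engine).

```python
import sqlite3

# Portable median: rank the non-null values, then average the middle row
# (odd count) or the two middle rows (even count).
# Assumption: integer "/" truncates, as it does in SQLite and PostgreSQL.
MEDIAN_SQL = """
WITH ranked AS (
    SELECT "tshirt_price" AS v,
           ROW_NUMBER() OVER (ORDER BY "tshirt_price") AS rn,
           COUNT(*) OVER () AS n
    FROM orders
    WHERE "tshirt_price" IS NOT NULL
)
SELECT AVG(v) AS median_tshirt_price
FROM ranked
WHERE rn IN ((n + 1) / 2, (n + 2) / 2)
"""


def median_price(prices):
    """Load prices into a throwaway SQLite table and run MEDIAN_SQL.

    Hypothetical helper for illustration; requires SQLite 3.25+ for
    window-function support. Returns None when there are no non-null values.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute('CREATE TABLE orders ("tshirt_price" REAL)')
        conn.executemany(
            "INSERT INTO orders VALUES (?)", [(p,) for p in prices]
        )
        return conn.execute(MEDIAN_SQL).fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(median_price([10, 20, None, 40, 30]))
```

On engines that support it, `PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "tshirt_price")` is usually a shorter alternative to the window-function form.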

malalearning
Level 2
Author

Thanks, Alex, but our datasets are not SQL.
