Compute median without code recipe
malalearning
Registered Posts: 7 ✭
Hi everyone,
We have an issue computing the median on large datasets in Dataiku. In particular, we do not know how to compute the median of the columns in a dataset without using Python or PySpark. Is there a way to do this with recipes, ideally an efficient one? Thanks, everybody.
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
Besides Python/Spark, if your datasets are stored in a SQL database, you can use a SQL recipe.
For example:
SELECT MEDIAN("tshirt_price") AS "median_tshirt_price",
       SUM("tshirt_quantity") AS "total_tshirt_quantity",
       COUNT(*) AS "total_orders"
FROM "PUBLIC"."table_name"
WHERE "tshirt_price" IS NOT NULL
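Not every SQL database provides a MEDIAN aggregate (PostgreSQL, for instance, does not). In that case the median can usually be written with the PERCENTILE_CONT ordered-set aggregate instead. The query below is only a sketch: it reuses the column and table names from the example above, and whether this exact syntax is accepted depends on your database dialect, so check your database's documentation before relying on it.

```sql
-- Median as the 50th percentile, for databases that lack MEDIAN().
-- Assumes the dialect supports PERCENTILE_CONT ... WITHIN GROUP as an
-- aggregate (e.g. PostgreSQL, Oracle, Snowflake). Some dialects only
-- offer it as a window function and need an OVER clause instead.
SELECT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "tshirt_price") AS "median_tshirt_price"
FROM "PUBLIC"."table_name"
WHERE "tshirt_price" IS NOT NULL;
```

PERCENTILE_CONT interpolates between the two middle values when the row count is even, which matches the usual definition of the median for numeric columns.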
Thanks, Alex, but our datasets are not stored in SQL.