Calculating Median

Could you please help me calculate the median in Dataiku, with an example?



The simplest approach is to use a SQL database (or Spark): load the data into a table and use the database's built-in median function (most databases have one, or an equivalent percentile function). If you don't have access to a SQL database or to Spark, you can compute it in Python with median() in pandas, provided the data fits into memory. If the data is too large for memory and SQL is not an option, use a Window recipe to compute a rank and a count over a window ordered by the column whose median you want, then filter to keep the first row past P50, i.e. the first row where rank divided by count reaches 0.5. That row's value should be the median.
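
To make the rank-and-count idea concrete, here is a minimal sketch in plain Python that mimics what the Window recipe plus filter would do, on a small in-memory list. The function name `median_by_rank` and the sample values are made up for illustration. It assumes the "rank" is a plain row number (1, 2, 3, ...) and that "past P50" means rank / count >= 0.5; a tie-aware rank could skip over the middle row when values repeat, so a row number is probably the safer choice. For the in-memory pandas case none of this is needed, since `df["your_column"].median()` does the job directly.

```python
import statistics


def median_by_rank(values):
    """Emulate the window approach: order rows, number them, keep the first row at P50."""
    ordered = sorted(values)      # window ordered by the column of interest
    count = len(ordered)          # COUNT computed over the whole window
    if count == 0:
        raise ValueError("median of an empty column is undefined")
    # Row number within the ordered window, starting at 1 (assumption: not a tie-aware rank)
    for row_number, value in enumerate(ordered, start=1):
        if row_number / count >= 0.5:   # filter: first row at or past P50
            return value


odd_sample = [7, 1, 5, 3, 9]      # hypothetical data, odd number of rows
even_sample = [4, 1, 3, 2]        # hypothetical data, even number of rows

print(median_by_rank(odd_sample))        # matches statistics.median for odd counts
print(median_by_rank(even_sample))       # lower of the two middle values
print(statistics.median(even_sample))    # averages the two middle values instead
```

One thing to be aware of: with an even number of rows, this method returns the lower of the two middle values, whereas pandas and most SQL median functions return their average. If that difference matters, you would need to keep both middle rows and average them.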