Calculating Median

dhyadav79
Level 2

Hi,

Could you please explain how to calculate the median in Dataiku, with an example?

Thanks

1 Reply
fchataigner2
Dataiker

The simplest approach is to use a SQL database (or Spark): load the data into a table and use the database's built-in median function (most databases have one, or an equivalent percentile function).

If you don't have access to a SQL database or to Spark, you can compute it in Python with median() in Pandas, provided the data is not too large to fit into memory.

If the data is too large for memory and SQL is not an option, use a Window recipe to compute a rank and a count over a window ordered by the column whose median you want, then filter to keep the first row at or after P50 (that is, the first row whose rank reaches half of the count). That row's value should be the median.
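
Since an example was requested, here is a minimal sketch of the Pandas option, followed by the rank-and-count logic of the window approach reproduced in Pandas so it is easy to check on a small sample. The data and the column name "amount" are made up for illustration. In a Dataiku Python recipe you would typically get the dataframe with dataiku.Dataset("your_dataset").get_dataframe() instead of building it by hand.

```python
import pandas as pd

# Hypothetical sample data; "amount" is an illustrative column name.
df = pd.DataFrame({"amount": [12.0, 7.0, 3.0, 25.0, 9.0]})

# Option 1: in-memory median with Pandas.
median_pandas = df["amount"].median()

# Option 2: mimic the rank-and-count approach of the window recipe.
ranked = df.sort_values("amount").reset_index(drop=True)
ranked["rank"] = ranked.index + 1      # 1-based rank in sorted order
ranked["count"] = len(ranked)          # total number of rows

# Keep the first row whose rank reaches half of the count (P50).
at_or_after_p50 = ranked[ranked["rank"] * 2 >= ranked["count"]]
median_rank = at_or_after_p50.iloc[0]["amount"]

print(median_pandas, median_rank)
```

With an odd number of rows, both values agree. With an even number of rows, Pandas averages the two middle values, while the rank filter as written here returns the lower of the two, so small differences are expected.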