Weighted Average - Pivot Recipe

bratman
Level 1
Weighted Average - Pivot Recipe

Hello all,

I've been trying to populate Pivot recipe output with weighted averages and failing.

For example: if values are Vx and the corresponding weights are Wx, weighted average is the

(Sum(Vx * Wx)/Sum(Wx)).

In Excel, you create a helper column of the numerator (Vx * Wx) and create a calculated field as Sum(Helper column/Weights).

 

Example:

Input table:

CategoryValues (Vx)Weights (Wx)Helper Column for Excel (Vx * Wx)
Category 11001.0100
Category 12000.5100
Category 24000.280
Category 21500.8120

 

Output table:

CategoryWeighted Average CalculationWeighted Output
Category 1(100 + 100)/(1+0.5)133.33
Category 2(80 + 120)/(0.2 + 0.8)200

 

Any solutions would be greatly appreciated!

0 Kudos
5 Replies
Ignacio_Toledo

Hi @bratman,

Do you have a sample dataset representing your input, and what is the expected output? I can think of different ways of achieving what you need, but I'm not sure what kind of "pivot" are you doing first.

Also, if I understand correctly, the weighted average is Sum(Vx * Wx)/Sum(Wx), but you wrote:

Sum(Sum(Vx * Wx)/Sum(Wx))

Which looks like a sum of the weighted averages. Is that a final aggregation you do?

Cheers!

bratman
Level 1
Author

Hello @Ignacio_Toledo, I have added an example. It looks like I cannot add any computations with more than 1 column involved. Any workaround would be appreciated!

0 Kudos
Ignacio_Toledo

Hi @bratman,

Thanks for providing example data. Here is my solution in a recorded video.

Basically, there are 3 stages to get from the input to the output:

  1. Prepare recipe to create the column with the "weighted values" for each row (vx * wx)
  2. Group by Category, and sum the weights and weighted values
  3. Prepare recipe to create the column with the final weighted average.

I hope this helps. There are other options, like using a python recipe, in which case you only need one step. But I guess the idea is to see how to do this with the visual recipes. Maybe other users have a more elegant solution?

Please let me know if you have any doubts.

bratman
Level 1
Author

Hello @Ignacio_Toledo, thanks a lot for your answer.

 

Is there any way I could calculate the weighted average at the visual recipe level?

 

For example, if I create a visualization I need weighted averages of the values (with filters) and Dataiku offers aggregates like count, min, max and average only. 

0 Kudos
Ignacio_Toledo

Hello @bratman

My pleasure to help with what I can.

I'm not sure if I understand correctly your question: do you mean if there is a way to calculate the weighted average in a single "visual recipe", without having to go through the "pivot" recipe? 

Or you are talking about the visualization tools, like the pivot tables? 

0 Kudos