Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hello all,
I've been trying to populate Pivot recipe output with weighted averages and failing.
For example: if values are Vx and the corresponding weights are Wx, weighted average is the
(Sum(Vx * Wx)/Sum(Wx)).
In Excel, you create a helper column of the numerator (Vx * Wx) and create a calculated field as Sum(Helper column/Weights).
Example:
Input table:
Category | Values (Vx) | Weights (Wx) | Helper Column for Excel (Vx * Wx) |
Category 1 | 100 | 1.0 | 100 |
Category 1 | 200 | 0.5 | 100 |
Category 2 | 400 | 0.2 | 80 |
Category 2 | 150 | 0.8 | 120 |
Output table:
Category | Weighted Average Calculation | Weighted Output |
Category 1 | (100 + 100)/(1+0.5) | 133.33 |
Category 2 | (80 + 120)/(0.2 + 0.8) | 200 |
Any solutions would be greatly appreciated!
Hi @bratman,
Do you have a sample dataset representing your input, and what is the expected output? I can think of different ways of achieving what you need, but I'm not sure what kind of "pivot" are you doing first.
Also, if I understand correctly, the weighted average is Sum(Vx * Wx)/Sum(Wx), but you wrote:
Sum(Sum(Vx * Wx)/Sum(Wx))
Which looks like a sum of the weighted averages. Is that a final aggregation you do?
Cheers!
Hello @Ignacio_Toledo, I have added an example. It looks like I cannot add any computations with more than 1 column involved. Any workaround would be appreciated!
Hi @bratman,
Thanks for providing example data. Here is my solution in a recorded video.
Basically, there are 3 stages to get from the input to the output:
I hope this helps. There are other options, like using a python recipe, in which case you only need one step. But I guess the idea is to see how to do this with the visual recipes. Maybe other users have a more elegant solution?
Please let me know if you have any doubts.
Hello @Ignacio_Toledo, thanks a lot for your answer.
Is there any way I could calculate the weighted average at the visual recipe level?
For example, if I create a visualization I need weighted averages of the values (with filters) and Dataiku offers aggregates like count, min, max and average only.
Hello @bratman.
My pleasure to help with what I can.
I'm not sure if I understand correctly your question: do you mean if there is a way to calculate the weighted average in a single "visual recipe", without having to go through the "pivot" recipe?
Or you are talking about the visualization tools, like the pivot tables?