## Sign up to take part

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

This website uses cookies. By clicking OK, you consent to the use of cookies. Read our cookie policy.

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results forÂ

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

- Community
- Â»
- Discussions
- Â»
- Using Dataiku
- Â»

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Weighted Average - Pivot Recipe

Hello all,

I've been trying to populate Pivot recipe output with weighted averages and failing.

For example: if values are Vx and the corresponding weights are Wx, weighted average is the

(Sum(Vx * Wx)/Sum(Wx)).

In Excel, you create a helper column of the numerator (Vx * Wx) and create a calculated field as Sum(Helper column/Weights).

__Example:__

__Input table:__

Category | Values (Vx) | Weights (Wx) | Helper Column for Excel (Vx * Wx) |

Category 1 | 100 | 1.0 | 100 |

Category 1 | 200 | 0.5 | 100 |

Category 2 | 400 | 0.2 | 80 |

Category 2 | 150 | 0.8 | 120 |

__Output table:__

Category | Weighted Average Calculation | Weighted Output |

Category 1 | (100 + 100)/(1+0.5) | 133.33 |

Category 2 | (80 + 120)/(0.2 + 0.8) | 200 |

Any solutions would be greatly appreciated!

Solutions shown first - Read whole discussion

5 Replies

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi @bratman,

Do you have a sample dataset representing your input, and what is the expected output? I can think of different ways of achieving what you need, but I'm not sure what kind of "pivot" are you doing first.

Also, if I understand correctly, the weighted average is Sum(Vx * Wx)/Sum(Wx), but you wrote:

Sum(Sum(Vx * Wx)/Sum(Wx))

Which looks like a sum of the weighted averages. Is that a final aggregation you do?

Cheers!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi @bratman,

Thanks for providing example data. Here is my solution in a recorded video.

Basically, there are 3 stages to get from the input to the output:

- Prepare recipe to create the column with the "weighted values" for each row (vx * wx)
- Group by Category, and sum the weights and weighted values
- Prepare recipe to create the column with the final weighted average.

I hope this helps. There are other options, like using a python recipe, in which case you only need one step. But I guess the idea is to see how to do this with the visual recipes. Maybe other users have a more elegant solution?

Please let me know if you have any doubts.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hello @Ignacio_Toledo, thanks a lot for your answer.

Is there any way I could calculate the weighted average at the visual recipe level?

For example, if I create a visualization I need weighted averages of the values (with filters) and Dataiku offers aggregates like count, min, max and average only.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hello @bratman.

My pleasure to help with what I can.

I'm not sure if I understand correctly your question: do you mean if there is a way to calculate the weighted average in a single "visual recipe", without having to go through the "pivot" recipe?

Or you are talking about the visualization tools, like the pivot tables?