Added on May 12, 2021 4:42PM
Currently, multi-column pivots create a column for every combination of values across all selected pivot columns. My use case is different: I'd like one column per value of each pivot column, with the aggregates for each pivot column computed independently of the others. This would be helpful for feature generation, where I'm interested in representing the distribution of a particular dimension for each record. Today this can only be achieved by creating one pivot recipe per pivot column, then joining the outputs together on the row identifiers.
Here's an example of the current workaround:
I create multiple pivot recipes to generate my aggregates, then join the results back together to create one table with all the fields I want.
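To make the workaround concrete, here is a rough sketch of the same logic in plain Python rather than in recipes. It is only an illustration: the helper names (`pivot_counts`, `join_on_id`), the sample columns (`customer_id`, `channel`, `category`), and the choice of a count aggregate are all made up for this example and are not part of the pivot recipe.

```python
from collections import defaultdict

def pivot_counts(rows, id_key, pivot_key):
    """Pivot a single column: one output column per distinct value, holding a count."""
    values = sorted({row[pivot_key] for row in rows})
    # Every identifier starts with a zero for each value of this pivot column.
    table = defaultdict(lambda: {f"{pivot_key}_{v}_count": 0 for v in values})
    for row in rows:
        table[row[id_key]][f"{pivot_key}_{row[pivot_key]}_count"] += 1
    return dict(table)

def join_on_id(*tables):
    """Join several pivoted tables on the shared row identifier."""
    ids = sorted(set().union(*(t.keys() for t in tables)))
    joined = {}
    for row_id in ids:
        merged = {}
        for t in tables:
            merged.update(t.get(row_id, {}))
        joined[row_id] = merged
    return joined

# Hypothetical sample data: one row per purchase.
rows = [
    {"customer_id": 1, "channel": "web", "category": "books"},
    {"customer_id": 1, "channel": "web", "category": "toys"},
    {"customer_id": 1, "channel": "store", "category": "books"},
    {"customer_id": 2, "channel": "store", "category": "toys"},
]

# One pivot per pivot column, then a join on the identifier.
features = join_on_id(
    pivot_counts(rows, "customer_id", "channel"),
    pivot_counts(rows, "customer_id", "category"),
)

for customer_id, columns in features.items():
    print(customer_id, columns)
```

Each customer ends up with one count column per channel value and one per category value, and no columns for channel/category combinations.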
It would be great if the pivot recipe could handle this directly. I think it'd be really useful for creating a normalized set of features to train ML models on.