Multi-column pivots - aggregate mode
Currently, multi-column pivots create columns for every combination of values across all selected pivot columns. My use case is different: I'd like just one column per pivot column value, with aggregates computed for each value independently. This would be helpful for feature generation, where I'm interested in representing the distribution of a particular dimension for each record. This effect can currently be achieved by creating one pivot recipe per pivot column, then joining the outputs together on the row identifiers.
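To make the difference concrete, here is a small plain-Python sketch of the two column layouts. The data, the column names (`customer_id`, `channel`, `category`, `amount`), and the output naming scheme are all made up for illustration; this is not how the pivot recipe names its columns.

```python
from itertools import product

# Hypothetical toy data: one row per order.
rows = [
    {"customer_id": 1, "channel": "web", "category": "books", "amount": 10.0},
    {"customer_id": 1, "channel": "store", "category": "toys", "amount": 5.0},
    {"customer_id": 2, "channel": "web", "category": "toys", "amount": 7.5},
    {"customer_id": 2, "channel": "web", "category": "games", "amount": 2.5},
]
pivot_columns = ["channel", "category"]

# Distinct values seen in each pivot column, sorted for stable output.
distinct = {col: sorted({r[col] for r in rows}) for col in pivot_columns}

# Combination layout: one output column per combination of values.
combined_columns = ["_".join(combo) for combo in product(*distinct.values())]

# Requested layout: one output column per value of each pivot column.
independent_columns = [
    f"{col}_{value}" for col in pivot_columns for value in distinct[col]
]

print(len(combined_columns), combined_columns)
print(len(independent_columns), independent_columns)
```

With 2 channels and 3 categories, the combination layout gives 2 × 3 = 6 columns while the requested layout gives 2 + 3 = 5. The gap grows quickly as more pivot columns or more distinct values are added.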
Comments
Here's an example of the current workaround:
I create multiple pivot recipes to generate my aggregates, then join the results back together to create one table with all the fields I want.
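For anyone who wants to see the same logic outside of recipes, here is a rough plain-Python equivalent of the workaround: one pivot per column, then a join on the row identifier. The helper names (`pivot_sum`, `join_on_key`), the toy data, the choice of sum as the aggregate, and filling missing cells with 0.0 are my own assumptions for this sketch.

```python
from collections import defaultdict

# Hypothetical toy data: one row per order.
rows = [
    {"customer_id": 1, "channel": "web", "category": "books", "amount": 10.0},
    {"customer_id": 1, "channel": "store", "category": "toys", "amount": 5.0},
    {"customer_id": 2, "channel": "web", "category": "toys", "amount": 7.5},
    {"customer_id": 2, "channel": "web", "category": "games", "amount": 2.5},
]

def pivot_sum(rows, key, pivot_col, value_col):
    """Single-column pivot: sum value_col per key for each value of pivot_col."""
    out = defaultdict(dict)
    for r in rows:
        name = f"{pivot_col}_{r[pivot_col]}_sum"
        out[r[key]][name] = out[r[key]].get(name, 0.0) + r[value_col]
    return out

def join_on_key(tables):
    """Outer-join the pivot outputs on the key; absent cells become 0.0 (assumption)."""
    all_columns = sorted({c for t in tables for cols in t.values() for c in cols})
    keys = sorted({k for t in tables for k in t})
    joined = {}
    for k in keys:
        merged = {}
        for t in tables:
            merged.update(t.get(k, {}))
        joined[k] = {c: merged.get(c, 0.0) for c in all_columns}
    return joined

# One pivot per pivot column, then join them back together on customer_id.
pivots = [pivot_sum(rows, "customer_id", col, "amount") for col in ["channel", "category"]]
features = join_on_key(pivots)

for customer_id, feats in features.items():
    print(customer_id, feats)
```

Each customer ends up with one row and five feature columns, which is the table I'm currently building with several recipes and a join.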
It would be great if the pivot recipe could handle this directly. I think it'd be really useful for creating a normalized set of features to train ML models on.