Discover all of the brand-new features and improvements to existing capabilities in the Dataiku 11.3 updateLET'S GO

Groupby and transform

Solved!
Erlebacher
Level 4
Groupby and transform

I have a dataset with USER and ITEMS. I wish to perform a groupby and use size() or count() aggregation to find the count in each group. But I wish to then create a new column in the original dataset with this count. Using pandas, this is accomplished with the `transform` method. How is this done in Dataiku using any of the recipes (hopefully, the prepare recipe). Thanks.


Operating system used: mac ventura

0 Kudos
1 Solution
MiguelangelC
Dataiker

Hi Erlebacher,

The Prepare Recipe is primarily used for row operations, though there are some processors which can do operations across rows. In order to do aggregations, the Group or Window Recipes are more appropiate. Both can do counts and other aggregations out of the box. Moreover, you can write your own custom aggregations.

Regarding performing transformations in the original dataset, this is rather counter-intuitive to the way the Flow is laid out. At times, you can still power through and set the dataset to the output of a recipe point to the same data location as the recipe feed. However, overlapping datasets can potentially cause problems to the Flow's lineage.

View solution in original post

0 Kudos
1 Reply
MiguelangelC
Dataiker

Hi Erlebacher,

The Prepare Recipe is primarily used for row operations, though there are some processors which can do operations across rows. In order to do aggregations, the Group or Window Recipes are more appropiate. Both can do counts and other aggregations out of the box. Moreover, you can write your own custom aggregations.

Regarding performing transformations in the original dataset, this is rather counter-intuitive to the way the Flow is laid out. At times, you can still power through and set the dataset to the output of a recipe point to the same data location as the recipe feed. However, overlapping datasets can potentially cause problems to the Flow's lineage.

0 Kudos