Survey banner
Switching to Dataiku - a new area to help users who are transitioning from other tools and diving into Dataiku! CHECK IT OUT

Counting Similar Rows

Level 1
Counting Similar Rows

I have a dataset. Four of the columns are Location Number, Employee ID, Date, and Symbol. I would like to create a 5th column that contains counts of how many times a row with matching data in all four columns appears.

For example, there are 5 rows with Location Number=308, Employee ID=37, Date=03/23/2007 and Symbol=TP. The 5th column therefore is "5".

Any suggestions on how I could go about doing this?

0 Kudos
1 Reply

you can make a window recipe, partitioned on these 4 columns, with some column as ordering column (doesn't matter which), and a window frame activated with no limit on preceding nor following rows. And in the aggregates section, activate count on some column with no empty value (employee id for example). This will add on each row the count of rows with the same combination of the 4 columns used for partitioning.