Is anyone familiar with a Dataiku visual recipe or formula that will replicate the following Python code?
df.drop_duplicates(subset=['col1'], keep='first', inplace=True)
The Distinct recipe does not quite accomplish this, as it either removes duplicates based on all columns or returns only the subset of columns you selected.
Hi,
A group recipe could be used to do this task.
Make 'col1' the group key and, for each of the other columns, select the option that keeps the first value.
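To make the idea concrete, here is a rough pandas sketch of the same logic: group on 'col1' and keep the first value of every other column. The sample data and the column names 'col2' and 'col3' are made up for illustration, and this only mirrors the concept, not how Dataiku runs the recipe internally.

```python
import pandas as pd

# Hypothetical sample data; only 'col1' comes from the original question.
df = pd.DataFrame({
    "col1": ["a", "b", "a", "c", "b"],
    "col2": [1, 2, 3, 4, 5],
    "col3": ["x", "y", "z", "w", "v"],
})

# Approach from the question (without inplace, so df stays untouched).
deduped = df.drop_duplicates(subset=["col1"], keep="first").reset_index(drop=True)

# Group-based version: 'col1' is the key, every other column keeps its first value.
# sort=False preserves the order in which the keys first appear.
grouped = df.groupby("col1", sort=False, as_index=False).first()

print(deduped)
print(grouped)
```

One caveat on the pandas side: `GroupBy.first()` takes the first non-null value per column, so if the data contains nulls the two results can differ. It may be worth checking how the group recipe treats empty values in your own data.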