Add an incremental column in a dataset
The requirement is to add an incremental column to a dataset. It should not be an identity column, but the values in it must be unique.
Best Answer
-
Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
Hi,
You can use a Window recipe for this:
- In the Window definition, enable an order column
- In the Aggregations definition, compute the Row Number
- See attached image
For an overview of the Window recipe, check :
I hope this helps.
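To make the idea concrete, here is a minimal sketch in plain Python of what ordering by a column and then computing a row number amounts to. It is only an illustration and does not use the Dataiku API; the sample rows, the `signup_date` order column, and the `add_row_number` helper are all made up for the example.

```python
# Illustration only: plain Python, not the Dataiku API.
# The rows and column names below are hypothetical sample data.
rows = [
    {"customer": "c", "signup_date": "2021-03-01"},
    {"customer": "a", "signup_date": "2021-01-15"},
    {"customer": "b", "signup_date": "2021-02-10"},
]


def add_row_number(rows, order_key, out_col="row_number"):
    """Return copies of the rows sorted by order_key and numbered from 1."""
    ordered = sorted(rows, key=lambda r: r[order_key])
    return [dict(r, **{out_col: i}) for i, r in enumerate(ordered, start=1)]


numbered = add_row_number(rows, "signup_date")
for r in numbered:
    print(r)
```

Because the numbering follows the order column, the result is stable only if that column gives a deterministic order; ties would need a second sort key.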
Answers
-
Is it possible to do this in the Prepare recipe? It's not that I don't know how to use the Window recipe, but I would like to keep all changes consolidated in a single recipe if possible.
-
Using a Python code step in a Prepare recipe:
count = 0
def process(row):
    global count
    count += 1
    return count
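For anyone who wants to try the counter outside DSS first, here is a small self-contained sketch. The `sample_rows` list and the loop calling `process` are hypothetical stand-ins for the recipe invoking the function once per row. It assumes rows are handled sequentially in a single process; if the step were run in parallel or on a distributed engine, a global counter like this would probably not give unique values.

```python
count = 0


def process(row):
    """Return the next integer each time a row is processed."""
    global count
    count += 1
    return count


# Hypothetical driver: stands in for the recipe calling process() once per row.
sample_rows = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
ids = [process(r) for r in sample_rows]
print(ids)
```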
-
Umut Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1 Dataiker
Hi,
You can use the rowIndex function with the Formula processor in the Prepare recipe. I have added a sample screenshot to show how this can be calculated.
I hope this helps,
Best,
-
Hi Umut
This is a great solution, as I can now use it with the Spark engine in my Prepare recipe. May I ask how you found out about rowIndex? Is it documented anywhere?