Hi,
In a DSS recipe, is it possible to add a column to a dataset with an ID giving the row number?
I am using R at the moment, with the command:
data <- tibble::rowid_to_column(data, "ID")
but it would be great to have this function built into the DSS Prepare recipe.
Thanks
Hi,
In a Prepare recipe, you could do that with a Python function step, provided that your data is not processed in parallel. If your input dataset is a filesystem dataset, try checking the "Preserve ordering" option in the input dataset's Advanced settings. Then add a Python step (in "cell" mode) to your Prepare recipe, with for instance this kind of code:
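# Counter shared across calls; only reliable if rows are processed sequentially
count = 0
def process(row):
    global count
    count = count + 1
    # In "cell" mode the returned value becomes the cell of the new column
    return count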
You can find more info on Data ordering here.
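If you would rather do this in a Python code recipe than in a Prepare step, the R one-liner from the question has a close pandas equivalent. This is only a minimal sketch, assuming the data fits in memory as a pandas DataFrame; the helper name add_row_id and the sample data are made up for illustration, and reading/writing the DSS datasets is left out:

```python
import pandas as pd

def add_row_id(df: pd.DataFrame, column: str = "ID") -> pd.DataFrame:
    """Return a copy of df with a 1-based row number as the first column.

    Hypothetical helper, roughly mirroring tibble::rowid_to_column(data, "ID").
    """
    out = df.copy()
    # insert() raises ValueError if a column with this name already exists
    out.insert(0, column, list(range(1, len(out) + 1)))
    return out

# Illustrative data only
data = pd.DataFrame({"name": ["a", "b", "c"]})
data = add_row_id(data)
print(data)
```

As with the Prepare step, the IDs simply follow the order in which the rows arrive in the DataFrame, so they are only meaningful if that order is stable.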
Since Dataiku 7, this can be done visually in the Prepare recipe: use the "Enrich record with context" processor and fill in its "Output file record column" field.