
Add a column to a dataset and update that column incrementally

deep_215

Hi

I am stuck on a problem where I need to add a column to a dataset, and that column has to be updated incrementally based on its value in the previous row.

For example:

Input table

Column 1   Column 2   Column 3
1          ABC        1/12/2022
2          DEF        3/7/2022

 

Here I want to add a new column, say Column 4, that is incremented 4 times starting from the Column 3 value.

That is, Column 3 is in date format and acts as the base value for Column 4.

Output dataset

Column 1   Column 2   Column 3    Column 4
1          ABC        1/12/2022   2/12/2022
1          ABC        1/12/2022   3/12/2022
1          ABC        1/12/2022   4/12/2022
1          ABC        1/12/2022   5/12/2022
2          DEF        3/7/2022    4/7/2022
2          DEF        3/7/2022    5/7/2022
2          DEF        3/7/2022    6/7/2022
2          DEF        3/7/2022    7/7/2022

 

If you look at row 1, Column 3 has the value 1/12/2022 and Column 4 is incremented by 1 month each time, which gives 4 output rows for row 1.

 

Thanks in advance

 

AlexGo
Dataiker

Hi,

I would do this with a custom Python step (or create a plugin).

Alternatively, you can create a new column 'i' holding the values 1, 2, 3, 4 and use the 'increment date' step:

[Screenshot: Screen Shot 2022-04-29 at 11.27.31 AM.png]

For the Python step, something like the following should work, although you'll have to adjust it to handle date formats:

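def process(row):
    # Parameters: number of output rows per input row, and the numeric column to increment
    num_rows = 4
    field_to_increment = 'price_first_item_purchased'

    ret = []
    for i in range(num_rows):
        # Copy the row so each output row is independent of the others
        newrow = dict(row)
        # Add i + 1 so successive rows hold base + 1, base + 2, ...
        newrow['new_column'] = float(row[field_to_increment]) + i + 1
        ret.append(newrow)

    return ret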
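
For the date case in the question, the same pattern could look roughly like the sketch below. It is only an illustration under a few assumptions: the column names 'Column 3' and 'Column 4' are taken from the example tables, the dates are assumed to be strings in month/day/year order (inferred from the expected output), and add_months and format_date are hypothetical helpers written here, not part of Dataiku. Days that do not exist in the target month (e.g. the 31st) are clamped to the last day of that month.

```python
import calendar
from datetime import date, datetime

# Assumptions for illustration only; adjust to your dataset.
DATE_FORMAT = "%m/%d/%Y"   # assumed month/day/year, e.g. 1/12/2022
NUM_ROWS = 4               # output rows per input row
BASE_FIELD = "Column 3"    # column holding the base date
NEW_FIELD = "Column 4"     # column to create


def add_months(d, months):
    """Hypothetical helper: shift a date forward by whole months."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day to the length of the target month (e.g. Jan 31 -> Feb 28)
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def format_date(d):
    """Hypothetical helper: format without zero padding, like the input."""
    return "{}/{}/{}".format(d.month, d.day, d.year)


def process(row):
    base = datetime.strptime(row[BASE_FIELD], DATE_FORMAT).date()
    ret = []
    for i in range(1, NUM_ROWS + 1):
        new_row = dict(row)  # copy so each output row is independent
        new_row[NEW_FIELD] = format_date(add_months(base, i))
        ret.append(new_row)
    return ret
```

Each input row is turned into NUM_ROWS output rows, with the new column set to the base date plus 1, 2, 3 and 4 months.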

 

 

 
