Add rows to a dataset and use it as input on the next run

scholaschl Dataiku DSS Core Concepts, Registered Posts: 8 ✭✭✭


I want to take a dataset as input, add rows to it, and then, on the next execution of the scenario, take as input the same dataset including the rows added previously (like a loop).

What would be the best way to proceed?

Thanks in advance,


Best Answer

  • nmadhu20
    nmadhu20 Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 35 Neuron
    edited July 17 Answer ✓

    Hey @scholaschl
    So you want to append one row with each scenario run to an existing dataframe. Is that about right?

    There are two ways to solve this in my experience:

    • If you want the loop: use a Python recipe and declare its output dataset as an input as well. Just create the new row array and append it. For the very first build, run the recipe without declaring the dataset as input, since reading it will throw an error while the dataset is still empty.
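    import dataiku
    import numpy as np

    # the same dataset is declared as both input and output of the recipe
    dt = dataiku.Dataset('dataset_name')
    df = dt.get_dataframe()

    # required computation
    new_arr = np.array(['col1_value', 'col2_value', 'col3_value'])  # assuming it has 3 columns

    # append the new row after the last existing row (assumes the default 0..n-1 index)
    df.loc[len(df)] = new_arr

    # write the extended dataframe back to the dataset
    dt.write_with_schema(df)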
    • If you want to avoid the loop: create a recipe that does all your computation and select the 'append instead of overwrite' option for its output dataset, so each run adds its rows instead of replacing the existing ones.
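
    To see the append logic on its own, here is a minimal pandas-only sketch that runs outside DSS. The helper name append_row, the history variable, and the column names are made up for illustration. In a real recipe the existing rows would come from get_dataframe() and the result would be written back to the dataset. It also treats "nothing there yet" as a normal case, which is one possible way to handle the first run.

```python
import pandas as pd


def append_row(existing, new_row):
    """Return a dataframe made of `existing` plus `new_row` as its last row.

    `existing` may be None or empty, which stands in for the first run,
    when no rows have been stored yet (hypothetical helper, not a DSS API).
    """
    row_df = pd.DataFrame([new_row])
    if existing is None or existing.empty:
        return row_df
    # ignore_index renumbers the rows 0..n-1 after the append
    return pd.concat([existing, row_df], ignore_index=True)


# simulate three scenario runs, each one reading the previous result
history = None
for run in range(3):
    history = append_row(history, {"run": run, "value": run * 10})
```

    Each pass through the loop plays the role of one scenario run: it takes what the previous run left behind and adds one row to it.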


    Hope it helps!



