How may I select a column based off position?

Rickh008 Dataiku DSS Core Designer, Registered Posts: 15 ✭✭✭✭

How may I select a column based off position within a recipe?

Is there a way to select a column using formula language [e.g. something like Column1 rather than val("column_name")]?

Is there a way to select a column using another type of step within a prepare recipe?



Best Answer

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 17 Answer ✓

    Hi @Rickh008

    You can accomplish this by using a Python recipe.

    The following example will add a new column to the dataset that contains the value of the 5th column (index 4):

    import dataiku

    COLUMN_NAME = "nth_column"
    """Name of the column that will be created"""

    COLUMN_INDEX = 4
    """Index (position) of the column whose value you want to copy
    The index starts from 0, so the 2nd column has an index of 1
    """

    # Read recipe inputs
    input_dataset = dataiku.Dataset("INPUT_DATASET")
    dataframe = input_dataset.get_dataframe()

    # Create a new column where the value is the value of the nth column
    dataframe[COLUMN_NAME] = dataframe.iloc[:, COLUMN_INDEX]

    # Write recipe outputs
    output_dataset = dataiku.Dataset("OUTPUT_DATASET")
    output_dataset.write_with_schema(dataframe)

    You can change the position of the column that is selected by changing the COLUMN_INDEX variable.
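
To see what changing COLUMN_INDEX does without running a recipe, here is a minimal pandas-only sketch. The small in-memory DataFrame is a hypothetical stand-in for the result of get_dataframe(), and the range check is an optional addition that is not part of the recipe above.

```python
import pandas as pd

# Hypothetical stand-in for input_dataset.get_dataframe()
dataframe = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", "b", "c"],
    "score": [0.5, 0.7, 0.9],
})

COLUMN_INDEX = 2  # 3rd column, because positions start at 0

# Optional guard: fail with a clear message if the position does not exist
if not 0 <= COLUMN_INDEX < dataframe.shape[1]:
    raise IndexError(
        f"COLUMN_INDEX {COLUMN_INDEX} is out of range for "
        f"{dataframe.shape[1]} columns"
    )

# Name of the column found at that position
selected_name = dataframe.columns[COLUMN_INDEX]

# Copy the values of the column at that position into a new column
dataframe["nth_column"] = dataframe.iloc[:, COLUMN_INDEX]

print(selected_name)
print(dataframe)
```

Here dataframe.columns[COLUMN_INDEX] gives the name of the column at that position, which can be handy for logging which column was actually picked.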

    Once the new column is created, you can then use it in any downstream recipes.
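
If downstream steps should refer to every column by position (the Column1 style mentioned in the question), one possible variation is to rename all columns to positional names inside the same Python recipe. This is only a sketch under assumptions: with_positional_names and the "Column" prefix are hypothetical names chosen for illustration, not a Dataiku feature, and the original column names are lost in the renamed copy.

```python
import pandas as pd


def with_positional_names(df, prefix="Column"):
    """Return a copy of df whose columns are renamed Column1, Column2, ..."""
    renamed = df.copy()
    # Positional names start at 1 to match the Column1 style from the question
    renamed.columns = [f"{prefix}{i}" for i in range(1, df.shape[1] + 1)]
    return renamed


# Hypothetical stand-in for the recipe's input dataframe
sample = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})
positional = with_positional_names(sample)

print(list(positional.columns))
```

In a recipe, the renamed dataframe would be written to the output dataset in place of the original one, so later steps can use the positional names.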


