Added on August 23, 2022 3:53PM
How can I select a column by its position within a recipe?
Is there a way to select a column using formula language [e.g. something like Column1 rather than val("column_name")]?
Is there a way to select a column using another type of step within a prepare recipe?
Thanks
Hi @Rickh008,
You can accomplish this by using a Python recipe.
The following example will add a new column to the dataset that contains the value of the 5th column (index 4):
import dataiku

COLUMN_NAME = "nth_column"
"""Name of the column that will be created"""

COLUMN_INDEX = 4
"""Index (position) of the column whose value you want to copy

The index starts from 0, so the 2nd column has an index of 1
"""

# Read recipe inputs
input_dataset = dataiku.Dataset("INPUT_DATASET")
dataframe = input_dataset.get_dataframe()

# Create a new column where the value is the value of the nth column
dataframe[COLUMN_NAME] = dataframe.iloc[:, COLUMN_INDEX]

# Write recipe outputs
output_dataset = dataiku.Dataset("OUTPUT_DATASET")
output_dataset.write_with_schema(dataframe)
You can change the position of the column that is selected by changing the COLUMN_INDEX variable.
Once the new column is created, you can then use it in any downstream recipes.
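If you want to try the positional selection outside of DSS first, here is a minimal sketch using only pandas. The helper name copy_column_by_position and the sample data are made up for illustration and are not part of the Dataiku API. It assumes the dataset fits in memory as a DataFrame, and it adds a bounds check so an out-of-range index fails with a clear message.

```python
import pandas as pd


def copy_column_by_position(df, index, new_name):
    """Return a copy of df with a new column holding the values of the
    column at the given 0-based position (hypothetical helper)."""
    n_cols = df.shape[1]
    if not -n_cols <= index < n_cols:
        raise IndexError(
            f"Column index {index} is out of range for {n_cols} columns"
        )
    out = df.copy()
    # iloc selects by position, so the column name is never needed
    out[new_name] = df.iloc[:, index]
    return out


# Made-up sample data standing in for the recipe input
sample = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Copy the 2nd column (index 1) into a new column
result = copy_column_by_position(sample, 1, "nth_column")

# The name of the column at that position, if you need it downstream
picked_name = sample.columns[1]
```

In the recipe itself, the same function could be called on the DataFrame returned by get_dataframe() before writing the output.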
Thanks,
Zach