Survey banner
The Dataiku Community is moving to a new home! We are temporary in read only mode:

How to select a column by using values of other columns

Level 1
How to select a column by using values of other columns

I want to select an existing column but the name of that column will be dynamic. Which means it should be dependent upon another columns' value. For example - Suppose I have a dataset like

ID,Year,Month,Scen,09_2023,12_2023

1,2023,09,34.5,5,6

2,2023,09,56.2,1,3

3,2023,09,51.9,2,7

Now the new dataset would be combination of `Month_Year`, so for this example it would be

ID,Year,Month,Scen,09_2023

1,2023,09,34.5,5

2,2023,09,56.2,1

3,2023,09,51.9,2

4 Replies

I don't really understand what you are asking. The only difference I see between the two sample datasets is that the last column 12_2003 is removed in the second dataset. This makes no sense with the title of the post "How to select a column by using values of other columns". Can you please clarify what exactly you are trying to achieve? Don't describe the step but the goal, there might be a better way of achieving the goal.

Dataiker

Hi,

You can do this with a formula. In formula, the strval(col_name) function returns the value of the column "col_name". So what you have to do is to build the col_name by taking month and year, so with this input:

 ID Year Month Scen 09_2023 12_2023 1 2023 09 34.5 5 6 2 2023 09 56.2 1 3 3 2023 12 51.9 2 7

You would use formula (written on multiple lines for clarity)

``````strval(
concat(
strval("Month"),
"_",
strval("Year")
)
)``````

First we use concat to build a string that is the concatenation opf the value of the month, _ and the value of the year. Then we use strval to take the value of this column, which gets us our output:

Now I get it, nice one @Clément_Stenac!

Level 1
Author

you are almost there, but I want the name of the column (viz `output`) to be `09_2023` too. Also, the value of `Month` and `Year` is same throughout the dataset.