
Replace values in multiple columns with respective column name

Solved!
crisdrgmr
Level 2
Hi All,

I am wondering whether there is an easy way to replace values in multiple columns with the respective column name. A Formula or Replace processor would work, but I have 200+ columns, so I am looking for a way to do this in one step without typing each column name (for example, select the columns and say: wherever there is a value in that column, replace it with the column name).

To be more specific, I have a dataset with over 200 columns like the ones in the screenshot. For example, I want to replace 1 with SA in the first column, 1 with US in the second column, and so on (screenshot attached).

Is this possible?

Thanks

1 Solution
CatalinaS
Dataiker

Hi @crisdrgmr,

 

You can use a Python recipe with the code below to replace the values with the respective column names in one pass:

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
input_dataset = dataiku.Dataset("input")  # renamed so it does not shadow Python's built-in input()
df = input_dataset.get_dataframe()


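# Visit every cell; wherever the value is 1, write the column name instead.
# df.loc[i, col] assumes the default 0..n-1 row index.
for i in range(len(df)):
    for col in df.columns:
        if df.loc[i, col] == 1:
            df.loc[i, col] = str(col)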


# Write recipe outputs
output = dataiku.Dataset("output")
output.write_with_schema(df)
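
With 200+ columns and many rows, the cell-by-cell loop above can be slow. As a rough sketch of a column-wise alternative, the snippet below does the same replacement with `Series.where`. The helper name `replace_flag_with_column_name` and the small `sample` DataFrame are made up for illustration and stand in for the real recipe input, and the sketch assumes column names are unique.

```python
import pandas as pd


def replace_flag_with_column_name(df: pd.DataFrame, flag=1) -> pd.DataFrame:
    """Return a copy of df where every cell equal to `flag` holds its column name.

    Hypothetical helper, not part of the dataiku or pandas API.
    """
    # Cast to object dtype so a column can hold both numbers and strings.
    out = df.astype(object)
    for col in out.columns:
        # Series.where keeps values where the condition is True
        # and substitutes the column name everywhere else.
        out[col] = out[col].where(df[col] != flag, str(col))
    return out


# Stand-in for the DataFrame returned by get_dataframe() in the recipe.
sample = pd.DataFrame({"SA": [1, 0, 1], "US": [0, 1, 1]})
result = replace_flag_with_column_name(sample)
print(result)
```

In the recipe, the same function could be called on `df` before `output.write_with_schema(...)`. It loops over columns rather than cells and leaves the input DataFrame unchanged.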

  

crisdrgmr
Level 2
Author

@CatalinaS thank you! I am not sure why I didn't think of using a Python recipe. I was already using a Prepare recipe and hoped there was an easy way to do this there, without adding another recipe. But yes, I agree, I can use a separate recipe for this.

 

Thanks!
