Replace values in multiple columns with respective column name
Hi All,
I am wondering if there is an easy way to replace values in multiple columns with the respective column name. It would be easy to use a formula or the replace processor, but I have 200+ columns, so I am looking for a way to do this all at once, without typing in each column name (for example, select the columns and say: wherever there is a value in that column, replace it with the column name).
To be more specific, I have a dataset with over 200 columns like the ones in the screenshot. For example, in the first column I want to replace 1 with SA, in the second column replace 1 with US, and so on (screenshot attached).
Is this possible?
Thanks
Best Answer
Hi @crisdrgmr,
You can use a Python recipe with the code below to replace the values with the respective column names in one pass:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
input = dataiku.Dataset("input")
df = input.get_dataframe()

for i in range(len(df)):
    for col in df.columns:
        if df.loc[i, col] == 1:
            df.loc[i, col] = str(col)

# Write recipe outputs
output = dataiku.Dataset("output")
output.write_with_schema(df)
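With 200+ columns and many rows, the cell-by-cell loop can get slow. A column-wise variant might look like the sketch below. It is not part of the original answer: the helper name `replace_ones_with_column_names` and the small `sample` frame are made up for illustration, and it assumes the value to replace is exactly 1, as in the question. Inside a Dataiku recipe, `df` would still come from `get_dataframe()` as in the code above.

```python
import pandas as pd


def replace_ones_with_column_names(df, columns=None):
    """Return a copy of df where every cell equal to 1 holds its column name.

    Hypothetical helper for illustration; `columns` optionally restricts
    the replacement to a subset of columns.
    """
    columns = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    for col in columns:
        # Cast to object so strings can sit next to the remaining numbers.
        values = out[col].astype(object)
        # Keep cells that are not 1; put the column name everywhere else.
        out[col] = values.where(out[col] != 1, str(col))
    return out


# Tiny stand-in for the real dataset (assumed shape: 0/1 flags per column).
sample = pd.DataFrame({"SA": [1, 0, 1], "US": [0, 1, 1]})
result = replace_ones_with_column_names(sample)
print(result)
```

This still loops over the columns, but each column is handled in a single pandas operation instead of one `df.loc` lookup per cell, and the input frame is left unchanged.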
Answers
@CatalinaS
Thank you! Not sure why I didn't think about using a Python recipe. Probably because I was already using a Prepare recipe and hoped there was an easy way to do it there, so I wouldn't need another recipe. But yes, I agree, I can use a separate recipe for this. Thanks!