
Error with Code Recipe Python Formula

Solved!
BigAl
Level 1
Hi, I'm a new user and I'm trying to use this Prepare recipe for the first time. I need to do a find and replace on a column, but I have a large number of different values to replace, so I created a dictionary of key (string to find): value (replacement string) pairs. I want to iterate through each row in the column. The code is:
 

def process(row):
    dict = {'string': 'replacement'}
    for string, replacement in dict.items():
        row = row.replace(string, replacement)
    return row
 
I get the error "Python runtime error/ <type 'exceptions.AttributeError'> : AttributeError("'dict' object has no attribute 'replace'",). " 
 
I can see in the documentation for cell mode that "the process(row) function receives the input row as a dict", so that is the issue, but I don't know how to solve it. Any help would be appreciated, cheers.
1 Solution
JuanE
Dataiker

Hello,

You almost got it. The row is indeed received as a dictionary, with the column names as keys, so you access the value of a given column in a row with:

row['columnName']

Let's say you want to replace some values in 'colA' of your dataset. Then your code would look like this:

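# Map each string to find to its replacement.
replacement_dict = {
    'foo': 'newfoo',
    'bar': 'newbar'
}

def process(row):
    # row is a dict keyed by column name; in cell mode,
    # the returned value becomes the new cell value.
    value = row['colA']
    for string, replacement in replacement_dict.items():
        value = value.replace(string, replacement)
    return value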

 

 

This would have the following effect on an example dataset:

 

Capture1.PNG
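
If you want to check the logic outside Dataiku first, here is a minimal sketch that calls the same kind of function on a few plain dicts. The sample rows and the 'colB' column are made up for illustration, and the None check is an assumption about how you might want to treat empty cells, not something the processor requires.

```python
replacement_dict = {
    'foo': 'newfoo',
    'bar': 'newbar',
}

def process(row):
    # row is a dict keyed by column name, as in the processor.
    value = row['colA']
    if value is None:
        # Assumption: leave missing cells untouched.
        return value
    for string, replacement in replacement_dict.items():
        value = value.replace(string, replacement)
    return value

# Made-up rows standing in for dataset rows.
sample_rows = [
    {'colA': 'foo', 'colB': 1},
    {'colA': 'a bar b', 'colB': 2},
    {'colA': 'baz', 'colB': 3},
    {'colA': None, 'colB': 4},
]

results = [process(r) for r in sample_rows]
print(results)
```

Values that contain none of the dictionary keys, such as 'baz' here, come back unchanged.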

 

Note that you don't have to use a Python function processor. You can use a "Find and replace" processor, which, in my opinion, is more user-friendly:

Capture2.PNG

I hope that helps.

2 Replies

BigAl
Level 1
Author

Thanks JuanE 😊. I wanted to use Python because I have around 400 possible replacements to search through, which would be time-consuming to set up using Find & Replace.
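
With a few hundred replacements, one possible variation (a sketch, not something suggested in the thread) is to compile all the keys into a single regular expression and replace them in one pass. The helper name build_replacer is hypothetical, and the mapping is assumed to be non-empty. Unlike chained str.replace calls, a single pass never re-replaces text that an earlier replacement produced.

```python
import re

def build_replacer(mapping):
    # Longest keys first, so a longer key wins over a shorter key that is its prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    # Each match is looked up in the mapping and replaced in a single pass.
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)

replacement_dict = {
    'foo': 'newfoo',
    'bar': 'newbar',
}
replace_all = build_replacer(replacement_dict)

def process(row):
    # 'colA' is the example column name used earlier in the thread.
    value = row['colA']
    # Assumption: leave missing cells untouched.
    return value if value is None else replace_all(value)
```

For example, with the mapping {'a': 'b', 'b': 'c'}, chained replacements would turn 'a' into 'c', while the single-pass version turns it into 'b'. Which behaviour you want depends on your data.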
