## Sign up to take part

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

This website uses cookies. By clicking OK, you consent to the use of cookies. Read our cookie policy.

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Solved!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Filter and Replace on condition based in another column

Hey everyone.

I've searched around but couldn't find a solution for this problem.

I have the following data:

Description | Class |

Cable Repair | 1 |

Cable Repair | 20 |

Cable Repair | 20 |

Cable Repair | 1 |

Cable Charge | 10 |

Cable Adaptor | 1 |

Let's say i want to find and replace the values of Class when it is 20 and the Description is Cable Repair, leaving others as they are.

How would one go and create this condition on DSS?

As fair as i know, the Find and Replace Processor has not parameter to use other columns as conditions, does it? Is there any other processor i could use?

Thanks!

1 Solution

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

One of the solution could be to use the **Formula step** in a **Prepare recipe**. For example to have the behaviour you requested one could use the following formula (I chose 100 as the replacing value):

`if(and(Description=="Cable Repair", Class==20), 100, Class)`

As a side note, on a **Formula step** you have the possibility to click on **Edit** to have a dedicated editor with helpers to build your formula and once the formula is ready you have to click **Save** to apply the formula from the editor.

More information on the **Formula** can be found in the documentation: https://doc.dataiku.com/dss/latest/preparation/processors/formula.html

I hope this helps.

Regards,

Mickaël | R&D at Dataiku

Solutions shown first - Read whole discussion

4 Replies

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi,

One of the solution could be to use the **Formula step** in a **Prepare recipe**. For example to have the behaviour you requested one could use the following formula (I chose 100 as the replacing value):

`if(and(Description=="Cable Repair", Class==20), 100, Class)`

As a side note, on a **Formula step** you have the possibility to click on **Edit** to have a dedicated editor with helpers to build your formula and once the formula is ready you have to click **Save** to apply the formula from the editor.

More information on the **Formula** can be found in the documentation: https://doc.dataiku.com/dss/latest/preparation/processors/formula.html

I hope this helps.

Regards,

Mickaël | R&D at Dataiku

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thanks, that surely helped, I had no idea about the Formula processor.

Now the problem I'm facing is regarding using that same condition but with a regex match, like so:

if(and(match(Description, '^Cable\s{0,}Rep.{0,}'), Class== 30), 42, Class)

But nothing is changed after applying the this processor. I'm using this documentation to try and make it work.

Thanks.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

There are two things regarding the formula you proposed:

- match returns an arrays of matches so you should consider for example that the size of the array is equals to 1
- you need to double the \ for the space code: \\s

So a valid formula would be:

`if(and(match(Description, '^Cable\\s{0,}Rep.{0,}').length()==1, Class==30), 42, Class)`

To easily debug your formula, I suggest to test each term to ensure they are correctly written. For example in this case, one can start outputting the result of **match(Description, '^Cable\\s{0,}Rep.{0,}')** that is an array and from there adjust the formula.

I hope this will help you write the proper formula for your use case.

Regards,

Mickaël | R&D at Dataiku

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Thanks, that did the trick! I had no idea about needing the double left slash.

Thanks for your time!