## Sign up to take part

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

This website uses cookies. By clicking OK, you consent to the use of cookies. Read our cookie policy.

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results forย

Registered users can ask their own questions, contribute to discussions, and be part of the Community!

- Community
- ยป
- Discussions
- ยป
- Using Dataiku
- ยป

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

(Prepare Recipe) Formula Processor for range values in column of numerical and timestamp format.

So, I'm trying to create a column based on another column values, first would be on the numerical range column case I'm having.

As per the following explanation :

column_a |

0 |

1 |

2 |

3 |

4 |

5 |

... |

100 |

My objective is trying to create another column, let's just call it *column_b*, I'm aware, I could achieve this also through the mean of the bin recipe processor, but I wish to achieve them, through the formula processor instead.

So *column_a* consist of values ranging from 0 to 100. And I wish to create another column based on the range of values from the *column_a*. I've been having problem, since my code wouldn't work, as per the following example:

```
if (column_a <= 15, 0,
if(column_a > 16 && column_a <= 30, 1,
if(column_a > 31 && column_a <= 45, 2,
if(column_a > 46 && column_a <= 60, 3,
if(column_a > 61, 4)
))
)
```

I wish to assign value to column_b, per range of values coming from column_a.

- if value less or equal to 15, then assign value of 0

- if value greater then 16, but less or equal to 30, assign 1 and so forth.

And my second problem would still be similar, but this time with the Timestamp format, I wish to achieve similar table output but with different data format. As the following:

timestamp_a |

2015-01-01T00:00:00.000Z |

2015-01-02T00:00:00.000Z |

2015-01-03T00:00:00.000Z |

2015-01-04T00:00:00.000Z |

2015-01-05T00:00:00.000Z |

2015-01-06T00:00:00.000Z |

2015-01-07T00:00:00.000Z |

.... |

2020-05-31T00:00:00.000Z |

So I wish to achieve somewhat similar output, by assigning a value to another column based on the range value from the *timestamp_a,* as per the following:

```
if (timestamp_a <= โ2016-01-01T00:00:00.000Zโ, 0,
if (timestamp_a > โ2016-01-01T00:00:00.000Zโ && timestamp_a <= โ2016-12-31T00:00:00.000Zโ, 1,
if (timestamp_a > โ2017-01-01T00:00:00.000Zโ && timestamp_a <= โ2017-12-31T00:00:00.000Zโ, 2,
if (timestamp_a > โ2018-01-01T00:00:00.000Zโ && timestamp_a <= โ2018-12-31T00:00:00.000Zโ, 3,
if (timestamp_a > โ2019-01-01T00:00:00.000Zโ, 4)
))
)
```

And yes, I'm aware I could do this using another processor recipes that related to date manipulation, but I simply wish to achieve this through the mean of formula based one instead. Wish to assign new column values, based on the range of

Thanks for all the input.

Cheers,

Solutions shown first - Read whole discussion

2 Replies

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

There was a small issue in your formula. The last IF statement was missing the else clause, I think there was also a missing closing bracket.

```
if(column_a <= 15, 0,
if(column_a > 15 && column_a <= 30, 1,
if(column_a > 30 && column_a <= 45, 2,
if(column_a > 45 && column_a <= 60, 3,
if(column_a > 60, 4, "")
)
)
)
)
```

An alternative way of doing this, since your conditions are incremental, is to start from the end and work your way backwards. This will lead to a less verbose option, have a look as they produce the same outputs.

```
if(column_a > 60, 4,
if(column_a > 45, 3,
if(column_a > 30, 2,
if(column_a > 15, 1, 0)
)
)
)
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Hi @Liev, thanks. I also like the second option workaround better, really appreciate it. Would the solution also work on my timestamp data format too?

Thanks,

Didn't Find What You Needed?