This website uses cookies. By clicking OK, you consent to the use of cookies. Read our cookie policy.

Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

How can I overwrite using new list?

Hi,

I want to overwrite one columns using new list but I don't know how to do that with Dataiku. I tried a lot cases with "prepare" but it was impossible.

Here is example.

a | 18 |

b | 9 |

c | 10 |

d | 7 |

e | 12 |

And new list is like

b | 15 |

e | 4 |

I want to make new list like

a | 18 |

b | 15 |

c | 10 |

d | 7 |

e | 4 |

It's really simple but ... How can I overwrite with Dataiku?

Solutions shown first - Read whole discussion chronologically

5 Replies

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi @kishiken

In your case, there are two datasets, one with the original data (A) and another with the new values (B).

One way of achieving your desired results would be to create a join between the two.

The join should be left (A - B) or right (B-A) join.

In the computed columns you should define a formula such as this one:

`if(isBlank(B_value), A_value, B_value)`

As you can see, it will use the new value from B if one is available, or the existing one.

Good luck!

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi Liev,

Thank you very much for your quick answering! It's really helpful.

I didn't know "computed columns"...! I tried it but it was impossible because my formula was invalid. For me, it's impossible to understand formula in Dataiku....

if(isBlank(B_value), A_value, B_value)

I named two files as A and B, and I typed same formula in the box however it was invalid.

What is the precisely formula text in case I name two dataset as A and B?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

if your datasets have two columns each:

- name
- value

in the join recipe, under **selected columns** section, give the datasets prefix, let's say **a** and **b**. This would make the results of the join fields to be called

- a_name
- a_value
- b_name
- b_value

In this case the formula under **post-join computed columns** would be

```
if(isBlank(b_value), a_value, b_value)
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Hi Liev,

Thank you very much for your considerate answer.

I tried it but something went wrong...

Error message is "**ERROR** Invalid computed column: Operator cannot be translated to SQL: isBlank"

So, I changed "mode" from DSS formula to SQL expression.

The former error disappeared but anther error occurred.

"Failed to compute execution plan".

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

The reason for this is that this is indeed invalid SQL syntax.

You have two options:

- Under **Mode** leave it as DSS formula, but change the execution engine to DSS. This can be done under the RUN button of the recipe, by clicking on the little cogs.

- Under **Mode **use SQL, in which case your formula should change to something SQL compatible like:

```
COALESCE(b_value, a_value)
```