How can I overwrite using new list?

kishiken
Level 1
How can I overwrite using new list?

Hi,

I want to overwrite one columns using new list but I don't know how to do that with Dataiku. I tried a lot cases with "prepare" but it was impossible.

Here is example. 

a18
b9
c10
d7
e12

 

And new list is like

b15
e4

 

I want to make new list like

a18
b15
c10
d7
e4

 

It's really simple but ... How can I overwrite with Dataiku?

 

0 Kudos
5 Replies
Liev
Dataiker Alumni

Hi @kishiken 

In your case, there are two datasets, one with the original data (A) and another with the new values (B). 
One way of achieving your desired results would be to create a join between the two.

The join should be left (A - B) or right (B-A) join.

In the computed columns you should define a formula such as this one:

if(isBlank(B_value), A_value, B_value)

As you can see, it will use the new value from B if one is available, or the existing one.

Good luck!

0 Kudos
kishiken
Level 1
Author

Hi Liev,
Thank you very much for your quick answering! It's really helpful.

I didn't know "computed columns"...! I tried it but it was impossible because my formula was invalid. For me, it's impossible to understand formula in Dataiku....

if(isBlank(B_value), A_value, B_value)

 I named two files as A and B, and I typed same formula in the box however it was invalid.


What is the precisely formula text in case I name two dataset as A and B?

0 Kudos
Liev
Dataiker Alumni

if your datasets have two columns each:

  • name
  • value

in the join recipe, under selected columns section, give the datasets prefix, let's say a and b. This would make the results of the join fields to be called

  • a_name
  • a_value
  • b_name
  • b_value

In this case the formula under post-join computed columns would be

if(isBlank(b_value), a_value, b_value)

 

0 Kudos
kishiken
Level 1
Author

Hi Liev,

Thank you very much for your considerate answer. 

I tried it but something went wrong...

Error message is "ERROR Invalid computed column: Operator cannot be translated to SQL: isBlank"

So, I changed "mode" from DSS formula to SQL expression.

The former error disappeared but anther error occurred.

"Failed to compute execution plan".

0 Kudos
Liev
Dataiker Alumni

The reason for this is that this is indeed invalid SQL syntax.

You have two options:

- Under Mode leave it as DSS formula, but change the execution engine to DSS. This can be done under the RUN button of the recipe, by clicking on the little cogs.

- Under Mode use SQL, in which case your formula should change to something SQL compatible like:

COALESCE(b_value, a_value)

 

0 Kudos