Survey banner
Share your feedback on the Dataiku documentation with this 5 min survey. Thanks!

# if statement with null values?

Solved!
Level 3
###### if statement with null values?

I have the below table (the left four columns) and would like to create a new column (called "Differences" as per below table), that is rounded to 3 decimal places

What If Statement can I use ?

I tried Round(if(Col1>0,Database-Col1,Database-Col2),3) ; but it did not work

 Category Database Col1 Col2 Differences ABC -0.0018 -0.0018 0.000 XYZ 4.49423 4.494231854 0.000
1 Solution

You could use a visual recipe step to "fold multiple columns" (unpivot) these columns (In your case Col1 & Col2).  (Particularly if there are actually more columns in your real dataset that will actually need to be evaluated in order to be successful.)  This will create a table that sort of looks like this.

 Catagory Database Col_Name Value ABC -0.0018 Col1 -0.0018 XYZ 4.49423 Col 2 4.494231854

From here if there are actually multiple columns that might appear Col1, Col 2, Col 3, Col 4 you need to use a Group by a recipe to come up with an aggregate.  Or you are going to do multiple calculations for each category.

However, if there are just two columns and there is only ever one column filled.  You could do your calculation simply in a formula step from the data above.

Once this stage is done you could choose to drop the Col_Name Colum in the dataset above.  Or even unfold the data back to the original shape.  Depends on what your actual data looks like.  Lots of opportunities here.

--Tom
6 Replies
Dataiker Alumni

How about filling the empty values with zeroes, before the formula step?

If you click on the column name > More Actions > Fill Empty Rows with ...

Level 5

The Coalesce-function would be my first choice here, possibly with an extra if-call to check if there are cases where both col1 and col2 have values. Using a temporary column might also be usefull as these combinations of functions can get confusing. "isNull"/"isNotNull" with != or == could do the trick aswell because 0 (zero) is not the same as Null.

edit : i may have been a bit unclear above. Your if-condition checks whether or not a value is bigger than zero, but an empty (Null) value is not the same as zero so that will not work. "isNotNull(col1)" checks for empty values.

The Coalesce-function returns the first non-Null value of a set of values you can specify. You'll find the description of this function under "String-functions" in the formulalanguage-helpsection (https://doc.dataiku.com/dss/8.0/advanced/formula.html). or press "help" -see picture-

Hope this helps, sorry for the late response!

You could use a visual recipe step to "fold multiple columns" (unpivot) these columns (In your case Col1 & Col2).  (Particularly if there are actually more columns in your real dataset that will actually need to be evaluated in order to be successful.)  This will create a table that sort of looks like this.

 Catagory Database Col_Name Value ABC -0.0018 Col1 -0.0018 XYZ 4.49423 Col 2 4.494231854

From here if there are actually multiple columns that might appear Col1, Col 2, Col 3, Col 4 you need to use a Group by a recipe to come up with an aggregate.  Or you are going to do multiple calculations for each category.

However, if there are just two columns and there is only ever one column filled.  You could do your calculation simply in a formula step from the data above.

Once this stage is done you could choose to drop the Col_Name Colum in the dataset above.  Or even unfold the data back to the original shape.  Depends on what your actual data looks like.  Lots of opportunities here.

--Tom
Level 5

Thank you very much for adding your view on this Tom! It turns out that what you describe is a more efficient solution to a problem i faced a while ago with those multiple columns (a lot of them actually).

Life gets easier with neurons around ๐ Would be funny if Dataiku-staff introduced synapses aswell!

Community Manager

Hey @Jurre - glad you managed to get a great solution!

I suppose the community itself would be the synapses right? Connecting neurons with other neurons (and of course all of our other users!)

I hope I helped! Do you Know that if I was Useful to you or Did something Outstanding you can Show your appreciation by giving me a KUDOS?

Looking for more resources to help you use DSS effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base