How to fill empty cells of a column with the value of the corresponding row from another column

Handling missing data is one data preparation challenge that analysts routinely face. Should you discard observations with missing values or perhaps impute missing values with a summary value like the median? 

To handle missing data, the Prepare recipe has dozens of built-in processors ready to solve many of the most common challenges without any coding. In addition, Dataiku DSS has its own Formula language to craft more custom solutions.

For example, in some cases, you may want to fill the empty cells of a column with values of the corresponding rows from another column. 

In a Prepare recipe, use the Formula processor with the `coalesce()` function, which returns the first non-empty value among its arguments.

Here we fill the empty values of `col1` with the corresponding values of `col2` in a new column.
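
A minimal sketch of such an expression, assuming the columns are named `col1` and `col2` as in the caption above. The output column name is whatever you enter in the processor settings (for instance a hypothetical `col1_filled`):

```
coalesce(col1, col2)
```

Rows where `col1` already has a value keep it; only rows where `col1` is empty take the value from `col2`.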

You can also specify multiple columns, or even directly specify the missing values.

Here we fill the empty values of `col1` with the values of `col2`, or `0` when `col2` is also empty.
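
As a sketch under the same assumptions (columns named `col1` and `col2`), the fallback constant is simply added as a last argument:

```
coalesce(col1, col2, 0)
```

The arguments are checked from left to right, so `0` is used only when both `col1` and `col2` are empty for that row.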

The Formula language gives you the flexibility to achieve more customized tasks. For example, you can combine functions in the same expression.

Here we fill the empty values of `col1` with the corresponding floored values of `col2` in a new column.
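
One way this could be written, assuming `col2` holds numeric values and the columns are named as in the caption above:

```
coalesce(col1, floor(col2))
```

Here `floor()` rounds the `col2` value down to the nearest integer before it is used as the fallback. Existing `col1` values are left unchanged.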

Where can I find more information?

  • See this article and video to learn more about using Formulas in Dataiku DSS.

What’s next?

  • You can also learn more about visual data wrangling in DSS with this series of hands-on tutorials.
Last update: 05-07-2020 08:01 PM