How to fill empty cells of a column with the value of the corresponding row from another column

Dataiku

Handling missing data is one data preparation challenge that analysts routinely face. Should you discard observations with missing values or perhaps impute missing values with a summary value like the median?

To handle missing data, the Prepare recipe has dozens of built-in processors ready to solve many of the most common challenges without any coding. In addition, Dataiku DSS has its own Formula language to craft more custom solutions.

For example, in some cases, you may want to fill the empty cells of a column with values of the corresponding rows from another column.

In a Prepare recipe, use the Formula processor with the `coalesce()` function, which returns the first of its arguments that is not empty, as shown below:

kb-coalesce-1.png
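In the Formula processor, the expression would look something like `coalesce(phone_mobile, phone_home)`, where both column names are hypothetical. To make the row-by-row behavior concrete, here is a minimal Python sketch that emulates it outside of DSS. It is not how DSS implements the function, and it assumes that an empty cell shows up as either `None` or an empty string.

```python
# Emulation of a two-column coalesce, for illustration only.
# Assumption: an "empty cell" is represented as None or "".

def coalesce(*values):
    """Return the first value that is neither None nor an empty string."""
    for value in values:
        if value is not None and value != "":
            return value
    return None  # every candidate was empty


# Hypothetical rows: fill phone_mobile from phone_home when it is empty.
rows = [
    {"phone_mobile": "555-0100", "phone_home": "555-0199"},
    {"phone_mobile": "", "phone_home": "555-0142"},
    {"phone_mobile": None, "phone_home": None},
]

filled = [coalesce(row["phone_mobile"], row["phone_home"]) for row in rows]
```

The first row keeps its own value, the second takes the value from the other column, and the third stays empty because neither column has anything to offer.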

You can also pass more than two columns, or supply a literal value to use when all of the columns are empty.

dimitri_0-1588874053641.png
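As a rough illustration of this variant, the Python sketch below tries several columns in order and ends with a literal fallback. The helper name `first_present`, the column names (`email`, `phone`), and the `"unknown"` fallback are all hypothetical, and empty cells are again assumed to be `None` or an empty string.

```python
# Emulation of a coalesce over several columns with a literal default.
# Assumption: an "empty cell" is represented as None or "".

def first_present(*candidates):
    """Return the first candidate that is neither None nor an empty string."""
    for candidate in candidates:
        if candidate is not None and candidate != "":
            return candidate
    return None


# Hypothetical rows with two possible sources for a contact value.
rows = [
    {"email": "a@example.com", "phone": "555-0100"},
    {"email": "", "phone": "555-0142"},
    {"email": None, "phone": ""},
]

# Arguments are tried left to right; the literal is used only as a last resort.
for row in rows:
    row["contact"] = first_present(row["email"], row["phone"], "unknown")

contacts = [row["contact"] for row in rows]
```

Because the arguments are checked from left to right, their order expresses the priority between columns, and placing the literal last guarantees that the new column never ends up empty.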

The Formula language gives you the flexibility to achieve more customized tasks. For example, you can combine functions in the same expression.

kb-coalesce-3.png
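The exact expression in the screenshot is not reproduced here, so the following Python sketch only illustrates the general idea of nesting functions: each candidate is cleaned first, the first non-empty one is kept, and the result is upper-cased. The `clean` helper, the column names, and the `"unknown"` fallback are hypothetical and do not correspond to specific DSS functions.

```python
# Emulation of combining several functions in one expression.
# Assumption: an "empty cell" is None, "", or a string of only whitespace.

def clean(value):
    """Strip surrounding whitespace; return None for missing or blank input."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


def coalesce(*values):
    """Return the first value that is neither None nor an empty string."""
    for value in values:
        if value is not None and value != "":
            return value
    return None


# Hypothetical rows with untidy values.
rows = [
    {"city": "  paris ", "billing_city": "Lyon"},
    {"city": "   ", "billing_city": " nice"},
    {"city": None, "billing_city": ""},
]

result = []
for row in rows:
    # Nested calls: clean each candidate, pick the first usable one, upper-case it.
    value = coalesce(clean(row["city"]), clean(row["billing_city"]), "unknown")
    result.append(value.upper())
```

Cleaning before coalescing matters here: without it, the whitespace-only value in the second row would count as present and would block the fallback to the other column.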

Where can I find more information?

  • See this article and video to learn more about using Formulas in Dataiku DSS.

What’s next?

  • You can also learn more about visual data wrangling in DSS with this series of hands-on tutorials.