Space character handling in a Dataiku dataset

Retta Dataiku DSS Core Designer, Registered Posts: 1 ✭✭✭

Hi,

I have a Greenplum database table as the input for my dataset; let's call it gp_tableA.

From this dataset, I have a Prepare recipe, which we'll call compute_gp_tableA_prepared. I use this recipe only to rename some columns and remove unwanted ones.

I noticed that in the output dataset (gp_tableA_prepared), one record that originally had ' ' (a single space) as its value was converted to null/blank.

Example:

In the Greenplum database: record 1 in ColumnA, value ' ' (a single space)

In Dataiku: record 1 in ColumnA, value '' (blank)

Are there any tips for retaining the ' ' (space) value? I need it as part of my logic.


Operating system used: Windows 10

Answers

  • Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi,

    I am not sure where the values are being replaced, and without access to Greenplum I cannot test this.

    However, as a workaround, you can replace the empty values with a space in your Prepare recipe:

    • On the column dropdown, select Analyze
    • In the "empty values" row, click the pencil icon and set the value to a single space
    • This adds a new processor to the recipe (see attached image).
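
    For anyone who prefers to see the logic spelled out, here is a minimal plain-Python sketch of what that replacement amounts to. It is not the Prepare recipe processor itself, and the function name fill_empty_with_space and the sample rows are hypothetical, made up for illustration only.

```python
def fill_empty_with_space(rows, column):
    """Return new rows where None or '' in `column` becomes a single space.

    Note: a row that lacks `column` entirely is treated as empty and
    gets the column added with a single space.
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        if new_row.get(column) in (None, ""):
            new_row[column] = " "
        filled.append(new_row)
    return filled


# Hypothetical sample rows standing in for the prepared dataset
rows = [
    {"id": 1, "ColumnA": ""},     # was ' ' in the source table, arrived blank
    {"id": 2, "ColumnA": "abc"},  # ordinary value, left unchanged
    {"id": 3, "ColumnA": None},   # genuinely missing value
]
result = fill_empty_with_space(rows, "ColumnA")
```

    One caveat with this approach: it cannot tell a value that was originally a single space from one that was empty or NULL in the source table, so both end up as ' '. If your logic needs to distinguish them, this workaround will not be enough on its own.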

    I hope this helps.
