Python recipe removing null records

PM Registered Posts: 10 ✭✭✭✭

Hi all,

I noticed that when using Python recipes, records in which every column is null are removed.

To reproduce this, prepare a dataset with a single column that contains some null values. Any other recipe applied to this dataset keeps all records, but a Python recipe (even the default one, which simply copies the input) removes the records with null values.

Is this the expected behaviour?


Best Answer

  • Clément_Stenac
    Clément_Stenac Dataiker, Dataiku DSS Core Designer, Registered Posts: 753 Dataiker
    Answer ✓


    Unfortunately, this is the expected behavior in the specific case of datasets with a single column.

    This is related to how the data is serialized internally: it uses CSV. In a single-column dataset, a line contains no separator, so there is no real difference between an invalid empty line and an all-null line. This means that it would not be easy to change this behavior.
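The ambiguity described in the answer can be illustrated with a small, self-contained sketch. This is not Dataiku's internal code: `serialize` and `parse` are hypothetical helpers that mimic a naive comma-separated format in which nulls are written as empty fields and blank lines are treated as invalid and skipped.

```python
def serialize(rows):
    """Write each row as comma-joined fields; None becomes an empty field."""
    return "\n".join(
        ",".join("" if value is None else str(value) for value in row)
        for row in rows
    )


def parse(text):
    """Read rows back, skipping blank lines (assumed to be invalid)."""
    rows = []
    for line in text.split("\n"):
        if line == "":
            # An empty line carries no separator, so it cannot be told
            # apart from a single-column row whose only value is null.
            continue
        rows.append([field if field != "" else None for field in line.split(",")])
    return rows


# Two columns: the all-null row is serialized as "," and survives.
two_cols = [["a", 1], [None, None], ["b", 2]]
two_cols_back = parse(serialize(two_cols))

# One column: the all-null row is serialized as "" and is dropped.
one_col = [["a"], [None], ["b"]]
one_col_back = parse(serialize(one_col))

print(len(two_cols_back), len(one_col_back))
```

With two columns the separator keeps the all-null row visible, so all three rows come back. With one column the null row becomes a blank line and only two rows come back.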
