
Issues with dates in Python recipe

UserBird
Dataiker

Hi,



I had a recipe in DSS 2.3 that worked properly and doesn't work in DSS 3.0




# Read df from a dataset. "date" is a column of type "date" in DSS
# df.date is a date column

# do stuff with df

# fillna returns a new dataframe, so the result has to be assigned
df = df.fillna("")

# do stuff with df

dataset.write_with_schema(df)


In DSS 3.0, the output column is now a string, not a date anymore

Clément_Stenac

Hi,



In DSS 3.0, DSS was upgraded to Pandas 0.17, which indeed introduces a behavior change regarding fillna on date columns.



* In DSS 2.3 / Pandas 0.16, filling a date column with "" filled the column with the "NaT" value ("Not a Time") and kept the dtype; filling with any other string failed



* In DSS 3.0 / Pandas 0.17, filling a date column with any string, whether empty or not, now converts the column to the object dtype, which DSS then interprets as a string column
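
To see which of these behaviors your installed Pandas has, you can inspect the dtype before and after the fill. This is only a minimal sketch on a made-up Series (the data and variable names are assumptions, not from the original recipe). The result for the empty string may differ between Pandas versions, so the code prints it rather than assuming it.

```python
import pandas as pd

# Made-up data standing in for the DSS "date" column (assumption)
dates = pd.Series(pd.to_datetime(["2016-01-01", None, "2016-01-03"]))

before = dates.dtype
# Non-empty string: cannot be stored in a datetime column
after_placeholder = dates.fillna("missing").dtype
# Empty string, as in the recipe from the question
after_empty = dates.fillna("").dtype

print("before:", before)
print('after fillna("missing"):', after_placeholder)
print('after fillna(""):', after_empty)
```

If the dtype printed after the fill is object, DSS will write the column as a string.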



Pandas 0.16:





Pandas 0.17:





Filling a whole dataframe, containing mixed value types, with a single value is inherently dangerous. Both behaviors of Pandas are questionable, but in the end, you'd probably want to call fillna only on the columns for which it makes sense, with a properly typed value.
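
One possible way to do that is to pass fillna a dict that maps column names to fill values. A minimal sketch, assuming a made-up dataframe (the "label" and "amount" columns are hypothetical); the "date" column is left out on purpose so its missing values stay NaT and it keeps its datetime dtype:

```python
import pandas as pd

# Made-up frame: one date column, one text column, one numeric column (assumption)
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", None, "2016-01-03"]),
    "label": ["a", None, "c"],
    "amount": [1.5, None, 3.0],
})

# Fill each column with a value of its own type.
# "date" is not listed, so it is not touched.
df = df.fillna({"label": "", "amount": 0.0})

print(df.dtypes)
```

In the recipe, the dataframe can then be written with dataset.write_with_schema(df) as before, and the date column should still be detected as a date.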
