Hi,
I had a recipe in DSS 2.3 that worked properly but no longer works in DSS 3.0:
# Read df from a dataset. "date" is a column of type "date" in DSS
# df.date is a date column
# do stuff with df
df = df.fillna("")
# do stuff with df
dataset.write_with_schema(df)
In DSS 3.0, the output column is now a string instead of a date.
Hi,
In DSS 3.0, DSS was upgraded to Pandas 0.17, which indeed introduces a behavior change regarding fillna on date columns.
* In DSS 2.3 / Pandas 0.16, filling a date column with "" filled the missing values with "NaT" ("Not a Time") and kept the dtype. Filling with any other string, such as "anyotherstring", failed.
* In DSS 3.0 / Pandas 0.17, filling a date column with any string, whether empty or not, now converts the column to the object dtype, which DSS then interprets as a string column.
Pandas 0.16:
Pandas 0.17:
Filling a whole dataframe that contains mixed value types with a single value is inherently dangerous. Both Pandas behaviors are questionable, but in the end you probably want to call fillna only on the columns where it makes sense, with a properly typed value for each.
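As a rough sketch of that per-column approach: the dataframe below is a made-up stand-in for the recipe's input, and the column names "name" and "amount" are assumptions for illustration only. Passing a dict to fillna fills just the listed columns, so the "date" column is left untouched and keeps its datetime dtype.

```python
import pandas as pd

# Hypothetical frame standing in for the recipe's input dataset.
df = pd.DataFrame({
    "date": pd.to_datetime(["2016-01-01", None, "2016-01-03"]),
    "name": ["a", None, "c"],
    "amount": [1.5, None, 3.0],
})

# Give each column a fill value of its own type. "date" is deliberately
# not listed, so its missing values stay NaT and the dtype is preserved.
fill_values = {"name": "", "amount": 0.0}
filled = df.fillna(value=fill_values)

print(filled.dtypes)
```

In a recipe, the resulting dataframe would then be passed to dataset.write_with_schema as before, and the date column should still be written as a date.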