DSS sets TIMESTAMP_TZ and TIMESTAMP_NTZ columns as strings in output datasets schemas. However, the DATE type is considered a date in an output dataset even though it doesn't have a time. TIMESTAMP_TZ columns do have both a time and a time zone but are set to a string type. The behavior seems to differ by recipe type (in version 11.1.2). SQL Script recipes result in DATE and TIMESTAMP_LTZ columns being set as dates and usable as dates elsewhere in DSS (e.g., in charts). SQL Query recipes seem to set DATE and TIMESTAMP_LTZ as dates but they show with a meaning of Date Unparsed and can't be used as dates in Charts.
It would be great if DSS consistently worked as follows across all recipe types:
1. TIMESTAMP_TZ and TIMESTAMP_LTZ (point in time types) were both set as dates in output dataset schemas and were directly usable as dates elsewhere in DSS (e.g., in Charts).
2. The user had the option to specify that DATE and TIMESTAMP_NTZ columns (types with no time and/or time zone) be treated as dates in output dataset schemas and be directly usable as dates elsewhere in DSS.
Additionally, it would good if the output dataset schema types that were set by DSS didn't generate an error message when reviewing the output dataset schema. For example, DSS sets a Snowflake DATE column to date in the schema but then reports the following error when one clicks on the Dataset Schema tab: "Type mismatch for column 2 (AS_OF_DT) : 'TIMESTAMP' in dataset, 'DATE'(91:DATE) in table".
This is not unique to Snowflake but it affects all other sources which handle Date data types without time (Oracle, BigQuery, SQL Server, etc). The issue comes from the fact that theDateclass in Java is a representation of a specific instant in time, expressed in milliseconds. Therefore, this makes it impossible to ignore the time on a Date unless you use third party library. Personally I think Dataiku should support Date without time data types as these are extremely common both in data sources, data and Python.
We feel your pain and are currently investigating adding a new data type in DSS to support "Dates without time" values (aka DATE type in Snowflake).
Will keep you posted,
Thanks @apichery - that's great to hear. Date handling has been a real pain point for us since we moved to Snowflake (previously used Netezza and for whatever reason dates weren't much of a problem).
@Turribeach - thanks for your comment. I thought that dates had to be an instant in time too but currently the Snowflake DATE type is treated as a date (at least to some extent) while TIMESTAMP_TZ (which is an instant in time) is not. At least that's what I found in my testing.
In any case, it'd be great if all date types (including DATE and TIMESTAMP_NTZ) could be considered dates by DSS.
Only members of the Community can comment.