nondeterministic writing of dates in the database

Solved!
Marek_Zak
Level 2
nondeterministic writing of dates in the database

Hello
I have a problem with unclear date conversion in Dataiku.
The scenario is as follows:

* Input table (postgresql) have column with data in text format (column type is varchar(32))

eg. content: 03-01-2019

* Next step is a SQL recipe:

select

TO_DATE(input_column, 'MM-dd-YYYY')::timestamp with time zone as output_column

from my_input_table

* Result is saves in my_output_table

 

It works but.

In one row date '03-12-2020' is saved like this: 2020-03-12 01:00:00.000

In second row date '05-01-2020' is saved like this: 2020-05-01 02:00:00.000

I do not understand why in one column some rows have different time zone than others ?!?!

0 Kudos
1 Solution
louisplt
Dataiker

I think you have to specify the time zone in your query:

 

SELECT TO_DATE(input_column, 'MM-dd-YYYY')::timestamp at time zone 'UTC' as output_column

 

 

 

 

View solution in original post

0 Kudos
5 Replies
louisplt
Dataiker

Hello @Marek_Zak,

My guess would be that this 1 hour difference comes from the daylight saving time. This explanation works for the two rows you shared.

Hope this helps

0 Kudos
Marek_Zak
Level 2
Author

Thanks for the clarification.
Now, can I somehow change this logic? Can I turn off this mechanism somewhere? There are no such surprises when doing such conversions in SQL from the database level. However, when this SQL goes through Dataiku, magic starts to happen. 🙁

0 Kudos
louisplt
Dataiker

If you want to avoid this behavior I suggest you remove the option "::timestamp with time zone" in your query.

0 Kudos
Marek_Zak
Level 2
Author

It would be great if I could do it, but Dataiku doesn't know "date" SQL type. When creating a column in the database, the "timestamp with time zone" type will do there anyway, imposing time zones for each value at its own discretion.

0 Kudos
louisplt
Dataiker

I think you have to specify the time zone in your query:

 

SELECT TO_DATE(input_column, 'MM-dd-YYYY')::timestamp at time zone 'UTC' as output_column

 

 

 

 

0 Kudos