Dates localization coerced to UTC

mfnz
Level 1

Hello,
I'm trying to localize a column of dates from UTC to CET/CEST using a Python recipe. The results are correct when I view them in the notebook, but it seems that Dataiku coerces the dates back to UTC when writing the dataframe.

Below is code that reproduces the issue:

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from pandas.tseries.offsets import DateOffset

np.random.seed(0)
random_dates = pd.to_datetime(np.random.randint(
    pd.Timestamp('2000-01-01').value,
    pd.Timestamp('2022-12-31').value,
    size=10), unit='ns')

df = pd.DataFrame({'DateTime_raw': random_dates})

#localize as UTC
df['DateTime_UTC'] = df['DateTime_raw'].dt.tz_localize('UTC')

#convert from UTC to CET/CEST
df['DateTime_CET'] = df['DateTime_UTC'].dt.tz_convert('CET')

#convert to string
df['DateTime_UTC_string'] = df['DateTime_UTC'].astype(str)
df['DateTime_CET_string'] = df['DateTime_CET'].astype(str)

This is one row of the output:

2012-03-18T23:26:35.403Z ['DateTime_UTC']
2012-03-18T23:26:35.403Z ['DateTime_CET']
2012-03-18 23:26:35.403958002+00:00 ['DateTime_UTC_string']
2012-03-19 00:26:35.403958002+01:00 ['DateTime_CET_string']

The CET localization is kept only if the date is stored as a string. Apparently, this is the default behavior in Dataiku. Is there a solution other than storing the column as a string? Is it possible to disable this behavior?

Thanks in advance,

Matteo

AlexT
Dataiker

Hi,
Indeed, this is currently the expected behavior. As you've already found, the way to preserve the exact date format with its time zone is to store these columns as strings.

https://doc.dataiku.com/dss/latest/preparation/dates.html#timezones-handling
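For what it's worth, the conversion to UTC does not change the instant in time, only the offset used for display. A minimal pandas-only sketch of one possible workaround, assuming the column comes back as UTC when the dataset is read again: re-apply the zone after reading, and keep a string copy only where the offset itself must be stored. The `stored` dataframe here is a hypothetical stand-in for data read back from a dataset, not a Dataiku API call.

```python
import pandas as pd

# Hypothetical stand-in for a column read back from a dataset: values are in UTC.
stored = pd.DataFrame({
    "DateTime_UTC": pd.to_datetime(["2012-03-18 23:26:35.403"]).tz_localize("UTC")
})

# Re-apply the zone after reading. The instant is unchanged; only the
# offset used for display differs.
stored["DateTime_CET"] = stored["DateTime_UTC"].dt.tz_convert("CET")

# A string copy keeps the offset if it has to be written out as-is.
stored["DateTime_CET_string"] = stored["DateTime_CET"].astype(str)

# Both columns compare equal because they refer to the same instants.
same_instant = (stored["DateTime_UTC"] == stored["DateTime_CET"]).all()
print(same_instant)
print(stored["DateTime_CET_string"].iloc[0])
```

With this approach the date column can stay a real date type in the dataset, and the CET/CEST view is rebuilt in whichever recipe or notebook needs it.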