Hello,
I'm trying to convert a column of dates from UTC to CET/CEST using a Python recipe. The results are correct when I run the code in the notebook, but it seems that Dataiku coerces the dates back to UTC when writing the dataframe.
Below is code that reproduces the issue:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from pandas.tseries.offsets import DateOffset
np.random.seed(0)
random_dates = pd.to_datetime(np.random.randint(
pd.Timestamp('2000-01-01').value,
pd.Timestamp('2022-12-31').value,
size=10), unit='ns')
df = pd.DataFrame({'DateTime_raw': random_dates})
#localize as UTC
df['DateTime_UTC'] = df['DateTime_raw'].dt.tz_localize('UTC')
#convert from UTC to CET/CEST
df['DateTime_CET'] = df['DateTime_UTC'].dt.tz_convert('CET')
#convert to string
df['DateTime_UTC_string'] = df['DateTime_UTC'].astype(str)
df['DateTime_CET_string'] = df['DateTime_CET'].astype(str)
This is one row of the output:
2012-03-18T23:26:35.403Z ['DateTime_UTC']
2012-03-18T23:26:35.403Z ['DateTime_CET']
2012-03-18 23:26:35.403958002+00:00 ['DateTime_UTC_string']
2012-03-19 00:26:35.403958002+01:00 ['DateTime_CET_string']
The CET time zone is kept only if the date is stored as a string. Apparently, this is the default behavior in Dataiku. I'm wondering whether there is a solution other than storing the column as a string. Is it possible to disable this behavior?
Thanks in advance,
Matteo
Hi,
Indeed, this is currently the expected behavior. As you've already found, the way to preserve the exact date format with its time zone is to store these columns as strings.
https://doc.dataiku.com/dss/latest/preparation/dates.html#timezones-handling
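For reference, the string workaround can be wrapped in a pair of helpers. This is only a minimal sketch using plain pandas: the function names `to_local_iso_strings` and `from_local_iso_strings` are hypothetical, and the `'CET'` default is taken from the question. The first helper would be applied just before writing the dataframe, the second after reading it back in a downstream recipe.

```python
import pandas as pd

def to_local_iso_strings(utc_series, tz="CET"):
    """Render tz-aware timestamps as ISO 8601 strings carrying the local offset.

    Hypothetical helper: the column is stored as text, so the offset survives.
    """
    local = utc_series.dt.tz_convert(tz)
    return local.map(lambda ts: ts.isoformat())

def from_local_iso_strings(str_series, tz="CET"):
    """Parse the strings back into a tz-aware column in the requested zone."""
    # Parse element by element: the offsets differ between winter (+01:00)
    # and summer (+02:00), so normalize everything to UTC first.
    utc = pd.to_datetime([pd.Timestamp(s) for s in str_series], utc=True)
    return pd.Series(utc.tz_convert(tz), index=str_series.index, name=str_series.name)

# Example with one winter and one summer timestamp
utc_col = pd.Series([
    pd.Timestamp("2012-03-18 23:26:35.403958002", tz="UTC"),
    pd.Timestamp("2012-07-01 12:00:00", tz="UTC"),
])
as_text = to_local_iso_strings(utc_col)
restored = from_local_iso_strings(as_text)
```

Keeping the original UTC column next to the string column means date-based operations still work on a real date column, while the string column is used only for display or export.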