Want to Stop Rebuilding "Expensive" Parts of your Flow? Explicit Builds are the Answer!READ MORE

Unexpected results with "trunc" function

NikolayK
Level 2
Level 2
Unexpected results with "trunc" function

It's probably rather a bug report than a discussion, but there are chances I'm doing something wrong.

I'm calling trunc(MyDateColumn, 'd') on a column and for "2021-10-31T22:00:00.000Z" it returns "2021-10-31T01:00:00.000Z". Obviously, I expect to get "2021-10-31T00:00:00.000Z".

0 Kudos
6 Replies
CatalinaS
Dataiker
Dataiker

Hi @NikolayK ,

 

I did a quick test using DSS version 11.0.1 and  I'm not able to reproduce this.

Screenshot 2022-09-06 at 09.27.06.png

Can you share more details? Is this issue happening for all the dates of the column MyDateColumn or only for this specific date? Where are you running the function trunc(MyDateColumn, 'd')? 

0 Kudos
NikolayK
Level 2
Level 2
Author

I did some more extensive testing and the issue seems to be quite subtle. Here's what I did:

1. If I use trunc() in a Prepare recipe, it works fine.

2. It breaks when trunc() is used for a pre-computed column in a Join recipe. What is more, the incumbent dataset is stored in S3 (Parquet), and the other joined dataset is stored in PostgreSQL (not sure if it has any significance). Most of the dates are truncated properly, there are only a few problems.

Let me know if you need more details.

0 Kudos
NikolayK
Level 2
Level 2
Author

Hi @CatalinaS . After some more work on it, it seems that it's the Join recipe that has a problem. Without any pre-computed columns, it changes "2021-10-31T00:00:00.000Z" to "2021-10-31T01:00:00.000Z".

0 Kudos
CatalinaS
Dataiker
Dataiker

Thanks for the details provided! I was able to reproduce the issue. Indeed the input date "2021-10-31T00:00:00.000Z" is changed to "2021-10-31T01:00:00.000Z" after a join recipe. The issue doesn't happen if the datetime column is set to be a string rather than a date.

One solution to this is to force the working timezone for the java processes of DSS to UTC and retry. To do this, follow Adding additional Java options and create or add a javaopts section with :
 
[javaopts]
jek.additional.opts = -Duser.timezone="UTC"
 
​A run of bin/dssadmin regenerate-config and a restart of DSS will be needed afterwards.
 
0 Kudos
NikolayK
Level 2
Level 2
Author

Thanks @CatalinaS! Do you have more details about the circumstances under which this problem may occur, so that I can assess the impact?

0 Kudos
NikolayK
Level 2
Level 2
Author

@CatalinaS, also, is this a bug that you plan to fix in future releases?

0 Kudos