Monthly Partitioning changes partition column value

Gipple Registered Posts: 5

I am trying to set up monthly partitioning on a date column in my Snowflake database. Both the source table and the output dataset are set to monthly partitioning. Between them I have a prepare recipe where I use the time range to get a month (screenshot below). In the output, the posting_date field changes from the actual date to the first day of the month. Is this happening because I chose a monthly partition?

So I tried changing the partitioning to the day level. With that setting, I think posting_date keeps its day-level value, but the job becomes drastically slower. It now runs the same query once per day instead of once per month, and the individual queries aren't any faster, so the whole process is about 30 times slower. I am brand new to Dataiku and assume I am doing something wrong, because this seems like really strange behavior. Do I need to add a duplicate posting_date field, one for partitioning and one to keep the original values?
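To make the duplicate-column idea concrete, here is a minimal sketch in plain Python, not Dataiku's own recipe steps or API. The helper `add_partition_month` and the column name `posting_month` are hypothetical names used only for illustration. It keeps posting_date untouched and adds a separate month-level key that could serve as the partitioning column; whether this is the recommended approach in Dataiku is exactly what I am unsure about.

```python
from datetime import date


def add_partition_month(rows, date_field="posting_date", month_field="posting_month"):
    """Return copies of the rows with a month-level key added.

    The original date field is left unchanged; only a new field is added.
    Both field names are illustrative assumptions.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not mutated
        new_row[month_field] = row[date_field].strftime("%Y-%m")
        out.append(new_row)
    return out


# Made-up sample rows, only to show the shape of the result.
rows = [
    {"posting_date": date(2023, 3, 17), "amount": 10.0},
    {"posting_date": date(2023, 3, 1), "amount": 5.0},
    {"posting_date": date(2023, 4, 2), "amount": 7.5},
]

partitioned = add_partition_month(rows)

# Day-level partitioning would mean one partition per distinct date,
# month-level partitioning one per distinct month.
day_partitions = {r["posting_date"] for r in partitioned}
month_partitions = {r["posting_month"] for r in partitioned}
```

With a full month of daily data, the day-level set would hold roughly 30 entries against a single month-level entry, which matches the slowdown I am seeing.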

Here is a screenshot of posting_date being changed to the first of the month. I added a second posting_date column that is simply carried through the recipe so I could see what the value was.

Capture2.PNG

Here is the input/output:

Capture.PNG


Operating system used: Windows
