Added on May 11, 2023 2:47PM
Hello,
I have an input dataset built from files that I add to every week, with a structure that includes the day each file is created.
I have therefore partitioned this file-based dataset using time-based partitioning with a "day" period.
What I notice now is that when I add a new file containing new rows, a new partition is created, and the column containing the partition date has the same value for every row in it, which is not what I want in my case.
Whether the rows end up in a new partition or not, I would like to keep the existing value for rows where the partition date column is already set, and fill in the current date only where it is missing.
Note: I know there is a similar question, but since I could not find an answer there, I am asking it myself.
Many thanks
Operating system used: Windows
Hi @Dmh911,
I am not sure I fully understand what exactly you are trying to keep.
In a partitioned dataset, the column containing the partition value is not present in the dataset itself.
You can always add it back using a Prepare recipe with the "Enrich with record context" processor: https://doc.dataiku.com/dss/latest/preparation/processors/enrich-with-record-context.html
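If the goal is to keep dates that are already set and only fill in the missing ones, the logic could be sketched as below. This is a plain Python illustration, not Dataiku API code: the function `fill_missing_partition_date`, the column name `partition_date`, and the sample rows are all hypothetical, and in a real flow the fallback date would come from the partition being built rather than being hard-coded.

```python
from datetime import date


def fill_missing_partition_date(rows, date_column, fallback_date):
    """Return new rows where an empty date_column is set to fallback_date.

    Rows that already carry a date are left untouched.
    `rows` is a list of dicts; the input list is not modified.
    """
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the caller's data stays unchanged
        # Treat a missing key, None, or an empty string as "not set"
        if new_row.get(date_column) in (None, ""):
            new_row[date_column] = fallback_date
        filled.append(new_row)
    return filled


# Hypothetical sample data: one row already dated, two rows without a date
rows = [
    {"id": 1, "partition_date": date(2023, 5, 4)},
    {"id": 2, "partition_date": None},
    {"id": 3},
]

# Hypothetical fallback: the date of the newly added file
result = fill_missing_partition_date(rows, "partition_date", date(2023, 5, 11))
```

The same rule (keep the value if present, otherwise use the partition's date) could also be applied in a Prepare recipe after the partition value has been added back as a column.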
Thanks