I have an input dataset built from files, which I add to weekly, with a structure that includes the day each file was created.
So I partitioned this file-based dataset with time-based partitioning, using a "day" period.
What I notice now is that when I add a new file containing new rows, it creates a new partition, and the column holding the partition date has the same value for every row, which is not what I want in my case.
Whether or not a new partition is created, I would like to keep the existing value for rows where the partition date column is already set, and fill in the current date only where it is missing.
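
To make the desired behavior concrete, here is a minimal sketch in plain Python of the row-level logic I am after. It is not tied to any particular tool: the column name `partition_date`, the helper `fill_missing_partition_date`, and the sample rows are all hypothetical, and I am assuming that "not set" means the value is `None` or an empty string.

```python
from datetime import date


def fill_missing_partition_date(rows, column="partition_date", today=None):
    """Return new rows where an unset `column` is filled with today's date.

    `column` is a hypothetical name; dates are kept as ISO strings (YYYY-MM-DD).
    """
    if today is None:
        today = date.today().isoformat()
    filled = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are not modified
        # Assumption: None or "" means the date was never set.
        if new_row.get(column) in (None, ""):
            new_row[column] = today
        # Rows that already carry a date are left untouched.
        filled.append(new_row)
    return filled


# Hypothetical sample: two rows already dated, one new row without a date.
rows = [
    {"id": 1, "partition_date": "2023-01-02"},
    {"id": 2, "partition_date": None},
    {"id": 3, "partition_date": "2023-01-09"},
]
result = fill_missing_partition_date(rows, today="2023-01-16")
for row in result:
    print(row)
```

With these sample rows, only the row with `id` 2 receives the new date; the other two keep the dates they already had.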
Note: I know there is a similar question, but as I could not find an answer there, I am asking it myself.
Many thanks 😊
Operating system used: Windows