How to ignore missing partitions when using time range dependency
In my workflow, datasets are partitioned at day level (file-based partitioning, HDFS connector). For business reasons, some partitions are empty, for example those corresponding to Saturdays, Sundays and holidays.
- When using an "Equals" dependency, the recipe runs even if the partition is missing/empty in the input dataset. It looks like some "ignoring of missing partitions" is handled automatically here.
- When using a "Time range" dependency (for example, the partitions for the past 7 days are used to compute the partition for the day), the recipe fails because the partitions for the past weekend (at least) are missing. See screenshot: it can't compute the partition for the 9th of January as the
Is there any way to set up the time range dependency to ignore the missing partitions, so that the recipe can still run and skip the empty partitions for weekends or holidays?
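To make the situation concrete, here is a minimal sketch in plain Python (standard library only, not the Dataiku API) of what a 7-day time range asks for. It assumes the window ends on the target day inclusive, that partition identifiers are ISO dates, and that only weekends are missing. The year 2023 and the helper names `time_range_partitions` and `is_weekend` are hypothetical and chosen only for illustration.

```python
from datetime import date, timedelta


def time_range_partitions(target: date, days: int = 7) -> list[str]:
    """Return daily partition ids for the `days`-day window ending on `target` (inclusive)."""
    return [
        (target - timedelta(days=offset)).isoformat()
        for offset in range(days - 1, -1, -1)
    ]


def is_weekend(partition_id: str) -> bool:
    """True if the partition's date falls on a Saturday or Sunday."""
    return date.fromisoformat(partition_id).weekday() >= 5


# Hypothetical target day: 9 January, with the year assumed to be 2023.
target = date(2023, 1, 9)

requested = time_range_partitions(target)
missing = [p for p in requested if is_weekend(p)]
available = [p for p in requested if not is_weekend(p)]

print("requested:", requested)
print("missing (weekend):", missing)
print("available:", available)
```

Under these assumptions, two of the seven requested partitions do not exist, which is enough for a strict time range dependency to fail, whereas an "Equals" dependency only ever asks for a single day.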
Two years ago, a similar question appears to have been raised (Solved: Re: Recipes with Time Range dependence partitions: is it possible to ignore missing partitions? - Dataiku Community), and the answer seemed to be that missing partitions cannot be ignored.
Has this situation changed? It seems to me that this past answer amounts to "you cannot use a time range dependency in Dataiku when working with daily partitions". Is this conclusion unchanged? I hope and trust I am missing something. Thank you for your help.
Best Answer
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @MatthieuPx,
If I am not mistaken, you are looking for the "Missing partitions as empty" option on the input dataset, which skips gaps in your partitioning without causing the recipe to fail: