How to ignore missing partitions when using time range dependency

MatthieuPx Registered Posts: 2

In my workflow, datasets are partitioned at day level (file-based partitioning, HDFS connector). For business reasons, some partitions are empty, for example the ones corresponding to Saturdays, Sundays and holidays.

- When using an "Equals" dependency, the recipe runs even if the partition is missing/empty in the input dataset. It looks like missing partitions are ignored automatically in this case.

- When using a "Time range" dependency (for example, the partitions for the past 7 days are used to compute the partition for the day), the recipe fails, because the partitions for the past weekend (at least) are missing. See screenshot: it can't compute the partition for the 9th of January as the

Is there any way to set up the time range dependency so that it ignores the missing partitions? That way the recipe could still run, skipping the empty partitions for weekends or holidays.

Two years ago, a similar question seems to have been raised (Solved: Re: Recipes with Time Range dependence partitions: is it possible to ignore missing partitions? - Dataiku Community), and the answer seemed to be that missing partitions cannot be ignored.

Has this situation changed? It seems to me that this past answer amounts to "you cannot use a time range dependency in Dataiku when working with daily partitions". Is this conclusion unchanged? I hope and trust I am missing something. Thank you for your help.


Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Answer ✓

    Hi @MatthieuPx
    If I am not mistaken, you are looking for the "Missing partitions as empty" option on the input dataset, which skips gaps in your partitioning so that the recipe does not fail:

    Screenshot 2023-08-09 at 10.21.29 PM.png
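
To illustrate what the option changes, here is a minimal plain-Python sketch of the behaviour described in this thread. It does not use the Dataiku API. The helper names `time_range_partitions` and `resolve_inputs`, the 7-day window ending on the target day, and the year 2023 are all assumptions made for the example.

```python
from datetime import date, timedelta


def time_range_partitions(target: date, days_back: int = 7) -> list[str]:
    """Day partitions requested for `target`.

    Assumption: the window is the `days_back` days ending on `target`
    (inclusive), listed oldest first.
    """
    return [
        (target - timedelta(days=offset)).isoformat()
        for offset in range(days_back - 1, -1, -1)
    ]


def resolve_inputs(requested, available, missing_as_empty):
    """Return the partitions that would actually be read.

    With missing_as_empty=False, any gap makes the computation fail.
    With missing_as_empty=True, gaps are skipped as if they were empty.
    """
    missing = [p for p in requested if p not in available]
    if missing and not missing_as_empty:
        raise FileNotFoundError(f"Missing input partitions: {missing}")
    return [p for p in requested if p in available]


# Hypothetical scenario: building the partition for Monday 9 January 2023
# when only weekday partitions exist in the input dataset.
requested = time_range_partitions(date(2023, 1, 9))
available = {p for p in requested if date.fromisoformat(p).weekday() < 5}
print(resolve_inputs(requested, available, missing_as_empty=True))
```

With `missing_as_empty=False`, the same call raises an error because the Saturday and Sunday partitions are absent, which mirrors the failure described in the question.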
