Read the trailing 14 days of data from a partitioned S3 location laid out as /load_date=YYYY-MM-DD/load_hour=HH

Ankita
Ankita Registered Posts: 2

I want to read the trailing 14 days of data from S3. I have already set up my S3 connection and want to read the data for the last 14 load dates, and only the 24th load_hour. How can I apply this filter while reading from the S3 location through the connection I have set up?
I need to do this for multiple data sources, so reading them individually is a big task: I would have to enter these dates manually for all 11 data sources.

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,349 Dataiker

    Hi,
    You can use a partitioned dataset: https://doc.dataiku.com/dss/latest/partitions/fs_datasets.html
    Then, in the partition dependency, you can specify the last 14 days.
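
    To make the target concrete, here is a minimal sketch in plain Python (no Dataiku API involved) that lists the path prefixes a trailing 14-day window would cover under the /load_date=YYYY-MM-DD/load_hour=HH layout. The function name, the choice of end date, and the reading of "24th load_hour" as HH=23 are all assumptions made for illustration; adjust them to match how your data is actually written.

```python
from datetime import date, timedelta


def trailing_partition_prefixes(end_date, days=14, hour=23):
    """Return path prefixes for the last `days` load dates, oldest first.

    Hypothetical helper: `end_date` is included in the window, and `hour`
    defaults to 23 on the assumption that the "24th load_hour" means the
    last hour of the day in a 00-23 numbering.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    prefixes = []
    for offset in range(days - 1, -1, -1):
        load_date = end_date - timedelta(days=offset)
        prefixes.append(
            f"/load_date={load_date.isoformat()}/load_hour={hour:02d}"
        )
    return prefixes


# Example window ending on an arbitrary, fixed date.
prefixes = trailing_partition_prefixes(date(2024, 3, 14))
for prefix in prefixes:
    print(prefix)
```

    The same list can be reused across all 11 data sources, since only the root location differs between them. Inside DSS, the partition dependency on the recipe is what selects this window, so a script like this is mainly useful for checking which folders you expect to be picked up.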
