Run a scenario triggered by modification on S3 dataset.

Jesus · March 2022

Hi, I am creating a scenario for my flow and I want it to run after some S3 source datasets change. More specifically, I have two datasets that should trigger the scenario. The first one comes from a "path in bucket" in S3 and reads all files in that folder (which contains only one file, replaced monthly). The other dataset comes from a different path in the bucket and also reads all files in its folder. In this case the source is generated as x partitions (one per file), which appear in the Dataiku flow as a single S3 dataset with as many rows as the sum of the rows of the original partitions plus their corresponding headers; basically a stacking of the files.
I want the scenario to trigger on a change in either of the two datasets. For the first one, that means a change of the file that generates the dataset: the old file is removed from the folder and the new one is added from our source in S3. For the second one, it means a change of one or more of the files that make up the dataset's partitions.
I thought this could be easily solved with a "dataset modified" trigger that checks both datasets and fires when either of them changes, but this does not work. I checked some discussions in the community but did not find other ways to solve this. In case it matters, the Dataiku version I am using is 9.0.4.
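
To make the condition concrete, here is a minimal sketch of the change detection I have in mind, written in plain Python with no Dataiku or S3 calls. It assumes each folder can be listed as (key, ETag) pairs and that the previous fingerprint is stored somewhere between runs. The function names `listing_fingerprint` and `has_changed` and the sample keys are hypothetical, used only for illustration; this is not how the built-in trigger works.

```python
import hashlib
from typing import Iterable, Tuple


def listing_fingerprint(files: Iterable[Tuple[str, str]]) -> str:
    """Hash a folder listing of (key, etag) pairs, ignoring listing order."""
    digest = hashlib.sha256()
    for key, etag in sorted(files):
        digest.update(f"{key}\t{etag}\n".encode("utf-8"))
    return digest.hexdigest()


def has_changed(previous_fingerprint: str, files: Iterable[Tuple[str, str]]) -> bool:
    """Return True when the current listing differs from the stored fingerprint."""
    return listing_fingerprint(files) != previous_fingerprint


# Hypothetical listings: the monthly file is swapped for a new one,
# and one of the partition files is rewritten.
monthly_before = [("monthly/report_2022_02.csv", "etag-a")]
monthly_after = [("monthly/report_2022_03.csv", "etag-b")]
parts_before = [("parts/p1.csv", "etag-1"), ("parts/p2.csv", "etag-2")]
parts_after = [("parts/p1.csv", "etag-1"), ("parts/p2.csv", "etag-9")]

monthly_changed = has_changed(listing_fingerprint(monthly_before), monthly_after)
parts_changed = has_changed(listing_fingerprint(parts_before), parts_after)
should_run_scenario = monthly_changed or parts_changed
```

The scenario should run when either folder's fingerprint differs from the stored one, which covers both a replaced file and a modified partition file.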

Here is a screenshot of what I did:
