Conditions on an output / before writing into a database

chrishnet997
chrishnet997 Registered Posts: 4 ✭✭✭

I have a job that is basically a dataset writing data into a PostgreSQL database.

The job would run daily, so today it would write data with values for today's date, tomorrow for tomorrow's date, and so on.

How do I add a condition in Dataiku so that, if there is already data with today's date in the database, it is not written again?

The flow looks like:

Download recipe - folder - table in db

Thank you

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,212 Dataiker

    Hi @chrishnet997,

    You may want to consider using partitions in this case.

    Before you sync to your SQL database, create a file-based dataset (filesystem or cloud storage) that is partitioned by day.

    https://knowledge.dataiku.com/latest/kb/data-prep/partitions/partitioning-redispatch.html

    So the flow could look something like:

    Download recipe -> Files in folder -> Prepare (to truncate the date to the day) -> Filesystem dataset -> Partitioned dataset -> final Sync to the database. In a scenario, you can set the partition to build to CURRENT_DAY or PREVIOUS_DAY on each run, so that only the data from that particular day is written to the database.
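For reference, the condition asked about in the question ("if today's date is already in the table, don't write it again") can be sketched in plain Python. This is only an illustration of the idea, not Dataiku's API and not what the Sync recipe does internally. The table name `daily_data`, the columns `load_date` and `value`, and the helper `write_day_once` are all hypothetical, and the standard-library `sqlite3` module stands in for PostgreSQL so the snippet runs on its own.

```python
import sqlite3
from datetime import date


def write_day_once(conn, day, values):
    """Insert `values` for `day` only if no rows exist for that day yet.

    Returns the number of rows written (0 if the day was already loaded).
    Table and column names are hypothetical placeholders.
    """
    day_str = day.isoformat()
    # Check whether any row already carries this date.
    # (sqlite3 uses "?" placeholders; psycopg2 for PostgreSQL uses "%s".)
    existing = conn.execute(
        "SELECT 1 FROM daily_data WHERE load_date = ? LIMIT 1", (day_str,)
    ).fetchone()
    if existing is not None:
        return 0  # this day's data is already present: skip the write
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO daily_data (load_date, value) VALUES (?, ?)",
            [(day_str, v) for v in values],
        )
    return len(values)


# In-memory database used as a stand-in for the real PostgreSQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_data (load_date TEXT, value REAL)")

run_day = date(2022, 5, 10)
first = write_day_once(conn, run_day, [1.0, 2.0])   # first run writes the rows
second = write_day_once(conn, run_day, [1.0, 2.0])  # re-run on the same day is skipped
```

With partitions, this kind of check is usually unnecessary, because each scenario run only builds the partition for the requested day.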

    Screenshot 2022-05-10 at 13.08.47.png

    Let me know if that helps!
