So I have a job that is basically a dataset writing data into a PostgreSQL database.
The job runs daily, so today it writes data with today's date, tomorrow with tomorrow's date, and so on.
How do I add a condition in Dataiku so that, if the database already contains data with today's date, it is not written again?
The flow looks like:
Download recipe -> folder -> table in DB
Thank you
Hi @chrishnet997 ,
You may want to consider using partitions in this case.
Before you sync to your SQL database, create a file-based dataset (filesystem or cloud storage) that is partitioned by day.
https://knowledge.dataiku.com/latest/kb/data-prep/partitions/partitioning-redispatch.html
So the flow could look something like:
Download recipe -> Files in folder -> Prepare (to truncate the date to the day) -> Filesystem dataset -> Partitioned dataset -> final Sync to the database. In a scenario, you can set the partition to build to CURRENT_DAY or PREVIOUS_DAY each time, so that only that day's data is written to the database.
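To picture why writing one day at a time avoids duplicates, here is a minimal Python sketch of the idea, not of what Dataiku does internally. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, and the table name daily_data, the columns load_date and value, and the helper write_day_partition are all hypothetical names chosen for illustration. The rows for one day are replaced inside a single transaction, so rerunning the job for the same day leaves the table unchanged.

```python
import sqlite3
from datetime import date


def write_day_partition(conn, day, values):
    """Replace all rows for `day` with `values` in a single transaction.

    Hypothetical helper: table and column names are assumptions.
    """
    day_str = day.isoformat()
    with conn:  # commits on success, rolls back if an exception is raised
        # Remove whatever was previously loaded for this day...
        conn.execute("DELETE FROM daily_data WHERE load_date = ?", (day_str,))
        # ...then insert the fresh rows for the same day.
        conn.executemany(
            "INSERT INTO daily_data (load_date, value) VALUES (?, ?)",
            [(day_str, v) for v in values],
        )


# In-memory SQLite database standing in for the PostgreSQL target.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_data (load_date TEXT, value REAL)")

today = date(2024, 1, 15)
write_day_partition(conn, today, [1.0, 2.0])
write_day_partition(conn, today, [1.0, 2.0])  # rerun of the same day

row_count = conn.execute("SELECT COUNT(*) FROM daily_data").fetchone()[0]
print(row_count)  # 2: the rerun did not duplicate the day's rows
```

If you would rather skip the write entirely when the day is already present, the same structure works with a SELECT COUNT(*) on load_date before the insert instead of the DELETE.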
Let me know if that helps!