Conditions on an output / before writing into a database
So I have a job, which is basically a dataset that writes data into a PostgreSQL database.
The job would run daily, so today it would write data with values for today's date, tomorrow for tomorrow's, and so on.
How do I put a condition in Dataiku so that if there is already data with today's date in the database, it doesn't write it again?
The flow looks like:
Download recipe - folder - table in db
Thank you
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @chrishnet997,
You may want to consider using partitions in this case.
Before you sync to your SQL database, create a partitioned file-based (filesystem or cloud storage) dataset that is partitioned by day.
https://knowledge.dataiku.com/latest/kb/data-prep/partitions/partitioning-redispatch.html
So the flow could look something like:
Download Recipe -> Files in Folder -> Prepare (to truncate date today) -> Filesystem dataset -> Partitioned dataset, and then a final sync to the database. You can specify the partition to build via a scenario each time, as CURRENT_DAY or PREVIOUS_DAY, so it only writes data from that particular day to the database.
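To make the idea concrete, here is a rough sketch in plain Python of what "one run owns one day of rows" means. It is not Dataiku's API and not how the sync recipe is implemented: sqlite3 stands in for PostgreSQL so the snippet runs anywhere, and the table daily_data, the columns load_date and value, and both function names are hypothetical.

```python
import sqlite3
from datetime import date


def day_already_loaded(conn, day):
    """Return True if the table already holds at least one row for `day`."""
    row = conn.execute(
        "SELECT 1 FROM daily_data WHERE load_date = ? LIMIT 1",
        (day.isoformat(),),
    ).fetchone()
    return row is not None


def write_day(conn, day, values):
    """Replace the rows for `day`, so rerunning the job never duplicates them."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute(
            "DELETE FROM daily_data WHERE load_date = ?", (day.isoformat(),)
        )
        conn.executemany(
            "INSERT INTO daily_data (load_date, value) VALUES (?, ?)",
            [(day.isoformat(), v) for v in values],
        )


# In-memory database as a stand-in for the real PostgreSQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_data (load_date TEXT, value REAL)")

today = date.today()
write_day(conn, today, [1.0, 2.0])
write_day(conn, today, [1.0, 2.0])  # second run on the same day

row_count = conn.execute("SELECT COUNT(*) FROM daily_data").fetchone()[0]
```

If you would rather skip the write entirely when the day is already there, you could call day_already_loaded first and return early. Replacing the day's rows is usually the safer choice, since a rerun after a partial failure still leaves the table in a complete state.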
Let me know if that helps!