Regular snapshots of datasets

I was wondering whether there is a best practice for snapshotting datasets or creating regular backups of them. Say the last dataset in my flow contains customer information. Because that information changes over time, I want to be able to go back and see when a particular data point changed for a customer, or identify when an error was introduced so that I can restore the data.

Is there any solution / best practice for this in Dataiku?
1 Reply
Dataiker Alumni

The best practice is to use partitioned datasets: https://doc.dataiku.com/dss/latest/partitions/index.html. Note, however, that partitioning is suitable for snapshotting data but is not a backup strategy. For backups, you need to configure external backup systems both on the database side and on the server hosting DSS: https://www.dataiku.com/learn/guide/admin/operations/backups.html.
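
To make the snapshotting idea concrete, here is a minimal sketch in plain Python of what "one partition per day" buys you. It does not use the Dataiku API: the folder layout (`day=YYYY-MM-DD`), the file name `customers.csv`, the `customer_id` column, and the helper names `write_snapshot`, `read_snapshot` and `first_change` are all assumptions made for illustration. In DSS itself, a dataset partitioned on a time dimension and rebuilt on a schedule would presumably play the role of `write_snapshot`.

```python
import csv
import datetime as dt
from pathlib import Path


def write_snapshot(rows, root, day):
    """Write one day's full copy of the rows into its own partition folder."""
    part_dir = Path(root) / f"day={day.isoformat()}"  # hypothetical layout
    part_dir.mkdir(parents=True, exist_ok=True)
    path = part_dir / "customers.csv"  # hypothetical file name
    with path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return path


def read_snapshot(root, day):
    """Read back the rows stored for a given day (all values come back as strings)."""
    path = Path(root) / f"day={day.isoformat()}" / "customers.csv"
    with path.open(newline="") as f:
        return list(csv.DictReader(f))


def first_change(root, days, customer_id, field):
    """Return the first day whose value of `field` differs from the previous
    snapshot for this customer, or None if it never changed."""
    history = []
    for day in sorted(days):
        match = [r for r in read_snapshot(root, day)
                 if r["customer_id"] == customer_id]
        history.append((day, match[0][field] if match else None))
    # Compare each snapshot with the one before it.
    for (_, before), (day, after) in zip(history, history[1:]):
        if before != after:
            return day
    return None
```

Calling `first_change(root, days, "42", "email")` over a list of snapshot dates then answers the "when did this data point change?" question, and `read_snapshot` on the day before gives the value to restore. Since every snapshot lives on the same storage as the original data, this is still not a backup.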

