I was wondering if there's a best practice for snapshotting or creating regular backups of datasets. Let's say I have a flow, and at the end of the flow I have a dataset containing customer information. As customer information changes over time, I want to be able to go back in time and see when a data point changed for a customer, or identify when an error happened so that I can restore that data.
Is there any solution / best practice for this in Dataiku?
The best practice for snapshotting is to use partitioned datasets: https://doc.dataiku.com/dss/latest/partitions/index.html. Note, though, that partitioning works for snapshotting data, not as a backup strategy. For backups, you need to configure external backup systems on the database side and on the server hosting DSS: https://www.dataiku.com/learn/guide/admin/operations/backups.html.
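To make the snapshotting idea concrete, here is a minimal sketch in plain Python (deliberately not using the Dataiku API) of what a time-partitioned snapshot dataset gives you: each run writes a full copy of the customer data under that day's partition, and comparing two partitions shows when a field changed. The names `take_snapshot` and `diff_snapshots` are illustrative, not part of any Dataiku API; in DSS the dict would instead be a dataset partitioned by day, with one partition written per scenario run.

```python
# In-memory stand-in for a dataset partitioned by snapshot date.
# In DSS this would be a time-partitioned dataset, one partition per run.
snapshots = {}

def take_snapshot(records, snapshot_date):
    """Store a full copy of the customer records under a date partition."""
    snapshots[snapshot_date] = {r["customer_id"]: dict(r) for r in records}

def diff_snapshots(d1, d2):
    """Return {customer_id: {field: (old, new)}} for fields that changed
    between the two date partitions."""
    changes = {}
    old, new = snapshots[d1], snapshots[d2]
    for cid, rec in new.items():
        if cid in old:
            changed = {k: (old[cid][k], v) for k, v in rec.items()
                       if old[cid].get(k) != v}
            if changed:
                changes[cid] = changed
    return changes

# Example: customer 1's email changes between two daily runs.
take_snapshot([{"customer_id": 1, "email": "a@x.com"}], "2024-01-01")
take_snapshot([{"customer_id": 1, "email": "b@x.com"}], "2024-01-02")
print(diff_snapshots("2024-01-01", "2024-01-02"))
# {1: {'email': ('a@x.com', 'b@x.com')}}
```

The same comparison could be done in a DSS recipe that reads two partitions of the snapshot dataset; the point is that keeping dated full copies is what lets you answer "when did this value change?" later.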