Project Restoration in DSS
If a user accidentally deletes their project, I can restore it from a backup of the DSS filesystem tree.
Should I also restore the PostgreSQL database? What is its purpose?
Comments
-
Hi @SIAESDSI,
The PostgreSQL database primarily stores structured metadata and configuration, while the filesystem contains actual project data files, models, and scripts. You do not necessarily have to restore the entire PostgreSQL database unless there is corruption or loss of metadata integrity.
If your organization runs Dataiku on Cloud Stacks, you can roll back to the last snapshot.
-
Yes, the filesystem is where the actual project content lives in DSS. But PostgreSQL is not just optional metadata. It stores things like project-level settings, permissions, dataset schemas, the state of recipes, job history, and more. If you only restore the filesystem, you should be able to see the project again, but you may run into inconsistencies such as missing connections, broken settings, or objects that do not behave as expected.
The safest approach is to make sure that the filesystem and PostgreSQL are restored from the same point in time. For a one-time recovery, you might be fine if everything looks good after restoring the project directory. If users report odd behavior, missing settings, or corrupted projects, that is a sign that the PostgreSQL state is out of sync with the filesystem.
For anything important, the only safe option is to snapshot and restore both together. DSS assumes that the metadata database and the filesystem stay in step with each other.
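As a rough illustration of taking both backups together, here is a minimal Python sketch that builds a `tar` command for the data directory and a `pg_dump` command for the database under one shared timestamp. The function names, the paths, and the database name are all hypothetical placeholders, not anything DSS provides, and the sketch assumes DSS is stopped (or otherwise quiet) while the backup runs so that the two copies match.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_backup_commands(data_dir, db_name, backup_root, stamp=None):
    """Return (target_dir, fs_cmd, db_cmd) for one paired backup.

    Both artifacts are written under the same timestamped folder so it is
    obvious which filesystem archive belongs with which database dump.
    All arguments are placeholders chosen for this example.
    """
    stamp = stamp or datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    target = Path(backup_root) / stamp

    # Archive the contents of the data directory (assumed location).
    fs_cmd = [
        "tar", "-czf", str(target / "dss_data_dir.tar.gz"),
        "-C", str(data_dir), ".",
    ]

    # Dump the PostgreSQL database in pg_dump's custom format.
    db_cmd = [
        "pg_dump", "--format=custom",
        f"--file={target / 'dss_db.dump'}",
        db_name,
    ]
    return target, fs_cmd, db_cmd


def run_paired_backup(data_dir, db_name, backup_root):
    """Create the target folder and run both commands, failing on any error."""
    target, fs_cmd, db_cmd = build_backup_commands(data_dir, db_name, backup_root)
    target.mkdir(parents=True, exist_ok=False)
    subprocess.run(fs_cmd, check=True)
    subprocess.run(db_cmd, check=True)
    return target
```

Calling `run_paired_backup("/path/to/dss_data_dir", "dss_db", "/backups")` with your own values would leave one folder per backup run. When restoring, take both files from the same folder rather than mixing an archive from one run with a dump from another.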