Export command
I want to export my Dataiku flow. There are several options. I do not understand why the default option does not download Managed Folder Data. Why is that? Why would I not wish to export the data?
Among the four top options, one can choose to "Export all datasets". But in that case, why would I ever choose "Export all 'uploaded' dataset" at the same time? In other words, why are there four checkboxes (which suggests that making multiple simultaneous selections makes sense) instead of radio buttons, which would indicate only one possible choice out of four? The documentation is very sparse on this matter.
Operating system used: Mac Ventura
Answers
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @Erlebacher,
Thanks for your feedback and for the suggestion about radio buttons, since the two options overlap. I would encourage you to raise this as a product idea:
https://community.dataiku.com/t5/Product-Ideas/idb-p/Product_Ideas
Project bundles offer far more granularity over what can be included. Depending on your use case for exporting, you may want to consider using bundles and the Project Deployer instead, where you can select which datasets/folders you want to include in the bundle:
https://doc.dataiku.com/dss/latest/deployment/creating-bundles.html#additional-data
Regarding project export, the managed folders option is not enabled by default in the UI because exporting a folder usually requires copying its files from remote storage (S3, Azure Blob, GCS) to the DSS instance so that they can be included in the project zip. This can require a significant amount of disk space and in many cases is not desired, as intermediate folders/datasets can be rebuilt. So it's left to the user to decide whether this is needed.
Note that you can also export projects from the API and configure the options you want to use:
https://doc.dataiku.com/dss/latest/python-api/rest-api-client/projects.html#dataikuapi.dss.project.DSSProject.get_export_stream
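As a rough illustration of the API route, here is a minimal sketch using the `dataikuapi` client. It uses `export_to_file`, the sibling of `get_export_stream` that writes the archive directly to disk. The option keys shown (`exportUploads`, `exportManagedFolders`, `exportAllDatasets`) are the ones I understand the export options to accept, so please check them against the linked documentation for your DSS version. The helper names, host URL, API key, project key, and output path are all placeholders made up for this example.

```python
def build_export_options(include_uploads=True,
                         include_managed_folders=False,
                         include_all_datasets=False):
    """Build the options dict for a project export.

    The key names are assumptions based on the documented export options;
    verify them against the DSS version you are running.
    """
    return {
        "exportUploads": include_uploads,
        "exportManagedFolders": include_managed_folders,
        "exportAllDatasets": include_all_datasets,
    }


def export_project(host, api_key, project_key, path, options):
    """Export a project to a local zip file with the given options."""
    # Imported here so the options helper can be used without the client installed.
    import dataikuapi

    client = dataikuapi.DSSClient(host, api_key)
    project = client.get_project(project_key)
    # Writes the export archive to `path` on the machine running this script.
    project.export_to_file(path, options=options)


if __name__ == "__main__":
    # Opt in to managed folder data, which the UI leaves unchecked by default.
    options = build_export_options(include_managed_folders=True)
    print(options)
    # Placeholder values; uncomment and replace with your own instance details.
    # export_project("https://dss.example.com", "YOUR_API_KEY",
    #                "MYPROJECT", "MYPROJECT_export.zip", options)
```

Keep in mind that turning on managed folders or all datasets can make the archive large, for the disk space reasons described above.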
Thanks