Is It Possible to Add Multiple/All Datasets as Additional Content to a Bundle at Once
Hello everyone!
I wanted to know if it's possible to add multiple or all datasets as additional content to a project bundle at once (e.g. by clicking and highlighting several of them) instead of manually selecting the datasets one at a time. For example, I might have a project with multiple uploaded datasets that I want to add to my project bundle in one go. Right now we can add all the datasets only by clicking on each one individually in the drop-down. I'm not sure if it's possible to select all the appropriate tables at once instead. Thanks!
Best,
Kathy
Best Answer
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,212 Dataiker
Hi @kathyqingyuxu,
There is currently no bulk select-and-add for datasets in the UI when creating a bundle, so you have to add them one by one.
If you have a lot of datasets, you can use the Python API to create a bundle that includes all of them. You can find an example here:
A slightly modified version that will add all uploaded datasets would be:
```python
import dataiku

client = dataiku.api_client()
proj_key = "<replace-project-key>"

# Requires a pre-created project variable called "version", set to a number,
# which is used as a counter for the bundles.
proj = client.get_project(proj_key)
proj_vars = proj.get_variables()
current_vers = proj_vars['standard']['version']

# Resetting raw['bundleExporterSettings'] to {} and saving the settings
# means there is no need for a dummy bundle.
settings = proj.get_settings()
raw = settings.get_raw()
raw['bundleExporterSettings'] = {}
settings.save()

# Retrieve and update export options
settings = proj.get_settings()
export_options = settings.get_raw()['bundleExporterSettings']['exportOptions']
export_options['exportUploads'] = True
export_options['exportSavedModels'] = True
export_options['exportManagedFolders'] = True

# Add every uploaded dataset; to include other types, modify the if statement
included = export_options.setdefault('includedDatasetsData', [])
for dataset in proj.list_datasets():
    if dataset['type'] == 'UploadedFiles':
        included.append({"name": dataset['name']})

# Uncomment to add models
# export_options['includedSavedModels'].append({"id": "<model_ID>"})
settings.save()

# Set the bundle ID
bundle_id = "version_" + str(current_vers)

# Create the bundle
try:
    proj.export_bundle(bundle_id)
    print(f"The bundle with ID {bundle_id} was created successfully")
except Exception:
    print("There was an error creating the bundle. See the traceback:")
    raise

# Increase the count of the version variable
proj_vars['standard']['version'] = current_vers + 1
proj.set_variables(proj_vars)
print("Increased the count for the version")
```
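If you want to include other dataset types or only datasets matching a naming pattern, the type check in the script above can be pulled out into a small helper. The sketch below is only an illustration: `select_dataset_names` and `merge_included` are hypothetical helper names (not part of the Dataiku API), and it assumes each item returned by `proj.list_datasets()` can be read like a dict with `name` and `type` keys, as the script above already does. It runs on plain sample data so you can try it outside DSS.

```python
from fnmatch import fnmatchcase


def select_dataset_names(datasets, types=("UploadedFiles",), name_pattern="*"):
    """Return names of datasets whose type is in `types` (None = any type)
    and whose name matches the shell-style `name_pattern`."""
    return [
        ds["name"]
        for ds in datasets
        if (types is None or ds["type"] in types)
        and fnmatchcase(ds["name"], name_pattern)
    ]


def merge_included(existing, names):
    """Append {"name": ...} entries to `existing`, skipping names already present."""
    seen = {entry["name"] for entry in existing}
    for name in names:
        if name not in seen:
            existing.append({"name": name})
            seen.add(name)
    return existing


# Sample data standing in for proj.list_datasets() (made-up names and types)
sample = [
    {"name": "customers_upload", "type": "UploadedFiles"},
    {"name": "orders_sql", "type": "PostgreSQL"},
    {"name": "ref_codes", "type": "UploadedFiles"},
]

included = [{"name": "ref_codes"}]  # pretend this one was already in the bundle
merge_included(included, select_dataset_names(sample))
print(included)
```

In the script above, you would pass `proj.list_datasets()` instead of `sample` and `export_options['includedDatasetsData']` instead of `included`, then call `settings.save()` as before. Passing `types=None` selects every dataset regardless of type.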