
Is It Possible to Add Multiple/All Datasets as Additional Content to a Bundle at Once

Hello everyone!


I wanted to know if it's possible to add multiple (or all) datasets as additional content to a project bundle at once, for example by clicking and highlighting them, instead of manually selecting the datasets one at a time. My scenario: a project has multiple uploaded datasets that I want to add to my project bundle in one go. Right now we can only add the datasets by clicking on each one individually in the drop-down. I'm not sure whether it's possible to select all the appropriate tables at once instead. Thanks!





Hi @kathyqingyuxu ,

There is currently no bulk select-and-add for datasets in the UI when creating a bundle, so you have to add them one by one.

If you have a lot of datasets, you can use the Python API to create a bundle that includes all of them. You can find an example here: 

A slightly modified version that adds all uploaded datasets would be:

import dataiku

client = dataiku.api_client()
proj_key = "<replace-project-key>"

# Need to have a pre-created project variable called "version" assigned to a number as a count for the bundles
proj = client.get_project(proj_key)
proj_vars = proj.get_variables()
current_vers = proj_vars['standard']['version']

# Reset the bundle exporter settings and save them,
# so there is no need to create a dummy bundle first.
settings = proj.get_settings()
raw = settings.get_raw()
raw['bundleExporterSettings'] = {}
settings.save()

# Retrieve and update export options
# (setdefault guards against the keys being absent from the raw settings)
settings = proj.get_settings()
bundle_settings = settings.get_raw().setdefault('bundleExporterSettings', {})
export_options = bundle_settings.setdefault('exportOptions', {})
export_options['exportUploads'] = True
export_options['exportSavedModels'] = True
export_options['exportManagedFolders'] = True

# Get the list of all datasets and keep the uploaded ones;
# to include other dataset types, modify the if statement
datasets = proj.list_datasets()

included = export_options.setdefault('includedDatasetsData', [])
for dataset in datasets:
    if dataset['type'] == 'UploadedFiles':
        included.append({"name": dataset['name']})

# Uncomment to add models
#export_options.setdefault('includedSavedModels', []).append({"id": "<model_ID>"})

# Save the updated export options back to the project
settings.save()

# Setting bundleID
bundle_id = "version_" + str(current_vers)

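# Creating bundle
try:
    proj.export_bundle(bundle_id)
    print(f"The bundle with ID {bundle_id} was created successfully")
except Exception:
    print("There was an error creating the bundle. See the traceback:")
    # Re-raise so the traceback is shown and the version is not increased
    raise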

# Increasing the count of the version variable and saving it back to the project:
proj_vars['standard']['version'] = current_vers + 1
proj.set_variables(proj_vars)
print("Increased the count for the version")
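
If you want to include more than just uploaded datasets, the if statement in the loop can be pulled out into a small helper. The following is only a sketch: build_included_datasets and its parameters are hypothetical names (not part of the Dataiku API), and it assumes each item behaves like a dict with "name" and "type" keys, which is how the script above reads the result of proj.list_datasets(). It also skips names that are already listed, so running it twice does not create duplicate entries.

```python
def build_included_datasets(datasets, wanted_types=("UploadedFiles",), existing=None):
    """Return bundle entries ({"name": ...}) for datasets whose type is wanted.

    `datasets` is any iterable of dict-like items with "name" and "type" keys.
    `existing` holds entries already present, so names are not added twice.
    Pass wanted_types=None to include every dataset regardless of type.
    """
    entries = list(existing or [])
    seen = {entry["name"] for entry in entries}
    for dataset in datasets:
        # Skip datasets whose type was not requested
        if wanted_types is not None and dataset["type"] not in wanted_types:
            continue
        # Skip names that are already in the list
        if dataset["name"] in seen:
            continue
        entries.append({"name": dataset["name"]})
        seen.add(dataset["name"])
    return entries


# Stand-in data for illustration only (these are not real datasets)
sample = [
    {"name": "customers", "type": "UploadedFiles"},
    {"name": "orders_sql", "type": "PostgreSQL"},
    {"name": "customers", "type": "UploadedFiles"},
]
print(build_included_datasets(sample))
print(build_included_datasets(sample, wanted_types=None))
```

In the script above, the loop could then be replaced by assigning the result of build_included_datasets(proj.list_datasets(), existing=export_options.get('includedDatasetsData')) to export_options['includedDatasetsData'] before calling settings.save().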