Hello everyone!
I wanted to know whether it's possible to add multiple (or all) datasets as additional content to a project bundle at once, for example by highlighting and selecting them together, rather than selecting the datasets manually one at a time. My scenario: a project has multiple uploaded datasets that I want to add to my project bundle in one go. Right now we can only add them by clicking each one individually in the drop-down. I'm not sure whether it's possible to select all the relevant tables at once instead. Thanks!
Best,
Kathy
Hi @kathyqingyuxu,
There is currently no bulk select-and-add for datasets in the UI when creating a bundle, so you have to add them one by one.
If you have a lot of datasets, you can use the Python API to create a bundle that includes all of them. You can find an example here:
A slightly modified version that adds all uploaded datasets would be:
import dataiku

client = dataiku.api_client()
proj_key = "<replace-project-key>"

# Requires an existing project variable named "version", set to a number,
# which serves as the counter for the bundles.
proj = client.get_project(proj_key)
proj_vars = proj.get_variables()
current_vers = proj_vars['standard']['version']

# Reset the bundle exporter settings and save them,
# so there is no need to create a dummy bundle first.
settings = proj.get_settings()
raw = settings.get_raw()
raw['bundleExporterSettings'] = {}
settings.save()

# Retrieve and update the export options
settings = proj.get_settings()
export_options = settings.get_raw()['bundleExporterSettings']['exportOptions']
export_options['exportUploads'] = True
export_options['exportSavedModels'] = True
export_options['exportManagedFolders'] = True

# Add every uploaded dataset; to include other datasets, modify the if statement
datasets = proj.list_datasets()
for dataset in datasets:
    if dataset['type'] == 'UploadedFiles':
        ds_name = dataset['name']
        export_options['includedDatasetsData'].append({"name": ds_name})

# Uncomment to add models
# export_options['includedSavedModels'].append({"id": "<model_ID>"})
settings.save()

# Setting the bundle ID
bundle_id = "version_" + str(current_vers)

# Creating the bundle
try:
    proj.export_bundle(bundle_id)
    print(f"The bundle with ID {bundle_id} was created successfully")
except Exception:
    print("There was an error creating the bundle. See the traceback:")
    raise

# Increasing the count of the version variable
proj_vars['standard']['version'] = current_vers + 1
proj.set_variables(proj_vars)
print("Increased the count for the version")
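If you want something other than "all uploaded datasets", the if statement in the loop above is the part to change. Here is a minimal sketch of that filtering step pulled out into a helper so you can try it without a DSS instance. The helper name `build_included_datasets`, its parameters, and the sample dataset names and types are all hypothetical and only for illustration. It assumes each item exposes 'name' and 'type' keys, as the items in the loop above do.

```python
def build_included_datasets(datasets, types=("UploadedFiles",), name_prefix=None):
    """Return bundle entries ({"name": ...}) for the datasets matching the filters.

    types=None keeps every dataset type; name_prefix=None keeps every name.
    """
    entries = []
    seen = set()
    for ds in datasets:
        name = ds["name"]
        if types is not None and ds["type"] not in types:
            continue
        if name_prefix is not None and not name.startswith(name_prefix):
            continue
        if name in seen:
            # Skip repeated names so a dataset is listed only once
            continue
        seen.add(name)
        entries.append({"name": name})
    return entries


# Stand-in for proj.list_datasets(); these names and types are made up.
sample = [
    {"name": "customers_upload", "type": "UploadedFiles"},
    {"name": "orders_sql", "type": "PostgreSQL"},
    {"name": "returns_upload", "type": "UploadedFiles"},
]

uploaded_only = build_included_datasets(sample)
all_datasets = build_included_datasets(sample, types=None)
print(uploaded_only)
print(all_datasets)
```

In the script above, this would replace the for loop with something like export_options['includedDatasetsData'] = build_included_datasets(proj.list_datasets(), types=None) to include every dataset in the project, followed by the same settings.save() call.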