Is It Possible to Add Multiple/All Datasets as Additional Content to a Bundle at Once

kathyqingyuxu
kathyqingyuxu Neuron, Registered, Neuron 2022 Posts: 46 Neuron

Hello everyone!

I wanted to know if it's possible to add multiple (or all) datasets as additional content to a project bundle at once, e.g. by highlighting and selecting them together, rather than manually selecting the datasets one at a time. For example, I might have a project with multiple uploaded datasets that I want to add to my project bundle in one go. Right now we can only add the datasets by clicking each one individually in the drop-down. I'm not sure if it's possible to select all the appropriate tables at once instead. Thanks!

Best,

Kathy

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
    edited July 17 Answer ✓

    Hi @kathyqingyuxu,

    There is currently no bulk select-and-add for datasets in the UI when creating a bundle, so you have to add them one by one.

    If you have a lot of datasets, you can use the Python API to create a bundle that includes all of them. You can find an example here:

    https://community.dataiku.com/t5/Using-Dataiku/Create-Bundle-which-includes-contents-with-Python-API/m-p/20798

    A slightly modified version that adds all uploaded datasets would be:


    import dataiku
    
    client = dataiku.api_client()
    proj_key = "<replace-project-key>"
    
    # Requires an existing project variable named "version" that holds a number; it is used as the bundle counter
    proj = client.get_project(proj_key)
    proj_vars = proj.get_variables()
    current_vers = proj_vars['standard']['version']
    
    # Reset the bundle exporter settings and save them;
    # with this step, there is no need to create a dummy bundle first.
    settings = proj.get_settings()
    raw = settings.get_raw()
    raw['bundleExporterSettings'] = {}
    settings.save()
    
    # Retrieve and update export options
    settings = proj.get_settings()
    export_options = settings.get_raw()['bundleExporterSettings']['exportOptions']
    export_options['exportUploads'] = True
    export_options['exportSavedModels'] = True
    export_options['exportManagedFolders'] = True
    
    # Add every uploaded dataset to the bundle; to include other dataset types, modify the if statement
    
    datasets = proj.list_datasets()
    
    for dataset in datasets:
        if dataset['type'] == 'UploadedFiles':
            ds_name = dataset['name']
            export_options['includedDatasetsData'].append({"name": ds_name})
    
    # Uncomment to add models
    #export_options['includedSavedModels'].append({"id": "<model_ID>"})
    settings.save()
    
    # Setting bundleID
    bundle_id = "version_" + str(current_vers)
    
    # Creating bundle
    try:
        proj.export_bundle(bundle_id)
        print(f"The bundle with ID {bundle_id} was created successfully")
    except Exception:
        print("There was an error creating the bundle. See the traceback:")
        raise
    
    # Increasing the count of the version variable:
    proj_vars['standard']['version'] = current_vers + 1
    proj.set_variables(proj_vars)
    print("Increased the count for the version")
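
If you need to include other dataset types later, one option is to move the filtering into a small helper instead of editing the if statement each time. The sketch below is only an illustration: `select_datasets_for_bundle` and its `included_types` parameter are hypothetical names (not part of the Dataiku API), and the sample list is stand-in data that assumes each item exposes `name` and `type` keys the way the script above reads them.

```python
def select_datasets_for_bundle(datasets, included_types=("UploadedFiles",)):
    """Return bundle entries ({"name": ...}) for datasets whose type is in included_types.

    Pass included_types=None to select every dataset regardless of type.
    """
    entries = []
    seen = set()
    for dataset in datasets:
        name = dataset["name"]
        if included_types is not None and dataset["type"] not in included_types:
            continue
        if name in seen:  # skip duplicate names so each dataset is listed once
            continue
        seen.add(name)
        entries.append({"name": name})
    return entries


# Stand-in data (assumption): shaped like the items the script reads from proj.list_datasets()
sample = [
    {"name": "customers", "type": "UploadedFiles"},
    {"name": "orders_sql", "type": "PostgreSQL"},
    {"name": "returns", "type": "UploadedFiles"},
]

uploaded_only = select_datasets_for_bundle(sample)
all_datasets = select_datasets_for_bundle(sample, included_types=None)
print(uploaded_only)
print(all_datasets)
```

In the script above, the for loop could then be replaced with something like `export_options['includedDatasetsData'].extend(select_datasets_for_bundle(proj.list_datasets()))`, passing `included_types=None` to add every dataset in the project.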