
Automate bundle export import python script

Hi,

I would like to move a project from the design node to the automation node using a Python script. Does Dataiku have a template or an existing script for that? After I export a bundle, how can I manage mappings and conflicts during activation?

Thanks!
Dataiker
Hi tomas,

You can package a project into a "bundle" and deploy it from the Design Node to the Automation Node.

Some links:

https://www.dataiku.com/learn/guide/tutorials/deploy-production.html
https://doc.dataiku.com/dss/latest/bundles/index.html
Jeremy, Product Manager at Dataiku
Level 5
Author
Yes, I am already exporting the bundle with a Python script:
p = designClient.get_project("MYPROJECT")
p.export_bundle(bundleId)
Dataiker
Oh, I didn't understand that you wanted to automate this with a Python script. Sorry.
Jeremy, Product Manager at Dataiku
Dataiker

Hi,

Here is an example script for exporting a bundle from a design node and deploying it to an automation node:




import dataikuapi

#Define the connections

#Design
host = "http://localhost:12345" # example to be changed
apiKey = "Foo123Foo123F123Foo123" # example to be changed

#Automation
host_auto = "http://localhost:23456" # example to be changed
apiKey_auto = "Bar234Bar234Bar234Bar234" # example to be changed

design_client= dataikuapi.DSSClient(host, apiKey)
auto_client = dataikuapi.DSSClient(host_auto, apiKey_auto)

version_bundle = "bundle_v1"
#Export bundle
project = design_client.get_project("MYSUPERPROJECT")
project.export_bundle(version_bundle)
project.download_exported_bundle_archive_to_file(version_bundle, "temp_bundle.zip")

#Import bundle (the project must already exist on the Automation node)
project_automation = auto_client.get_project("MYSUPERPROJECT")
project_automation.import_bundle_from_archive("temp_bundle.zip")

# Preload and activate bundle
project_automation.preload_bundle(version_bundle) # for code envs
project_automation.activate_bundle(version_bundle)


 



Note that the script also works outside of DSS, which is why it needs the API keys. To generate them, go to Administration > Security > Global/Personal API keys on the Design and Automation nodes.
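
Because the script can run outside DSS (for example in a CI job), the hosts and API keys do not have to be hard-coded. A minimal sketch, assuming hypothetical environment variable names of the form <PREFIX>_HOST and <PREFIX>_API_KEY; these names are made up for this example and are not defined by Dataiku:

```python
import os


def read_node_config(prefix, environ=os.environ):
    """Return (host, api_key) for one node, read from environment variables.

    The variable names <prefix>_HOST and <prefix>_API_KEY are hypothetical,
    chosen for this example only.
    """
    try:
        host = environ[prefix + "_HOST"]
        api_key = environ[prefix + "_API_KEY"]
    except KeyError as exc:
        # exc.args[0] is the name of the variable that was not found
        raise RuntimeError("Missing environment variable: %s" % exc.args[0])
    # Drop a trailing slash so the host can be passed on unchanged
    return host.rstrip("/"), api_key
```

The returned pair could then be passed to dataikuapi.DSSClient(host, api_key), once with a prefix such as "DSS_DESIGN" and once with "DSS_AUTO".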



At the moment, there is no way to perform custom remapping of the connections through the Dataiku API. We assume that the connections are already set up on both nodes and share the same logical name (although their underlying values differ). That said, if you have already set up the custom remapping in the interface, promoting a new bundle through the API will keep it.
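
Since this approach relies on both nodes exposing the same connection names, it can help to compare the names before activating a bundle. A minimal sketch that works on plain lists of names; how you obtain them is up to you (one option, as an assumption, is the keys of the dictionary returned by DSSClient.list_connections(), which needs an admin API key):

```python
def find_missing_connections(design_names, automation_names):
    """Return the sorted names present on Design but absent on Automation."""
    return sorted(set(design_names) - set(automation_names))


# Example with made-up connection names
missing = find_missing_connections(
    ["filesystem_managed", "pg_main", "s3_raw"],
    ["filesystem_managed", "pg_main"],
)
print(missing)
```

An empty result means every listed connection has a counterpart with the same name on the automation node.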



Cheers,



Alex

Level 3
I assume that "temp_bundle.zip" is downloaded locally in this example, but is there a way to push the bundle from the design node onto the automation node directly?
Dataiker
Yes, you can run the template code directly on your design node. In that case, get the design client from the built-in package instead of using a host and API key:
import dataiku
design_client = dataiku.api_client()
For the download_exported_bundle_archive_to_file method, I advise using the Python tempfile library so that the archive is downloaded to a temporary location. Alternatively, if you want to keep the archives, you could use a dedicated directory such as /home/dataiku/project_bundles.
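
Putting the tempfile suggestion together with the earlier script, a rough sketch could look like the following. It assumes that your dataikuapi version provides the stream-based calls import_bundle_from_stream and create_project_from_bundle_archive, which upload the archive content instead of passing a file path; check this against your version. The function name deploy_bundle is made up for this example.

```python
import os
import tempfile


def deploy_bundle(design_client, auto_client, project_key, bundle_id):
    """Export a bundle on Design, then import and activate it on Automation.

    Both clients are expected to behave like dataikuapi.DSSClient objects.
    """
    design_project = design_client.get_project(project_key)
    design_project.export_bundle(bundle_id)

    # The archive is written to a temporary directory that is deleted
    # as soon as the "with" block ends.
    with tempfile.TemporaryDirectory() as tmp_dir:
        archive_path = os.path.join(tmp_dir, bundle_id + ".zip")
        design_project.download_exported_bundle_archive_to_file(bundle_id, archive_path)

        with open(archive_path, "rb") as archive:
            if project_key in auto_client.list_project_keys():
                auto_client.get_project(project_key).import_bundle_from_stream(archive)
            else:
                # First deployment: create the project from the bundle
                auto_client.create_project_from_bundle_archive(archive)

    auto_project = auto_client.get_project(project_key)
    auto_project.preload_bundle(bundle_id)  # for code envs
    auto_project.activate_bundle(bundle_id)
    return auto_project
```

When running on the design node, design_client can be dataiku.api_client(), while auto_client is still built from the automation node's host and API key.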
Level 3
Alright, thanks Alex!
Level 3

Hello, I stumbled upon this thread while trying to import a bundle from Design to Automation using the Python SDK. Is it still the case that there is no way to do custom remapping of connections using the API?

Level 2

Hi there, 

 

There is an automation macro in Dataiku that creates the bundle and moves it to the target node for you.

You can take a look at it; it might help.

 

Level 3

Hey Sudheer,

Do you mean this plugin: https://www.dataiku.com/product/plugins/project-deployment/ ? I have already been using it, but it doesn't offer a way to do connection remapping. You first need to do a manual import into the automation node to define the connection remapping, and only then can you use this plugin to automate the deployment. I was looking for a way to circumvent the manual upload step, because I'd like to include this export in a CI pipeline.

Best

Marawan

Dataiker

Hey @marawan,

The APIs should be able to take care of connection remapping on the Automation Node. It will be an additional step to the bundle export or import. Here’s a script that builds on @Alex_Combessie's example from a while ago:

 

import json

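# Print current automation project settings for remapping if they exist
# (auto_client is the automation client from the earlier script;
#  project_key is the project key, e.g. "MYSUPERPROJECT")
project_automation = auto_client.get_project(project_key)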
automation_settings = project_automation.get_settings()
automation_settings_dict = automation_settings.get_raw()

print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))

# Define remappings

remapping= {
    "codeEnvs": [],
    "connections": [
        {
            "source": "foo",
            "target": "bar"
        }
    ]
}

automation_settings_dict["bundleContainerSettings"]["remapping"] = remapping
automation_settings.save()

# Print settings with remappings
print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))

 

The results might look like this:

[Screenshot: remapping_connections.png]

As you said, the macro (h/t @sudheer_kumar) won’t take care of the connection remapping. Using this script for remapping, you could circumvent the manual import. It is analogous to remapping the connections via the UI. You can then possibly use the macro, as long as the remapping doesn’t change. Once it is set, the next imported bundles re-use it.
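
Note that the script above replaces the whole remapping block. If the automation project already has remappings you want to keep, a small helper can update the raw settings in place instead. This is only a sketch: the function name add_connection_remapping is made up for this example, and it assumes the raw settings follow the same structure as printed above.

```python
def add_connection_remapping(raw_settings, source, target):
    """Add or update one connection remapping in the raw project settings.

    The dictionary is modified in place; other remappings are preserved and
    an existing entry for `source` is updated rather than duplicated.
    """
    container = raw_settings.setdefault("bundleContainerSettings", {})
    remapping = container.setdefault("remapping", {})
    remapping.setdefault("codeEnvs", [])
    connections = remapping.setdefault("connections", [])
    for entry in connections:
        if entry.get("source") == source:
            entry["target"] = target
            break
    else:
        # No entry for this source yet
        connections.append({"source": source, "target": target})
    return remapping
```

You would call it as add_connection_remapping(automation_settings.get_raw(), "foo", "bar") and then call automation_settings.save(), as in the script above.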

Let us know how it works for you.

Yashas

Level 3

Hey @YashasV 

Perfect, thank you very much. We had actually already stumbled upon this solution a while ago while testing an unrelated matter, and we created a plugin based on the original Automation plugin to provide a visual way to do connection remapping. I just forgot to update this thread.

One thing I couldn't find however was a way to do this connection remapping when moving the bundle from one design node to another, because design nodes don't have this `bundleContainerSettings` option in their settings. Do you know of a different way to do connection remapping for design node targets? When you import a bundle (export) visually in the design node, you are prompted to do connection remapping for connections that do not exist. I however couldn't find an analog for this process in the API.

 Thank you

Marawan
