Automate bundle export import python script
I would like to move a project from the Design node to the Automation node using a Python script. Does Dataiku have a template or an existing script for that? After I export a bundle, how can I manage mappings and conflicts during activation?
Thanks!
Answers
-
Hi tomas,
You can package a project into a "bundle" and deploy it from the Design Node to the Automation Node.
Some links:
https://www.dataiku.com/learn/guide/tutorials/deploy-production.html
https://doc.dataiku.com/dss/latest/bundles/index.html
-
Yes, I am already using a Python script for the bundle export:
p = designClient.get_project("MYPROJECT")
p.export_bundle(bundleId)
-
Oh, I didn't understand you want to automate that with a Python script. Sorry.
-
Hi,
Please find attached an example script for exporting a bundle from a design node and deploying it to automation.
import dataikuapi
# Define the connections
# Design node
host = "http://localhost:12345"  # example, to be changed
apiKey = "Foo123Foo123F123Foo123"  # example, to be changed
# Automation node
host_auto = "http://localhost:23456"  # example, to be changed
apiKey_auto = "Bar234Bar234Bar234Bar234"  # example, to be changed
design_client = dataikuapi.DSSClient(host, apiKey)
auto_client = dataikuapi.DSSClient(host_auto, apiKey_auto)
version_bundle = "bundle_v1"
# Export bundle
project = design_client.get_project("MYSUPERPROJECT")
project.export_bundle(version_bundle)
project.download_exported_bundle_archive_to_file(version_bundle, "temp_bundle.zip")
# Import bundle
project_automation = auto_client.get_project("MYSUPERPROJECT")
project_automation.import_bundle_from_archive("temp_bundle.zip")
# Preload and activate bundle
project_automation.preload_bundle(version_bundle)  # for code envs
project_automation.activate_bundle(version_bundle)
Note that this also works outside of DSS, which is why you need the API keys. To generate them, go to Administration > Security > Global/Personal API keys on the Design and Automation nodes.
At the moment, there is no way to perform custom remapping of connections through the Dataiku API. We assume that the connections are already set up and share the same logical name on both nodes (although their underlying values differ). That said, if you have already set up the custom remapping in the interface, promoting a new bundle through the API will keep it.
Cheers,
Alex
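The script above assumes that MYSUPERPROJECT already exists on the Automation node. Below is a minimal sketch of how a first deployment might be handled. It assumes your dataikuapi version exposes list_project_keys and create_project_from_bundle_local_archive on the client (please verify both against your installed version). The helper name import_and_activate is hypothetical.

```python
def import_and_activate(auto_client, project_key, bundle_id, archive_path):
    """Import a bundle archive on the Automation node, then activate it.

    Assumption: archive_path is readable by the Automation node itself,
    as in the script above where everything runs on the same host.
    """
    if project_key in auto_client.list_project_keys():
        # The project was deployed before: add the new bundle to it.
        auto_client.get_project(project_key).import_bundle_from_archive(archive_path)
    else:
        # First deployment: the project is created from the bundle.
        auto_client.create_project_from_bundle_local_archive(archive_path)

    project = auto_client.get_project(project_key)
    project.preload_bundle(bundle_id)   # prepares code envs
    project.activate_bundle(bundle_id)
    return project
```

With the clients from the script above, this could be called as import_and_activate(auto_client, "MYSUPERPROJECT", version_bundle, "temp_bundle.zip").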
-
I assume that "temp_bundle.zip" is downloaded locally in this example, but is there a way to push the bundle from the design node onto the automation node directly?
-
Yes, you can use the template code locally on your design node:
import dataiku
design_client = dataiku.api_client()
For the download_exported_bundle_archive_to_file method, I advise using the Python tempfile library so that the archive is downloaded to a temporary location. Alternatively, for archiving purposes, you could use a dedicated directory such as /home/dataiku/project_bundles.
-
Alright, thanks Alex!
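As a sketch of the tempfile suggestion above: the archive is downloaded into a temporary directory and then uploaded to the Automation node as a stream, so no permanent copy is kept. This assumes that import_bundle_from_stream is available on the project handle in your dataikuapi version. The function name transfer_bundle is hypothetical.

```python
import os
import tempfile


def transfer_bundle(design_project, auto_project, bundle_id):
    """Copy an already exported bundle from a Design project to an Automation project."""
    # The directory and the archive inside it are deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp_dir:
        archive_path = os.path.join(tmp_dir, bundle_id + ".zip")
        design_project.download_exported_bundle_archive_to_file(bundle_id, archive_path)
        # Upload the archive content instead of passing a path, so the
        # Automation node does not need access to this machine's disk.
        with open(archive_path, "rb") as archive:
            auto_project.import_bundle_from_stream(archive)
```

The two arguments would be the handles returned by get_project on the Design and Automation clients. The bundle must have been created with export_bundle beforehand.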
-
Hello, I stumbled upon this thread while trying to import a bundle from Design to Automation using the Python SDK. Is it still the case that there is no way to do custom remapping of connections using the API?
-
sudheer_kumar
Hi there,
There is an automation macro in Dataiku that creates the bundle and moves it to the target node for you.
You can check on that, it might help you.
-
Hey Sudheer,
Do you mean this plugin: https://www.dataiku.com/product/plugins/project-deployment/ ? I have already been using it, but it doesn't offer a way to do connection remapping. You'd first need to do a manual import into the automation node to define the connection remapping, and then you can use this plugin to automate the rest. I was looking for a way to circumvent the manual upload step, because I'd like to include this export in a CI pipeline.
Best
Marawan
-
Hey @marawan,
The APIs should be able to take care of connection remapping on the Automation Node. It will be an additional step to the bundle export or import. Here’s a script that builds on @Alex_Combessie's example from a while ago:
import json
# Print current automation project settings for remapping if they exist
project_automation = auto_client.get_project(project_key)
automation_settings = project_automation.get_settings()
automation_settings_dict = automation_settings.get_raw()
print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))
# Define remappings
remapping = {
    "codeEnvs": [],
    "connections": [
        {"source": "foo", "target": "bar"}
    ]
}
automation_settings_dict["bundleContainerSettings"]["remapping"] = remapping
automation_settings.save()
# Print settings with remappings
print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))
As you said, the macro (h/t @sudheer_kumar) won’t take care of the connection remapping. Using this script for remapping, you could circumvent the manual import. It is analogous to remapping the connections via the UI. You can then possibly use the macro, as long as the remapping doesn’t change. Once it is set, the next imported bundles re-use it.
Let us know how it works for you.
Yashas
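The remapping script above replaces the whole remapping entry each time it runs. As a hedged variation, the sketch below merges new connection mappings into whatever is already stored, so existing entries are kept. It works on the raw settings dictionary only and assumes the "bundleContainerSettings" / "remapping" layout shown in the script. The function name set_connection_remapping is hypothetical.

```python
def set_connection_remapping(settings_raw, connection_map):
    """Merge {source: target} connection mappings into raw project settings."""
    container = settings_raw.setdefault("bundleContainerSettings", {})
    remapping = container.setdefault("remapping", {})
    remapping.setdefault("codeEnvs", [])

    # Index existing entries by source so that re-running does not duplicate them.
    merged = {e["source"]: e["target"] for e in remapping.get("connections", [])}
    merged.update(connection_map)

    remapping["connections"] = [
        {"source": source, "target": target}
        for source, target in sorted(merged.items())
    ]
    return remapping


# Possible usage on the Automation node, following the script above:
# settings = auto_client.get_project(project_key).get_settings()
# set_connection_remapping(settings.get_raw(), {"foo": "bar"})
# settings.save()
```

Because entries are keyed by source connection, running the helper on every CI deployment should leave the settings unchanged unless a mapping actually differs.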
-
Hey @YashasV
Perfect, thank you very much. We had actually already stumbled upon this solution a while ago while testing an unrelated matter, and created a plugin based on the original Automation plugin to provide a visual way to do connection remapping. I just forgot to update this thread.
One thing I couldn't find, however, was a way to do this connection remapping when moving the bundle from one design node to another, because design nodes don't have this `bundleContainerSettings` option in their settings. Do you know of a different way to do connection remapping for design node targets? When you import a bundle (export) visually in the design node, you are prompted to remap connections that do not exist. However, I couldn't find an equivalent of this process in the API.
Thank you
Marawan