Automate bundle export and import with a Python script

tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭
Hi,

I would like to move a project from the design node to the automation node using a Python script. Does Dataiku have a template or an existing script for that? After I export a bundle, how can I manage mappings and conflicts during activation?

Thanks!

Answers

  • jereze
    jereze Alpha Tester, Dataiker Alumni Posts: 190 ✭✭✭✭✭✭✭✭
    Hi tomas,

    You can package a project into a "bundle" and deploy it from the Design Node to the Automation Node.

    Some links:

    https://www.dataiku.com/learn/guide/tutorials/deploy-production.html
    https://doc.dataiku.com/dss/latest/bundles/index.html
  • tomas
    tomas Registered, Neuron 2022 Posts: 120 ✭✭✭✭✭
    Yes, I am already exporting the bundle with a Python script:
    p = designClient.get_project("MYPROJECT")
    p.export_bundle(bundleId)
  • jereze
    jereze Alpha Tester, Dataiker Alumni Posts: 190 ✭✭✭✭✭✭✭✭
    Oh, I hadn't understood that you wanted to automate this with a Python script. Sorry.
  • Alex_Combessie
    Alex_Combessie Alpha Tester, Dataiker Alumni Posts: 539 ✭✭✭✭✭✭✭✭✭
    edited July 17

    Hi,

    Please find attached an example script for exporting a bundle from a design node and deploying it to automation.


    import dataikuapi

    # Define the connections

    # Design node
    host = "http://localhost:12345"  # example to be changed
    apiKey = "Foo123Foo123F123Foo123"  # example to be changed

    # Automation node
    host_auto = "http://localhost:23456"  # example to be changed
    apiKey_auto = "Bar234Bar234Bar234Bar234"  # example to be changed

    design_client = dataikuapi.DSSClient(host, apiKey)
    auto_client = dataikuapi.DSSClient(host_auto, apiKey_auto)

    version_bundle = "bundle_v1"

    # Export the bundle from the design node and download it to a local file
    project = design_client.get_project("MYSUPERPROJECT")
    project.export_bundle(version_bundle)
    project.download_exported_bundle_archive_to_file(version_bundle, "temp_bundle.zip")

    # Import the bundle into the automation node
    project_automation = auto_client.get_project("MYSUPERPROJECT")
    project_automation.import_bundle_from_archive("temp_bundle.zip")

    # Preload and activate the bundle
    project_automation.preload_bundle(version_bundle)  # for code envs
    project_automation.activate_bundle(version_bundle)

    Note that it also works outside of DSS, which is why you need the API keys. To generate them, go to Administration > Security > Global/Personal API keys on the Design and Automation nodes.
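    If the script runs outside of DSS (for example in a CI job), one option is to read the hosts and API keys from environment variables instead of hardcoding them. This is only a minimal sketch: the helper name load_node_credentials and the variable names (DSS_DESIGN_HOST, DSS_DESIGN_API_KEY, and so on) are assumptions made for illustration, not anything defined by Dataiku.

```python
import os


def load_node_credentials(prefix, environ=None):
    """Return (host, api_key) read from <prefix>_HOST and <prefix>_API_KEY.

    The variable naming scheme is hypothetical; adapt it to your own setup.
    """
    environ = os.environ if environ is None else environ
    host_var = prefix + "_HOST"
    key_var = prefix + "_API_KEY"
    # Fail early with a clear message if a variable is missing or empty
    missing = [name for name in (host_var, key_var) if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return environ[host_var], environ[key_var]
```

    The result can then be passed to the client constructor used in the script above, e.g. dataikuapi.DSSClient(*load_node_credentials("DSS_DESIGN")).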

    At the moment, there is no way to perform custom remapping of the connections through the Dataiku API. We assume that the connections are already set up and share the same logical name on both nodes (although their underlying values differ). That said, if you have already set up the custom remapping in the interface, promoting a new bundle through the API will keep that remapping.

    Cheers,

    Alex

  • rmnvncnt
    rmnvncnt Registered Posts: 41 ✭✭✭✭✭
    I assume that "temp_bundle.zip" is downloaded locally in this example, but is there a way to push the bundle from the design node onto the automation node directly?
  • Alex_Combessie
    Alex_Combessie Alpha Tester, Dataiker Alumni Posts: 539 ✭✭✭✭✭✭✭✭✭
    Yes, you can run the template code locally on your design node and obtain the design client with:
    import dataiku
    design_client = dataiku.api_client()
    For the download_exported_bundle_archive_to_file method, I advise using the Python tempfile library so that the archive is downloaded to a temporary location. Alternatively, for archiving purposes, you could use a dedicated directory such as /home/dataiku/project_bundles.
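    As a minimal sketch of the tempfile suggestion, the export and deployment steps could be wrapped in one function. It only uses the project methods that already appear in the script earlier in this thread; the function name promote_bundle and its parameters are hypothetical. It takes the two project handles (e.g. from design_client.get_project(...) and auto_client.get_project(...)) rather than creating clients itself, and it assumes the automation node can read the temporary path, as in the localhost example.

```python
import os
import tempfile


def promote_bundle(design_project, automation_project, bundle_id):
    """Export bundle_id from the design project and activate it on automation."""
    design_project.export_bundle(bundle_id)

    # Download into a temporary directory that is deleted once the import is done
    with tempfile.TemporaryDirectory() as tmp_dir:
        archive_path = os.path.join(tmp_dir, bundle_id + ".zip")
        design_project.download_exported_bundle_archive_to_file(bundle_id, archive_path)
        automation_project.import_bundle_from_archive(archive_path)

    automation_project.preload_bundle(bundle_id)  # for code envs
    automation_project.activate_bundle(bundle_id)
```

    For archiving, replace the temporary directory with a fixed one such as the project_bundles directory mentioned above.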
  • rmnvncnt
    rmnvncnt Registered Posts: 41 ✭✭✭✭✭
    Alright, thanks Alex!
  • marawan
    marawan Partner, Registered Posts: 19 Partner

    Hello, I stumbled upon this thread while trying to import a bundle from Design to Automation using the Python SDK. Is it still the case that there is no way to do custom remapping of connections using the API?

  • sudheer_kumar
    sudheer_kumar L2 Admin, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Registered Posts: 11 ✭✭✭✭

    Hi there,

    There is an automation macro in Dataiku that creates the bundle and moves it to the target node for you.

    You can check it out; it might help you.

  • marawan
    marawan Partner, Registered Posts: 19 Partner

    Hey Sudheer,

    Do you mean this plugin: https://www.dataiku.com/product/plugins/project-deployment/ ? I have already been using it, but it doesn't offer a way to do connection remapping. You first need to do a manual import into the automation node to define the connection remapping, and only then can you use the plugin to automate subsequent deployments. I was looking for a way to circumvent the manual upload step, because I'd like to include this export in a CI pipeline.

    Best

    Marawan

  • YashasV
    YashasV Dataiker, Dataiku DSS Core Designer Posts: 9 Dataiker
    edited July 17

    Hey @marawan,

    The APIs should be able to take care of connection remapping on the Automation Node. It will be an additional step to the bundle export or import. Here’s a script that builds on @Alex_Combessie's example from a while ago:

    import json
    
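    # Print current automation project settings for remapping if they exist
    # (auto_client is the automation node client from the earlier script; project_key is your project key)
    project_automation = auto_client.get_project(project_key)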
    automation_settings = project_automation.get_settings()
    automation_settings_dict = automation_settings.get_raw()
    
    print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))
    
    # Define remappings (this replaces any existing remapping)

    remapping = {
        "codeEnvs": [],
        "connections": [
            {
                "source": "foo",
                "target": "bar"
            }
        ]
    }
    
    automation_settings_dict["bundleContainerSettings"]["remapping"] = remapping
    automation_settings.save()
    
    # Print settings with remappings
    print(json.dumps(automation_settings_dict["bundleContainerSettings"]["remapping"], sort_keys=True, indent=2))
    

    The results might look like this:

    remapping_connections.png

    As you said, the macro (h/t @sudheer_kumar) won’t take care of the connection remapping. Using this script for remapping, you could circumvent the manual import. It is analogous to remapping the connections via the UI. You can then possibly use the macro, as long as the remapping doesn’t change. Once it is set, the next imported bundles re-use it.
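    Since the script above overwrites the whole remapping, here is a small sketch of a helper that adds or updates a single connection entry and leaves the others untouched. It works on the plain dict returned by get_raw() and relies only on the "bundleContainerSettings" / "remapping" / "connections" layout shown in the script; the function name set_connection_remapping is hypothetical.

```python
def set_connection_remapping(raw_settings, source, target):
    """Add or update one source -> target connection remapping in place."""
    container = raw_settings.setdefault("bundleContainerSettings", {})
    remapping = container.setdefault("remapping", {})
    remapping.setdefault("codeEnvs", [])
    connections = remapping.setdefault("connections", [])

    # Update the entry if this source is already remapped
    for entry in connections:
        if entry.get("source") == source:
            entry["target"] = target
            return raw_settings

    # Otherwise append a new entry
    connections.append({"source": source, "target": target})
    return raw_settings
```

    You would call it as set_connection_remapping(automation_settings.get_raw(), "foo", "bar") and then automation_settings.save(), as in the script above.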

    Let us know how it works for you.

    Yashas

  • marawan
    marawan Partner, Registered Posts: 19 Partner

    Hey @YashasV

    Perfect, thank you very much. We had actually already stumbled upon this solution a while ago while testing an unrelated matter, and we created a plugin based on the original Automation plugin to provide a visual way to do connection remapping. I just forgot to update this thread.

    One thing I couldn't find, however, was a way to do this connection remapping when moving a project from one design node to another, because design nodes don't have this `bundleContainerSettings` option in their settings. Do you know of a different way to do connection remapping for design node targets? When you import a project export visually in the design node, you are prompted to remap connections that do not exist. However, I couldn't find an analog for this process in the API.

    Thank you

    Marawan
