Bulk replace BigQuery and GCS dataset connections at the same time in Dataiku

JCB

My organization is switching from one BigQuery project to another. I would like to keep the same flows and datasets but switch all of the current dataset connections (GCS bucket + BigQuery project) over to the new ones. I could do this by manually editing the connection settings of each dataset, but that would take a long time. Is there a way to bulk change the connections more efficiently?

Answers

  • Turribeach

    Hi, yes, this is totally doable using the Dataiku API. I am on my phone, but I asked ChatGPT and it wrote this:

    import dataikuapi

    # Connect to the Dataiku API client (editing connections needs an admin API key)
    host = 'https://YOUR_DATAIKU_HOST'
    api_key = 'YOUR_API_KEY'
    client = dataikuapi.DSSClient(host, api_key)

    # Specify the connection name and the new GCP project ID
    connection_id = 'your_connection_id'
    new_project_id = 'your_new_project_id'

    # Get a handle on the connection
    connection = client.get_connection(connection_id)

    # Update the project ID in the connection's raw settings.
    # Print settings.get_raw() first to confirm where the project ID is stored;
    # 'projectId' under 'params' is an assumed key for a BigQuery connection.
    settings = connection.get_settings()
    settings.get_raw()['params']['projectId'] = new_project_id

    # Save the updated connection
    settings.save()

    Please make sure to test this step by step. Also, why not use variables for your GCP Project IDs? That way, the next time you need to change them, you can do a simple text replacement in your global variables.
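
If you would rather create new connections and repoint the datasets at them, which is what the question literally asks for, the same API can loop over a project's datasets. The following is a minimal sketch, not a tested migration script. The helper names `remap_connection` and `remap_project` and the connection names in the usage comment are hypothetical, and it assumes each dataset stores its connection name under `params.connection` in its raw settings. It only edits settings and does not move any data; GCS datasets may also carry bucket or path settings that need a similar update.

```python
def remap_connection(raw, mapping):
    """Point a dataset's raw settings at a new connection, in place.

    `mapping` maps old connection names to new ones.
    Returns True if the settings were changed, False otherwise.
    """
    params = raw.get("params", {})
    old = params.get("connection")
    if old in mapping:
        params["connection"] = mapping[old]
        return True
    return False


def remap_project(project, mapping, dry_run=True):
    """Remap every dataset in a project.

    `project` is a project handle such as the one returned by
    dataikuapi's DSSClient.get_project(). With dry_run=True nothing is
    saved; the function only reports which datasets would change.
    Returns the names of the affected datasets.
    """
    changed = []
    for dataset in project.list_datasets(as_type="objects"):
        settings = dataset.get_settings()
        if remap_connection(settings.get_raw(), mapping):
            changed.append(dataset.dataset_name)
            if not dry_run:
                settings.save()
    return changed


# Example usage (placeholder names, run against a test project first):
#   import dataikuapi
#   client = dataikuapi.DSSClient('https://YOUR_DATAIKU_HOST', 'YOUR_API_KEY')
#   project = client.get_project('YOUR_PROJECT_KEY')
#   mapping = {'old_bq_connection': 'new_bq_connection',
#              'old_gcs_connection': 'new_gcs_connection'}
#   print(remap_project(project, mapping, dry_run=True))
```

Running with `dry_run=True` first lists the datasets that would be touched, so you can check the result before saving anything.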
