Changing connections for SQL Query datasets

importthepandas
importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

We are migrating our data warehouse currently and are furiously changing connections, branching flows and validating back to old sources, etc. One thing we've encountered, specifically with Teradata connections, is that the "change connection" capability doesn't work on SQL query datasets. We're trying to follow changing connections guidelines as mentioned here: https://knowledge.dataiku.com/latest/kb/setup-admin/remapping/_shared/connection-remapping.html

General error is as follows:

ERR_DATASET_ACTION_NOT_SUPPORTED

Action not supported

Cannot change the connection for any of the specified items

Managed datasets work like a charm, though! Am I missing something or is this not supported for SQL query datasets/Teradata specific SQL query datasets?


Operating system used: Ubuntu 18

Tagged:

Answers

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    Just to note: I've used both the GUI and Python API to attempt to change connections. SQL query dataset is all ANSI / standard SQL so attempting to max laziness here!

  • gnaldi62
    gnaldi62 Partner, L2 Designer, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Frontrunner 2022 Participant, Neuron 2023 Posts: 79 Neuron

    Hi,

    that's strange. We're changing SQL connection details almost every day and don't find suich problems. Just to be sure, I attached here the steps we're following (with images):

    S1) we open the flow and select "connections";

    S2) we then select the specific connection we'd like to remap;

    S3/S4) then we select "Change connection" from the right menu;

    S5/S6) we select the new connection (it should be created, obviously); we also check the "Reuse connection settings...".

    S_1.jpgS_2.jpgS_3.jpgS_4.jpgS_5.jpgS_6.jpg

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 320 Neuron
    edited July 17

    Hi @importthepandas
    ,

    We are also in the midst of migrating our data warehouse (Netezza to Snowflake). Sounds like you have already tried changing connections using the API. Nonetheless I thought I would include the code we are using to change connections in case anything there is helpful.

    I also compared the settings for a SQL query dataset and a regular managed dataset and it looks like the logic to change connections for regular datasets would work for SQL query datasets. This was for Netezza; maybe there is something unique to Teradata SQL query datasets.

    In the included Python script, the function change_dataset_connections is passed a handle to a project (client.get_project('PROJKEY'). We also use Exec SQL steps in scenarios often and have to change connections there as well. Would like change connections for SQL Notebooks but no way to do that via the API.

    Marlan

    def swap_connection(connection):
        if connection == 'NZ_CONN1':
            return 'SF_CONN2'
        elif connection == 'NZ_CONNA':
            return 'SF_CONNB'
        else:
            return'SF' + connection[2:999]
    
    def change_dataset_connections(project_obj, save=True):
        
        for ds in project_obj.list_datasets(as_type='objects'):
            s = ds.get_settings()
            raw = s.get_raw()
    
    
            if raw['type'] == 'Netezza' and raw['params']['connection'][0:2] == 'NZ':
    
                old_connection = raw['params']['connection']
                
                raw['type'] = 'Snowflake'
                raw['params']['connection'] = swap_connection(raw['params']['connection'])
                if not raw['managed']:
                    raw['params']['schema'] = 'ADMIN' # snowflake requires a fully specified table spec
    
                if save:
                    s.save()
    
                print('{}: changed connection from {} to {}'.format(ds.dataset_name, old_connection, raw['params']['connection']))
    
            else:
                pass
                #print('{}: skipped - does not use a Netezza NZ prefixed connection'.format(ds.dataset_name))
            
        return

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    thank you both - will give it one more shot

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    just following up - we got it done via the python API. thanks again!

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Thank you for sharing your solutuion @importthepandas
    !

Setup Info
    Tags
      Help me…