Snowflake dataset override default connection details

FabriceD

Hi

I was wondering what the exact names are of the parameters that need to be added to 'specificSettings' so that a new managed dataset on Snowflake is materialized in a catalog and schema different from the connection's defaults.

I added a screenshot of the UI (I am looking for the database and schema fields).
I want to use those in the Python SDK (under the hood, the following API is used: https://doc.dataiku.com/dss/api/12/rest/#datasets-datasets-post-1),
but I cannot seem to find the correct parameters...
I tried all different combinations of 'catalog', 'schema', 'database', 'snowflake_catalog',...

PS: If there is a better pattern for creating managed datasets through the SDK, please let me know.

 

 


6 Replies
Turribeach

Please post your code using a code block (the </> icon) so it can be copy/pasted.

The code looks good to me. What exactly do you see on the created dataset? Is this an unmanaged (input) or managed (output) dataset that you are trying to create?

 

Finally, the link you posted is for the REST API, not the Python API, which you seem to be using.
FabriceD
Level 2
Author

Hi


Here is the minimal example code to do what I want.

from dataikuapi.dssclient import DSSClient

host = ""
api_key = ""
project_key = ""
connection_name = ""
new_dataset_name = ""
new_dataset_snowflake_schema = ""
new_dataset_snowflake_catalog = ""

# create the client, get the project and start building the new dataset
client = DSSClient(host, api_key, insecure_tls=True)
project = client.get_project(project_key)
builder = project.new_managed_dataset(new_dataset_name)
# add connection details and try to override the connection defaults
builder.with_store_into(connection_name)
builder.creation_settings['specificSettings']['catalog'] = new_dataset_snowflake_catalog
builder.creation_settings['specificSettings']['schema'] = new_dataset_snowflake_schema
# create the dataset (dropping any existing one) and inspect its config
dataset = builder.create(overwrite=True)
dataset.get_config()
# --> params contain catalog and schema equal to the defaults of the connection

 

Indeed, the URL was for the REST API, but the Python package calls that API under the hood.
The 'create' function has the following implementation (snippet from the dataikuapi package), which is why I referenced the REST API documentation. There is no documentation available on which specificSettings can be used.

    def create(self, overwrite=False):
        """
        Executes the creation of the managed dataset according to the selected options
        
        :param overwrite: If the dataset being created already exists, delete it first (removing data), defaults to False
        :type overwrite: bool, optional

        :returns: the newly created dataset
        :rtype: :class:`DSSDataset`
        """
        if overwrite and self.already_exists():
            self.project.get_dataset(self.dataset_name).delete(drop_data = True)

        self.project.client._perform_json("POST", "/projects/%s/datasets/managed" % self.project.project_key,
            body = {
                "name": self.dataset_name,
                "creationSettings":  self.creation_settings
        })
        return DSSDataset(self.project.client, self.project.project_key, self.dataset_name)
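
Since create() forwards creation_settings unchanged, one way to see exactly which keys reach the endpoint is to rebuild the request body locally before creating anything. The following is only a sketch mirroring the snippet above: build_managed_dataset_body is a hypothetical helper (not part of dataikuapi), the values are placeholders, and the "connectionId" key is an assumption about what with_store_into writes.

```python
import json


def build_managed_dataset_body(dataset_name, creation_settings):
    """Mirror the body that create() posts to /projects/<key>/datasets/managed."""
    return {
        "name": dataset_name,
        "creationSettings": creation_settings,
    }


# Placeholder values for illustration only.
creation_settings = {
    "connectionId": "my_snowflake_connection",  # assumed key, see note above
    "specificSettings": {"catalog": "MY_DB", "schema": "MY_SCHEMA"},
}

body = build_managed_dataset_body("my_dataset", creation_settings)
print(json.dumps(body, indent=2))
```

With a real builder, printing builder.creation_settings just before calling create() should give the same view of what is about to be sent.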

 

Turribeach

Can you post a screen shot of the settings of your Snowflake connection? (you can hide the sensitive parts).

FabriceD
Level 2
Author

Unfortunately I cannot, as I am not an admin on the portal.
I could ask my admin, but it might take a while.

Connecting works, and writing through Python and the UI works as well, but overriding the default schema only works in the UI...
 

FabriceD
Level 2
Author

I was in contact with the Dataiku support team and got the answer needed:

The exact names of the 'specificSettings' parameters are 'overrideSQLCatalog' and 'overrideSQLSchema':


builder.creation_settings['specificSettings']['overrideSQLCatalog'] = new_dataset_snowflake_catalog
builder.creation_settings['specificSettings']['overrideSQLSchema'] = new_dataset_snowflake_schema
Turribeach

Many thanks for posting it back and sharing with the community.
