Multiple Snowflake Warehouse in Single Connection

Options
sj0071992
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron

Hi Team,

I am using Snowflake Data and it is using 2x cluster configuration as my data is huge but after some filter or grouping the data is not that much huge and it will run on smaller configuration as well, so is there any possibility in recipes that we can select the warehouse so that it helps me in cost reduction?

Thanks in Advance

Best Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Answer ✓
    Options

    Hi,

    Yes, your understanding is correct. If you want to run specific recipes to a different snowflake warehouse you can use the recipe variables to override the project or global variables for the warehouse value.

    Screenshot 2021-10-01 at 14.07.06.png

    However, updating variables within a recursive recipe run is not really advised and is likely not needed if you can define a single warehouse for each recipe that needs to run on a different warehouse that is different from the project/global default. If you have no other choice then updating project variables during a recursive recipe run see : https://community.dataiku.com/t5/General-Discussion/Asynchronous-project-variables-recursive-build/m-p/19811

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Answer ✓
    Options

    > No it will not impact other projects/recipes.

    Project variables will only be used within that single project. Even if the same connection is used the warehouse will be treated dynamically per project / per recipe.

    Hope that answers your question

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Options

    Hi,

    You can parameterize the Snowflake connection and use variables as the warehouse value.

    https://doc.dataiku.com/dss/latest/connecting/sql/snowflake.html#how-to-set-it-up

    If you change the project variables or even override at the recipe level. This should allow you to achieve what you are looking for that for some recipes to use a different warehouse.

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi,

    Just correct me if my understanding is not correct.

    1. I created Connection with Warehouse as Variable

    2. I run the Recipe with Large Warehouse.

    3. I Update the variable to small one and then i will run my other recipes with that configuration

    If my understanding is correct, then who can we automate this when we are running scenarios or running my complete Workflow at once?

    Thanks in Advance

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi,

    This makes sense. Thanks for the Valuable information

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi @AlexT
    ,

    How can we define variable to change the warehouse of connection in SQL recipe?

    Need your help here

    Thanks in Advance

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    edited July 17
    Options

    Hi,

    For SQL recipes you can simply use it directly in the SQL .

    e.g

    USE WAREHOUSE ${variable_name}

    This would behave in the same way and override any default value you have for warehouse in global variables.

    Screenshot 2022-02-05 at 14.45.50.png

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    HI @AlexT
    ,

    This is perfect.

    Just a follow-up question, for example, I changed my warehouse to a 4X cluster of snowflake and at the same time, my other project is running with the same Snowflake connection. Will the other project use the 4X cluster or the default one?

    Thanks in Advance

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi @AlexT
    ,

    Could you please help me understanding this?

    Thanks in Advance

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi @AlexT
    ,

    Could you please help me understanding this?

    Thanks in Advance

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    Options

    Sorry missed your last question. Projects will use by default what is defined in the general connection.

    Unless it is overwritten in the user profile or variables value defined as explained here:

    https://doc.dataiku.com/dss/9.0/connecting/sql/snowflake.html#switching-role-and-warehouse

    Perhaps you can share a screenshot of your Snowflake connection settings and the variables defined to get a better understanding.

    Also you can always check the job log where the actual warehouse used will be printed.

  • sj0071992
    sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
    Options

    Hi @AlexT
    ,

    I totally understand the logic of Dynamic Warehouse usage and ways to override it in recipes.

    The only thing I want clarity is,

    For example I am overriding my Warehouse variable in Python Recipe in one of my project to Large Warehouse but at the same time my other project is running using the same Snowflake Connection which doesn't require large Warehouse.

    Will overriding the Variable in a recipe in one of the project impact the other project if running parallelly?

    I hope you understand my concern

    Thanks in Advance

Setup Info
    Tags
      Help me…