Multiple Snowflake Warehouse in Single Connection

Solved!
sj0071992
Multiple Snowflake Warehouse in Single Connection

Hi Team,

 

I am using Snowflake Data and it is using 2x cluster configuration as my data is huge but after some filter or grouping the data is not that much huge and it will run on smaller configuration as well, so is there any possibility in recipes that we can select the warehouse so that it helps me in cost reduction?

 

Thanks in Advance

0 Kudos
2 Solutions
AlexT
Dataiker

Hi,

Yes, your understanding is correct. If you want to run specific recipes to a different snowflake warehouse you can use the recipe variables to override the project or global variables for the warehouse value. 

Screenshot 2021-10-01 at 14.07.06.png

However, updating variables within a recursive recipe run is not really advised and is likely not needed if you can define a single warehouse for each recipe that needs to run on a different warehouse that is different from the project/global default. If you have no other choice then updating project variables during a recursive recipe run see :  https://community.dataiku.com/t5/General-Discussion/Asynchronous-project-variables-recursive-build/m...

View solution in original post

AlexT
Dataiker

> No it will not impact other projects/recipes.

Project variables will only be used within that single project. Even if the same connection is used the warehouse will be treated dynamically per project / per recipe. 

Hope that answers your question

 

View solution in original post

0 Kudos
12 Replies
AlexT
Dataiker

Hi,

You can parameterize the Snowflake connection and use variables as the warehouse value.

https://doc.dataiku.com/dss/latest/connecting/sql/snowflake.html#how-to-set-it-up

If you change the project variables or even override at the recipe level. This should allow you to achieve what you are looking for that for some recipes to use a different warehouse. 

 

 

 

0 Kudos
sj0071992
Author

Hi,

Just correct me if my understanding is not correct.

1. I created Connection with Warehouse as Variable

2. I run the Recipe with Large Warehouse.

3. I Update the variable to small one and then i will run my other recipes with that configuration

If my understanding is correct, then who can we automate this when we are running scenarios or running my complete Workflow at once?

Thanks in Advance

0 Kudos
AlexT
Dataiker

Hi,

Yes, your understanding is correct. If you want to run specific recipes to a different snowflake warehouse you can use the recipe variables to override the project or global variables for the warehouse value. 

Screenshot 2021-10-01 at 14.07.06.png

However, updating variables within a recursive recipe run is not really advised and is likely not needed if you can define a single warehouse for each recipe that needs to run on a different warehouse that is different from the project/global default. If you have no other choice then updating project variables during a recursive recipe run see :  https://community.dataiku.com/t5/General-Discussion/Asynchronous-project-variables-recursive-build/m...

sj0071992
Author

Hi @AlexT ,

 

How can we define variable to change the warehouse of connection in SQL recipe?

Need your help here

 

Thanks in Advance

0 Kudos
AlexT
Dataiker

Hi, 

For SQL recipes you can simply use it directly in the SQL .

e.g 

USE WAREHOUSE ${variable_name}

 

This would behave in the same way and override any default value you have for warehouse in global variables.

 

Screenshot 2022-02-05 at 14.45.50.png 

0 Kudos
sj0071992
Author

HI @AlexT ,

 

This is perfect.

Just a follow-up question, for example, I changed my warehouse to a 4X cluster of snowflake and at the same time, my other project is running with the same Snowflake connection. Will the other project use the 4X cluster or the default one?

 

Thanks in Advance

0 Kudos
sj0071992
Author

Hi @AlexT ,

 

Could you please help me understanding this?

 

Thanks in Advance

0 Kudos
sj0071992
Author

Hi @AlexT ,

 

Could you please help me understanding this?

 

Thanks in Advance

0 Kudos
AlexT
Dataiker

Sorry missed your last question. Projects will use by default what is defined in the general connection. 

Unless it is overwritten in the user profile or variables value defined as explained here:

https://doc.dataiku.com/dss/9.0/connecting/sql/snowflake.html#switching-role-and-warehouse

Perhaps you can share a screenshot of your Snowflake connection settings and the variables defined to get a better understanding.

Also you can always check the job log where the actual warehouse used will be printed. 

0 Kudos
sj0071992
Author

Hi @AlexT ,

 

I totally understand the logic of Dynamic Warehouse usage and ways to override it in recipes.

The only thing I want clarity is,

For example I am overriding my Warehouse variable in Python Recipe in one of my project to Large Warehouse but at the same time my other project is running using the same Snowflake Connection which doesn't require large Warehouse.

Will overriding the Variable in a recipe in one of the project impact the other project if running parallelly?

 

I hope you understand my concern

 

Thanks in Advance

0 Kudos
AlexT
Dataiker

> No it will not impact other projects/recipes.

Project variables will only be used within that single project. Even if the same connection is used the warehouse will be treated dynamically per project / per recipe. 

Hope that answers your question

 

0 Kudos
sj0071992
Author

Hi, 

 

This makes sense. Thanks for the  Valuable information ๐Ÿ™‚

 

0 Kudos