Multiple Snowflake Warehouse in Single Connection
Hi Team,
I am using Snowflake Data and it is using 2x cluster configuration as my data is huge but after some filter or grouping the data is not that much huge and it will run on smaller configuration as well, so is there any possibility in recipes that we can select the warehouse so that it helps me in cost reduction?
Thanks in Advance
Best Answers
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
Yes, your understanding is correct. If you want to run specific recipes to a different snowflake warehouse you can use the recipe variables to override the project or global variables for the warehouse value.
However, updating variables within a recursive recipe run is not really advised and is likely not needed if you can define a single warehouse for each recipe that needs to run on a different warehouse that is different from the project/global default. If you have no other choice then updating project variables during a recursive recipe run see : https://community.dataiku.com/t5/General-Discussion/Asynchronous-project-variables-recursive-build/m-p/19811
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
> No it will not impact other projects/recipes.
Project variables will only be used within that single project. Even if the same connection is used the warehouse will be treated dynamically per project / per recipe.
Hope that answers your question
Answers
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
You can parameterize the Snowflake connection and use variables as the warehouse value.
https://doc.dataiku.com/dss/latest/connecting/sql/snowflake.html#how-to-set-it-up
If you change the project variables or even override at the recipe level. This should allow you to achieve what you are looking for that for some recipes to use a different warehouse.
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
Hi,
Just correct me if my understanding is not correct.
1. I created Connection with Warehouse as Variable
2. I run the Recipe with Large Warehouse.
3. I Update the variable to small one and then i will run my other recipes with that configuration
If my understanding is correct, then who can we automate this when we are running scenarios or running my complete Workflow at once?
Thanks in Advance
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
Hi,
This makes sense. Thanks for the Valuable information
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi,
For SQL recipes you can simply use it directly in the SQL .
e.g
USE WAREHOUSE ${variable_name}
This would behave in the same way and override any default value you have for warehouse in global variables.
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
HI @AlexT
,This is perfect.
Just a follow-up question, for example, I changed my warehouse to a 4X cluster of snowflake and at the same time, my other project is running with the same Snowflake connection. Will the other project use the 4X cluster or the default one?
Thanks in Advance
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Sorry missed your last question. Projects will use by default what is defined in the general connection.
Unless it is overwritten in the user profile or variables value defined as explained here:
https://doc.dataiku.com/dss/9.0/connecting/sql/snowflake.html#switching-role-and-warehouse
Perhaps you can share a screenshot of your Snowflake connection settings and the variables defined to get a better understanding.
Also you can always check the job log where the actual warehouse used will be printed.
-
sj0071992 Partner, Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Dataiku DSS Developer, Neuron 2022, Neuron 2023 Posts: 131 Neuron
Hi @AlexT
,I totally understand the logic of Dynamic Warehouse usage and ways to override it in recipes.
The only thing I want clarity is,
For example I am overriding my Warehouse variable in Python Recipe in one of my project to Large Warehouse but at the same time my other project is running using the same Snowflake Connection which doesn't require large Warehouse.
Will overriding the Variable in a recipe in one of the project impact the other project if running parallelly?
I hope you understand my concern
Thanks in Advance