Correct use of scenario variables
Hi
I'm looking for documentation on how to use scenario variables in Dataiku. As far as I can see, we access them in Python code using dataiku.get_custom_variables()["myvar"]. Project variables can also be updated in a scenario, and that value is persisted after the scenario has run. I'm trying to figure out how to make a scenario variable available to the code without the code needing to know that the value comes from a scenario. I.e. I want the scenario variable to temporarily override a project variable.
My goal is that the flow should not need to know the variable is coming from a scenario, so that I can avoid conditional branches that say
if scenario_var exists: use it, else: use project_var
Answers
-
Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 320 Neuron
Hi @allan,
From Python, you can use an Execute Python step like the following to temporarily override a project variable. In other words, the variable will be set to the new value for the remaining steps in the scenario.
from dataiku.scenario import Scenario
Scenario().set_scenario_variables(projectVariableName=12345)
You can also use a Define Variable step to override a project variable for the subsequent steps in the scenario. Here you can set a fixed value or use DSS formulas to compute the value.
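As a rough sketch of what such an Execute Python step could look like when the value is computed rather than fixed: build_overrides, apply_overrides and the variable name myvar are made up for illustration, and the only Dataiku call used is Scenario().set_scenario_variables() from the snippet above. The import is guarded so the helper functions can also be tried outside DSS.

```python
import datetime


def build_overrides(run_date):
    """Return the variables to override for the rest of the scenario run.

    Hypothetical helper: "myvar" is just an example variable name.
    """
    return {"myvar": run_date.strftime("%Y-%m-%d")}


def apply_overrides(scenario, overrides):
    """Pass each override to the scenario as a keyword argument."""
    scenario.set_scenario_variables(**overrides)


try:
    # Only importable inside DSS, where this would run as a scenario step.
    from dataiku.scenario import Scenario

    apply_overrides(Scenario(), build_overrides(datetime.date.today()))
except ImportError:
    # Outside DSS there is no scenario to update.
    pass
```

Keeping the value computation in its own function means it can be checked without running a scenario.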
Hope this is helpful.
Marlan
-
Hi @Marlan
You've hit on what I'm trying to do - I was trying to use the Define Variable step to temporarily override the value of a project variable in the flow.
From what I've seen on version 8, your suggestions don't actually do that - it would be perfect if they did. The step seems to assign the variable as a custom variable, so any recipe would need to look it up in the custom variables dictionary. I was trying to avoid this in our projects - the recipes should only look in one place for the variable values, and the scenario variable should temporarily override the project variable.
-
Hi @allan,
I override project variables all of the time using scenario steps in version 8. It does work. By project variables though, I mean variables that are defined from the Variables choice under the three dot menu.
There are also "Flow variables" which I understand are primarily used for partitions; I haven't used them myself but maybe these are what you are referring to?
The name of the get_custom_variables() function doesn't help. It would be clearer if it were named get_project_variables because that's what it does. Regardless, though, this function returns the current values of project variables whether those values were set in the UI (three dots menu, Variables) or overridden to new values by a step in a scenario.
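To picture the behaviour described here, a toy model in plain Python may help. This is not how DSS implements it, just two dictionaries standing in for the project variables and the scenario overrides; effective_variables and the sample values are made up for illustration.

```python
def effective_variables(project_vars, scenario_overrides=None):
    """Toy model of what a recipe sees: overrides win, the rest falls through.

    A new dictionary is returned, so the project variables themselves
    are left untouched (the override is temporary).
    """
    merged = dict(project_vars)
    merged.update(scenario_overrides or {})
    return merged


project_vars = {"myvar": "from project", "myvartoo": "also from project"}

# Recipe launched by hand from the flow: no overrides in play.
manual_run = effective_variables(project_vars)

# Same recipe run by a scenario that overrides only "myvar".
scenario_run = effective_variables(project_vars, {"myvar": "from scenario"})

print(manual_run["myvar"])
print(scenario_run["myvar"])
print(scenario_run["myvartoo"])
```

The recipe code does the same lookup in both cases; only the contents of the dictionary differ.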
Marlan
-
So how do you access them in your flow? If in Python, are you calling get_custom_variables() to access the scenario variables? If so, you must have separate conditional logic that handles them not existing, no? That's what I'm trying to avoid, but from my testing, I must call get_custom_variables() to get the scenario variables, and if they don't exist I have to get the project variables as normal:
import dataiku
project_handle = dataiku.api_client().get_project(dataiku.default_project_key())
project_vars = project_handle.get_variables()
There is no way to get the temporary scenario variables via get_variables() - you can only update those in a scenario with a "Set project variable" step, which permanently changes the value of the variable.
-
Hi @allan,
Yes, in Python recipes I would call get_custom_variables, or in a SQL script recipe I would use the ${varname} syntax. Inside a recipe in the flow I wouldn't use the external API call that you noted (project_handle.get_variables()) but rather get_custom_variables.
My typical work process is to develop recipes in the flow using project variables that I have defined in the UI. I do all my development and testing using those project variables. Then at a later step in the work process I define scenarios where I often override the project variables.
So recipes in the flow in my case always use previously defined project variables. Thus I've never needed to test for variable existence or not.
Let me know if I'm misunderstanding what you are trying to do...
Marlan
-
Thank you - it looks like I had things back to front - as you said, the get_custom_variables method name is misleading.
If I use get_custom_variables(), I get the project variables and any overrides in that dictionary; using project_handle.get_variables() retrieves the project variables without any overrides.
Thank you so much for taking the time to get me to this point!
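For anyone landing here later, a minimal sketch of the "look in one place" idea, assuming recipes only ever read through dataiku.get_custom_variables(). The helper name get_variable and the sample dictionary are hypothetical; the optional variables argument exists only so the function can be tried outside DSS.

```python
def get_variable(name, variables=None):
    """Look a variable up in one place, whoever set its current value.

    Inside DSS the dictionary comes from dataiku.get_custom_variables(),
    which holds project variables plus any scenario overrides.
    """
    if variables is None:
        import dataiku  # only importable inside DSS

        variables = dataiku.get_custom_variables()
    if name not in variables:
        raise KeyError("Variable %r is not defined as a project variable" % name)
    return variables[name]


# Stand-in dictionary so the sketch runs outside DSS.
sample_vars = {"myvar": "some value"}
print(get_variable("myvar", sample_vars))
```

There is no branch for "scenario or project": a missing name is treated as a configuration error rather than a case to handle.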
-
Hi @allan,
I just bumped into your question while trying to find documentation on the usage of scenario variables. Were you able to find a way to read scenario variables (set by the Define Scenario Variables step in a scenario) from a Python recipe? As far as I can tell, dataiku.get_custom_variables() doesn't return the value of a scenario variable.
-
import dataiku
custom_vars = dataiku.get_custom_variables()
print("CUSTOM VARS: {0}".format(custom_vars["myvar"]))
print("CUSTOM VARS: {0}".format(custom_vars["myvartoo"]))
When this is run in the flow by a user (not in a scenario), the code above prints the variables "myvar" and "myvartoo" with the values defined in the project global variables.
If you create scenario variables with the same names and run this code as part of a recipe in a scenario, you'll see the values come through as defined in the scenario variables (see attached screenshot).