
Thread safe exclusive set variable in projects

Solved!
tomas
Neuron

Hi,

I just discovered, when running multiple Python recipes, that the DSS Project API method `update_variables` is not thread safe: when two updates run at the same time, one process can set a variable and the other can unset it.

Example:

project.get_variables().get('local')
{}
---
# These two recipes run at the same time:
# RecipeA - updates only A
project.update_variables({'A': 1}, 'local')
# RecipeB - updates only B
project.update_variables({'B': 2}, 'local')
---
project.get_variables().get('local')
{'B': 2}
# The result varies: sometimes it is {'A': 1} or {'A': 1, 'B': 2} instead
# The correct result should always be {'A': 1, 'B': 2}

Before I ran into this issue, I was using get_variables to fetch the dict, modifying a key, and calling set_variables.

So I assumed that update_variables (which takes only a set of keys) would be implemented atomically. But it looks like the implementation is "get all vars" followed by "set all vars", without any kind of locking.
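
To make the suspected interleaving concrete, here is a minimal sketch that replays it by hand. `FakeProject` is a hypothetical in-memory stand-in, not the real DSS client; it only assumes that an update amounts to a `get_variables` followed by a `set_variables`. The `threading.Lock` variant shows why serializing the read-modify-write removes the lost update, although a lock like this only protects threads inside a single process, not separate recipe processes.

```python
import threading


class FakeProject:
    """Hypothetical in-memory stand-in for a DSS project (illustration only)."""

    def __init__(self):
        self._vars = {"standard": {}, "local": {}}

    def get_variables(self):
        # Return a copy, like a fresh response from an API call
        return {scope: dict(values) for scope, values in self._vars.items()}

    def set_variables(self, variables):
        # Replace ALL variables with whatever the caller sends
        self._vars = {scope: dict(values) for scope, values in variables.items()}


def lost_update_demo():
    """Replay the unlucky ordering: both recipes read, then both write."""
    project = FakeProject()
    snapshot_a = project.get_variables()  # RecipeA reads {}
    snapshot_b = project.get_variables()  # RecipeB also reads {}
    snapshot_a["local"]["A"] = 1
    snapshot_b["local"]["B"] = 2
    project.set_variables(snapshot_a)     # RecipeA writes {'A': 1}
    project.set_variables(snapshot_b)     # RecipeB overwrites it with {'B': 2}
    return project.get_variables()["local"]


_update_lock = threading.Lock()


def locked_update(project, updates, where="local"):
    """Do the whole read-modify-write while holding a lock."""
    with _update_lock:
        variables = project.get_variables()
        variables.setdefault(where, {}).update(updates)
        project.set_variables(variables)


def locked_demo():
    """Run both updates from concurrent threads, serialized by the lock."""
    project = FakeProject()
    threads = [
        threading.Thread(target=locked_update, args=(project, {"A": 1})),
        threading.Thread(target=locked_update, args=(project, {"B": 2})),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return project.get_variables()["local"]
```

With the manual interleaving, A is lost; with the lock, both keys survive regardless of which thread runs first.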

Clément_Stenac
Dataiker

Hi,

We confirm the behavior you are observing.

Thanks for the feature request, we will be considering it for a future release.

tomas
Neuron
Author

Thanks for confirming. I came up with a workaround that is not 100% reliable, but still usable:

from random import random
from time import sleep

def set_project_var(project, key, value, where='local'):
    # Set key to value in the project's variables.
    # where: 'local' or 'standard'
    # A concurrent update can overwrite ours, so set, re-read, and retry until it sticks.
    project.update_variables({key: value}, type=where)
    while True:
        # Wait a random time of up to 500 ms so concurrent writers can finish
        sleep(0.5 * random())
        set_value = project.get_variables().get(where, {}).get(key)
        # Compare by equality only, so falsy values such as 0 or '' do not retry forever
        if set_value == value:
            return
        print('>> Race condition: the previous set was not successful, retrying')
        sleep(0.5 * random())
        project.update_variables({key: value}, type=where)


