Thread safe exclusive set variable in projects

Solved!
tomas
Hi,

I just discovered, when running multiple Python recipes, that the DSS Project API method `update_variables` is not thread safe: when two updates run at the same time, one process can set a variable and the other can unset it.

Example:

project.get_variables().get('local')
{}
---
# These two recipes run at the same time:
# RecipeA - updates only A
project.update_variables({'A': 1}, 'local')
# RecipeB - updates only B
project.update_variables({'B': 2}, 'local')
---
project.get_variables().get('local')
{'B': 2}
# The result varies: sometimes it is {'A': 1} or {'A': 1, 'B': 2}
# The correct result should always be {'A': 1, 'B': 2}

Before I ran into this issue, I was using get_variables to fetch the dict, modifying a key, and then calling set_variables.

So I assumed that update_variables (which takes only the keys to change) would be implemented atomically. But it looks like the implementation is "get all vars" followed by "set all vars", without any kind of locking.
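To make the suspected "get all, then set all" race concrete, here is a minimal sketch that replays the interleaving by hand. `FakeVariableStore`, `get_all`, `set_all` and `update_locked` are hypothetical names for an in-memory stand-in, not part of the DSS API, and the sketch only assumes that the real implementation behaves like an unlocked read-modify-write.

```python
import threading


class FakeVariableStore:
    """Hypothetical in-memory stand-in for a project's variables (not the DSS API)."""

    def __init__(self):
        self._vars = {}
        self._lock = threading.Lock()

    def get_all(self):
        # Return a copy, like fetching the full variables dict from a server
        return dict(self._vars)

    def set_all(self, variables):
        # Replace everything, like saving the full variables dict back
        self._vars = dict(variables)

    def update_locked(self, changes):
        # Read-modify-write under a lock, so no other writer can interleave
        with self._lock:
            merged = self.get_all()
            merged.update(changes)
            self.set_all(merged)


# Unlocked interleaving, replayed step by step to make the race deterministic
store = FakeVariableStore()
snapshot_a = store.get_all()      # recipe A reads {}
snapshot_b = store.get_all()      # recipe B reads {} before A has written
snapshot_a.update({'A': 1})
store.set_all(snapshot_a)         # A writes {'A': 1}
snapshot_b.update({'B': 2})
store.set_all(snapshot_b)         # B writes {'B': 2}, discarding A's update
lost_update_result = store.get_all()

# The same two updates from real threads, serialized by the lock
safe_store = FakeVariableStore()
threads = [
    threading.Thread(target=safe_store.update_locked, args=({'A': 1},)),
    threading.Thread(target=safe_store.update_locked, args=({'B': 2},)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
locked_result = safe_store.get_all()

print(lost_update_result)  # {'B': 2}
print(locked_result)       # contains both A and B
```

Note that a `threading.Lock` only protects threads inside one Python process. Recipes usually run as separate processes, so an in-process lock like this would not help there; the locking would have to happen on the server side or through some cross-process mechanism.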

1 Solution
Clément_Stenac

Hi,

We confirm the behavior you are observing.

Thanks for the feature request, we will be considering it for a future release.

tomas
Author

Thanks for confirming. I came up with a workaround that is not 100% reliable but still usable:

from random import random
from time import sleep

def set_project_var(project, key, value, where='local'):
    # Sets key to value in the project's variables
    # where - can be 'local' or 'standard'
    # A race condition can happen, so we set and check until the value is set
    project.update_variables({key: value}, type=where)
    while True:
        # wait randomly up to 500 ms
        sleep(0.5 * random())
        set_value = project.get_variables().get(where, {}).get(key)
        # Compare values directly: a truthiness check such as
        # "not set_value" would loop forever for values like 0 or ''
        if set_value != value:
            print('>> Race condition: the previous set was not successful')
            sleep(0.5 * random())
            project.update_variables({key: value}, type=where)
        else:
            return
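As a possible variation on the write, read back and retry idea, the sketch below bounds the number of attempts so a persistent conflict raises instead of looping forever. It compares with `==`, so falsy values such as 0 are handled. `DroppingProject`, `set_project_var_bounded`, `max_attempts` and `max_wait` are hypothetical names introduced for illustration; the test double only mimics the `get_variables` / `update_variables` calls used in this thread and is not the DSS API.

```python
from random import random
from time import sleep


class DroppingProject:
    """Hypothetical test double (not the DSS API): silently loses the first N writes."""

    def __init__(self, writes_to_drop):
        self._vars = {'local': {}, 'standard': {}}
        self._writes_to_drop = writes_to_drop

    def get_variables(self):
        return {scope: dict(values) for scope, values in self._vars.items()}

    def update_variables(self, variables, type='standard'):
        if self._writes_to_drop > 0:
            # Simulate a concurrent writer overwriting this update
            self._writes_to_drop -= 1
            return
        self._vars[type].update(variables)


def set_project_var_bounded(project, key, value, where='local',
                            max_attempts=5, max_wait=0.05):
    """Write, read back, and retry; return the attempt count or raise on give-up."""
    for attempt in range(1, max_attempts + 1):
        project.update_variables({key: value}, type=where)
        sleep(max_wait * random())  # random pause so competing writers drift apart
        if project.get_variables().get(where, {}).get(key) == value:
            return attempt
    raise RuntimeError('could not set %r after %d attempts' % (key, max_attempts))


attempts_needed = set_project_var_bounded(DroppingProject(writes_to_drop=2), 'A', 0)
print(attempts_needed)  # 3
```

Like the original workaround, this is still not fully reliable: another writer holding a stale snapshot can overwrite the value after the read-back check has passed.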


