How to lock a given dataset?

rmnvncnt
Level 3
We can access a given dataset using the DSS REST API, but we would like this dataset not to be written to while it is being accessed.

1) Is it possible to check whether a dataset is currently being modified using the API?

2) Is it possible to block writes to a dataset using the API?

If this is not possible using the API, what would be the most suitable alternative (with a scenario)?
1 Solution
Alex_Combessie
Dataiker Alumni
Hello,

Locking a dataset is not currently possible through the API. Could you please tell us more about the scenario you would like to build?

If all actions can be defined as scenario steps (no API calls external to the scenario), there should not be lock issues since steps are synchronous by default.

Cheers,

Alex

4 Replies
rmnvncnt
Level 3
Author
Thanks Alexandre! The idea here is to prevent another application (external to DSS) from reading a data file on the DSS server while the dataset corresponding to this file is being built by DSS.

Our initial plan was to use the API either to tell DSS not to build the dataset until the application is done reading it, or to tell the application not to read the file while the dataset is being built. I had two ideas to work around this concurrency issue without a direct API call:

1) The dumb solution: have the scenario that builds the dataset write a row to a separate "lock" dataset at the beginning of the build and delete it at the end of the build. The data file is accessible to the application only when the "lock" dataset is empty.

2) The neckbeard solution: create system-wide lock files (https://stackoverflow.com/questions/6931342/system-wide-mutex-in-python-on-linux).

I was going for the second one, but other options are welcome!
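
For reference, option 2 could look roughly like the sketch below. It is only a minimal illustration, assuming a Linux host where both the DSS build step and the external application can open the same lock file; the helper name `dataset_lock` and the lock file name are hypothetical, and no DSS API is involved. `fcntl.flock` locks are advisory, so this only works if the reader takes the lock too.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def dataset_lock(lock_path, exclusive=True):
    """Hold an advisory flock on lock_path for the duration of the with-block.

    exclusive=True  -> writer lock (e.g. the step that builds the dataset)
    exclusive=False -> shared lock (e.g. the application reading the file)
    """
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    # Create the lock file if it does not exist yet.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, mode)  # blocks until the lock can be acquired
        yield fd
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical lock file location shared by both sides.
    lock_path = os.path.join(tempfile.gettempdir(), "my_dataset.lock")

    # Writer side: wrap whatever triggers the build.
    with dataset_lock(lock_path, exclusive=True):
        print("building dataset...")

    # Reader side: several readers can hold the shared lock at once.
    with dataset_lock(lock_path, exclusive=False):
        print("reading data file...")
```

The writer waits for all readers to release their shared lock before building, and readers wait while a build holds the exclusive lock.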
Alex_Combessie
Dataiker Alumni
In my opinion, (1) should be simple and straightforward.

Slight modification of (1): if you have access to advanced automation features, you could implement it as a Python scenario step that executes your own code to tell the external application that the dataset is ready. This way, no dummy dataset is needed 🙂
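
As a rough idea of what such custom code might do, here is a minimal sketch that signals readiness with a marker file, assuming the external application can see a directory on the DSS server. The function names and the marker path are hypothetical, only the Python standard library is used, and the build itself is not shown.

```python
import json
import os
import tempfile
import time


def mark_building(marker_path):
    """Remove the 'ready' marker before the build starts (no error if absent)."""
    try:
        os.remove(marker_path)
    except FileNotFoundError:
        pass


def mark_ready(marker_path, dataset_name):
    """Atomically publish a marker saying the dataset is safe to read."""
    payload = {"dataset": dataset_name, "ready_at": time.time()}
    directory = os.path.dirname(marker_path) or "."
    # Write to a temp file in the same directory, then rename it into place,
    # so the reader never sees a half-written marker.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)
    os.replace(tmp_path, marker_path)


def is_ready(marker_path):
    """What the external application would check before reading the file."""
    return os.path.exists(marker_path)
```

A scenario could call `mark_building(...)` in a first Python step, build the dataset, then call `mark_ready(...)` in a last step. Note that this only signals readiness: it does not stop a build from starting while the application is in the middle of a read.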
rmnvncnt
Level 3
Author
Thanks a lot! I'm going to give it a try!