Shared data with Async Await - Part I
The shared data problem
One of the most important problems in asynchronous programming is the shared data problem, also called the critical region problem. To understand this problem, have a look at the following figure:
In this figure we have stateful APIs from the data store that offer basic read and write access but no built-in transaction support. Modern databases avoid this situation because they provide transactions, but it can arise when we use, for example, the raw filesystem for storage. In this example the data store holds $100 in the account, and there are two functions: one that pays $20 into the account and another that takes $20 from it. Both functions share the same basic structure: read the current balance, compute the new one, and write it back.
If both functions are called in parallel, there could be a problem. Both can read and write at the same time, with one overwriting the value written by the other. In this example readData returned $100 to both functions, so the write of $80 was overwritten and the actions of the take method were lost.
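The lost update can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: the in-memory `makeAccount` store, the `delay` helper that simulates I/O latency, and the `writeData` counterpart to `readData` are hypothetical and not the original post's code.

```javascript
// Hypothetical helper that simulates the latency of a real data store.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical in-memory stand-in for a store with no transaction support.
function makeAccount(initialBalance) {
  let balance = initialBalance;

  async function readData() {
    await delay(1);
    return balance;
  }

  async function writeData(value) {
    await delay(1);
    balance = value;
  }

  // Both functions follow the same structure: read, compute, write.
  async function pay(amount) {
    const current = await readData();
    await writeData(current + amount);
  }

  async function take(amount) {
    const current = await readData();
    await writeData(current - amount);
  }

  return { readData, pay, take };
}

async function raceDemo() {
  const account = makeAccount(100);
  // Both calls read $100 before either of them has written anything.
  await Promise.all([account.take(20), account.pay(20)]);
  return account.readData();
}

// With this interleaving the $80 written by take is overwritten by pay's $120.
raceDemo().then((balance) => console.log("final balance:", balance));
```

The correct result would be $100, since the two operations cancel out. Which update gets lost depends on the order in which the writes complete.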
Using locks as a solution
The shared data problem is very similar to two people trying to drink coffee from the same tumbler with one straw. If there is no understanding between them, there will be chaos. A solution is to protect the tumbler, or the data store in the banking example, with a lock. Only one method is allowed to use the data store at a time. This way the functions take turns instead of overwriting each other's work.
Using such a lock is very simple as well: call lock before touching the data store and unlock once the write is done.
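A lock of this kind and its use could be sketched as follows. The queue-based `Lock` class, the in-memory `makeAccount` store, and the `delay` helper are assumptions made for illustration, not the original implementation.

```javascript
// Hypothetical helper that simulates the latency of a real data store.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// A minimal promise-based lock: callers wait in a first-in, first-out queue.
class Lock {
  constructor() {
    this.locked = false;
    this.waiters = []; // resolve callbacks of callers waiting for the lock
  }

  lock() {
    if (!this.locked) {
      this.locked = true;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  unlock() {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the lock directly to the next waiter
    } else {
      this.locked = false;
    }
  }
}

function makeAccount(initialBalance) {
  let balance = initialBalance;
  const lock = new Lock();

  async function readData() {
    await delay(1);
    return balance;
  }

  async function writeData(value) {
    await delay(1);
    balance = value;
  }

  async function pay(amount) {
    await lock.lock();
    const current = await readData();
    await writeData(current + amount);
    lock.unlock();
  }

  async function take(amount) {
    await lock.lock();
    const current = await readData();
    await writeData(current - amount);
    lock.unlock();
  }

  return { readData, pay, take };
}

async function lockedDemo() {
  const account = makeAccount(100);
  await Promise.all([account.take(20), account.pay(20)]);
  return account.readData();
}

// take finishes its read and write before pay starts, so no update is lost.
lockedDemo().then((balance) => console.log("final balance:", balance)); // 100
```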
This solves the shared data problem by ensuring that only one caller, even another instance of pay, can access the data at a time. While this solution has no logical issues, it introduces a subtle problem that lies not in the code but in its structure. In case of an exception or rejection, the unlock would never happen. And while we are writing this code as a single function, nothing prevents a programmer from refactoring it and making mistakes. The big rule of programming is to be safe by default and to allow the unsafe version only when it is essential for a certain use case. A refactoring that separates the lock and unlock calls is one mistake that could happen.
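The rejection case can be demonstrated with a short sketch. The `Lock` class, the always-failing `failingWrite`, and the `canAcquireWithin` helper are hypothetical names introduced for this illustration only.

```javascript
// A minimal promise-based lock (assumed implementation, for illustration).
class Lock {
  constructor() {
    this.locked = false;
    this.waiters = [];
  }

  lock() {
    if (!this.locked) {
      this.locked = true;
      return Promise.resolve();
    }
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  unlock() {
    const next = this.waiters.shift();
    if (next) next();
    else this.locked = false;
  }
}

// Hypothetical write that always rejects, standing in for a storage error.
async function failingWrite() {
  throw new Error("write failed");
}

async function pay(lock) {
  await lock.lock();
  await failingWrite(); // rejects, so the unlock below never runs
  lock.unlock();
}

// Resolves to true if the lock can be acquired within `ms` milliseconds.
function canAcquireWithin(lock, ms) {
  const timeout = new Promise((resolve) => setTimeout(() => resolve(false), ms));
  return Promise.race([lock.lock().then(() => true), timeout]);
}

async function leakDemo() {
  const lock = new Lock();
  await pay(lock).catch(() => {}); // the caller handles the rejection...
  return canAcquireWithin(lock, 50); // ...but the lock is still held
}

leakDemo().then((acquired) => console.log("lock acquired again:", acquired)); // false
```

Every later caller of `lock` would wait forever, even though the error itself was handled.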
Once the two calls are separated, the lock call could be anywhere. The problem with allowing a separate unlock is that it can be called on its own. A forgotten unlock would cause a leak: the lock would never become available again and the program would come to a halt, waiting. A forgotten lock is more dangerous, as something would happen without a lock, and some method could unlock a lock acquired by someone else. These issues are extremely difficult to debug. A better solution is to disallow this two-step process and provide a one-step method, a transaction.
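A one-step method along these lines might look like the sketch below. The `createLock` factory and the shape of `transaction` are assumptions; the point is only that lock and unlock stay private and the unlock sits in a `finally` block.

```javascript
// Hypothetical helper that simulates the latency of a real data store.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

function createLock() {
  let locked = false;
  const waiters = [];

  // lock and unlock live in the closure and cannot be called from outside.
  function lock() {
    if (!locked) {
      locked = true;
      return Promise.resolve();
    }
    return new Promise((resolve) => waiters.push(resolve));
  }

  function unlock() {
    const next = waiters.shift();
    if (next) next();
    else locked = false;
  }

  // Runs `work` while holding the lock and releases it on resolve and on reject.
  async function transaction(work) {
    await lock();
    try {
      return await work();
    } finally {
      unlock();
    }
  }

  return { transaction };
}

async function transactionDemo() {
  const accountLock = createLock();
  let balance = 100;

  // A failing transaction rejects but still releases the lock.
  const failure = await accountLock
    .transaction(async () => {
      throw new Error("write failed");
    })
    .catch((error) => error.message);

  // Later transactions are not blocked and do not interleave.
  await Promise.all([
    accountLock.transaction(async () => {
      const current = balance;
      await delay(1);
      balance = current - 20;
    }),
    accountLock.transaction(async () => {
      const current = balance;
      await delay(1);
      balance = current + 20;
    }),
  ]);

  return { failure, balance };
}

transactionDemo().then((result) => console.log(result));
```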
The transaction manages both the lock and the unlock, for the resolve and the reject case alike. This ensures there are no leaks and no accidental unlocks scattered around the code. This solution is saner and less error-prone than individual lock and unlock calls.
A deadlock occurs when there is a cyclic wait. Method A acquires a lock and then waits for method B, while method B needs the same lock to continue execution and is waiting on A to release it. There is no standard way in code to prevent a deadlock. While writing code, we need to be careful to avoid such a situation by putting the minimal amount of code in a transaction and by making sure we acquire all the locks we need in the same order everywhere.
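The lock ordering advice can be illustrated with a sketch that uses two accounts, each with its own lock. The account objects, their numeric `id` fields, the `transfer` function, and the promise-chain `createLock` are all hypothetical; ordering by id is just one possible convention.

```javascript
// Hypothetical helper that simulates the latency of a real data store.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Compact lock (assumed implementation): each transaction runs after the
// previous one has settled.
function createLock() {
  let tail = Promise.resolve();
  return {
    transaction(work) {
      const result = tail.then(() => work());
      tail = result.catch(() => {}); // keep the queue going after a rejection
      return result;
    },
  };
}

function makeAccount(id, balance) {
  return { id, balance, lock: createLock() };
}

async function transfer(from, to, amount) {
  // Always lock the account with the smaller id first, whatever the direction.
  // Locking `from` first instead would let two opposite transfers each hold
  // one lock while waiting for the other.
  const [first, second] = from.id < to.id ? [from, to] : [to, from];
  await first.lock.transaction(() =>
    second.lock.transaction(async () => {
      await delay(1); // keep the critical section as small as possible
      from.balance -= amount;
      to.balance += amount;
    })
  );
}

async function orderingDemo() {
  const a = makeAccount(1, 100);
  const b = makeAccount(2, 100);
  // Two transfers in opposite directions, started at the same time.
  await Promise.all([transfer(a, b, 20), transfer(b, a, 50)]);
  return [a.balance, b.balance];
}

orderingDemo().then((balances) => console.log(balances)); // [ 130, 70 ]
```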
In the next post, we will go over an advanced version of the shared data problem: the reader-writer problem.