A Distributed Computing Lecture by Steven Choy
Concurrency Control Methods: Locking, Optimistic concurrent control (OCC), Timestamp ordering
Deadlocks in Distributed Systems: Deadlock Handling, Deadlock Detection, Deadlock Prevention
Locking
Introduction
- The oldest but most widely used.
- An object is locked to prevent accessing by others
- Two state Lock / unlock.
- If an object is locked , other transactions must wait until it has been unlocked.
- If more than one transaction want to lock the object, they must be queued.
- There are two types of lock, read-lock and write-lock
- Read-lock- locks for reading it’s content . An object can be read locked by MORE THAN ONE transaction
- Write-lock - locks for modifying content. Write-lock must be dedicated for one transaction. ( mean if a transaction want to write-lock an object. The object must NOT be read-lock or write-lock by any other transaction
Two-Phase Locking
- The idea is “Do not release any lock, until all invoked objects are locked”
- Object cannot be released in the middle of transaction (even the task of this object is finished)
- The algorithm of two-phase locking is like:
- Find out which objects needed to be access and lock them one by one.
- If all objects are locked, transaction will start and finally commit, if some objects are being locked by others, it will keep on trying to lock it.
- If transaction is committed or aborted. All objects will be released.
- Two phases:
- expanding phase - new locks on items can be acquired but none can be released
- shrinking phase - existing locks can be released but no new ones can be acquired
- The problem of two-phase locking: Deadlock
- Example of deadlock of two phase locking
T1 want to lock A , B and T2 want to lock B, A
T1 has locked A and want to lock B ( B is locked by T2)
T2 has locked B and want to lock A ( A is locked by T1)
Optimistic Concurrent Control (OCC)
- The idea is simple “just go ahead without paying any attention. Before the transaction is committed, it perform a validation check. If something wrong, roll back.
- Three phase of OCC
- Working phase, process the transaction
- Validation phase, check if any data objects have been changed since the transaction started
- Update phase, make a decision to commit or abort
- The are two types of validation check
- Backward validation - A transaction will abort if the data item it has read is written by another committed transaction
- Forward validation - A transaction will abort if the data item it has written is read by another active transaction
Validation of Transactions
Source: Chapter 13: Transactions and Concurrency Control in 'Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4 (the website for the book is
http://www.cdk4.net/). Please refer to this chapter of the textbook.
- Backward validation of transaction Tv
boolean valid = true;
for (int Ti = startTn+1; Ti <= finishTn; Ti++){
if (read set of Tv intersects write set of Ti) valid = false;
}
- Forward validation of transaction Tv
boolean valid = true;
for (int Tid = active1; Tid <= activeN; Tid++){
if (write set of Tv intersects read set of Tid) valid = false;
}
| Tv
| Ti
| Description
|
| write
| read
| Ti must not read objects written by Tv
|
| read
| write
| Tv must not read objects written by Ti
|
| write
| write
| Ti must not write objects written by Tv and
Tv must not write objects written by Ti
|
Class Exercises
Two transactions T and U are defined as follows:
T: y = read(k); x = read(i); write(j, 22);
U: write(i,33); write(j,44);
Consider optimistic concurrency control as applied to the transactions T and U. Suppose the transactions T and U are active at the same time as one another. Describe the outcome in each of the following cases:
- T’s request to commit comes first and backward validation is used;
- U’s request to commit comes first and backward validation is used;
- T’s request to commit comes first and forward validation is used;
- U’s request to commit comes first and forward validation is used.
Notes: in each case your description must include whether transactions T and U commit or abort and why. Remember that writes are not carried out until after validation.
Timestamp Ordering
- A transaction under OCC is strictly aborted if a conflict occurs.
- Sometimes, however, a transaction does not need to be aborted.
- The idea of timestamp ordering is to try to commit transactions in a sequential order according to the order of their timestamps.
- First, a timestamp is assigned to a transaction the moment it does BEGIN_TRANSACTION (i.e. when a transaction starts its execution).
- The basic concept of timestamp ordering is as follows:
If a transaction wants to
write to an object that was last
accessed (read and written) by a transaction with an earlier timestamp, then the operation can go ahead; otherwise, it will be aborted.
If a transaction wants to read an object that was last written by a transaction with an earlier timestamp, then the operation can go ahead; otherwise it will be aborted.
- Try to commit transaction in a sequential order
- A timestamp is assigned to transaction when it start
- The idea is “if a data item is NOT modified by a Committed Younger ( newer ) transaction , the transaction can read or write it.
(c) is ok, as T4 is not committed
(d) is not ok as T4 is committed and T3 is older than T4
| Tc
| Ti
| Description
|
| write
| read
| Tc must not write an object that has been read by any Ti where Ti > Tc
this requires that Tc > = the maximum read timestamp of the object.
|
| write
| write
| Tc must not write an object that has been written by any Ti where Ti > Tc
this requires that Tc > write timestamp of the committed object.
|
| read
| write
| Tc must not read an object that has been written by any Ti where Ti > Tc
this requires that Tc > write timestamp of the committed object.
|
Comparison of Concurrency Control Methods
- Optimistic concurrency control (OCC) is an optimistic approach.
- It is suitable where there are few conflicts among transactions since almost all transactions can be concurrently committed using this method.
- However, much work may have to be repeated when a transaction is forced to abort because of conflicts.
- When the number of conflicts is large or the number of files required by most of the transactions is large, OCC is not worthwhile.
- Two-phase locking and timestamp ordering are pessimistic approaches, but they differ in the strategy used when conflicting access to a data item is detected.
- Timestamp ordering decides the serialization order statically when a transaction starts.
- That means when a transaction starts, a unique timestamp is assigned to it that represents its priority — it has higher priority than the ones that started later and it has a lower priority than the ones that started earlier.
- This status cannot be changed until the transaction commits or aborts.
- Thus we can say that the order is fixed (static) once the transaction starts.
- Two-phase locking decides the serialization order dynamically, since the order is determined according to the order in which data items are accessed.
- If the transaction is the first one to access the data, it locks and owns the data.
- If the transaction comes later, it has to queue up to wait for other transactions to release the data.
- Timestamp ordering is good for transactions with predominantly READ operations.
- Two-phase locking is good for transactions with more WRITE operations than READ operations.
- If a transaction is to be written, locking is more secure to allow the data to be modified.
Deadlocks in Distributed Systems
Deadlocks
- Locking is the most widely used, but locking may lead to dead lock
- Four strategies to handle deadlocks
- Ostrich algorithm – do not do anything, let the application handle deadlocks itself
- Deadlock detection and recovery – detect and try to recover deadlocks
Deadlock detection and recovery lets deadlocks occur, detects them and tries to recover from them.
- Deadlock prevention – statically try to make deadlock impossible by adding algorithm to transaction
- Deadlock avoidance – avoids deadlocks by allocating resources carefully
Deadlock Handling: The ostrich algorithm
- The system just ignores the possibility of deadlocks (or pretends there is no deadlock problem by ‘putting its head in the sand’), and lets individual applications deal with the problem if deadlocks really occur.
- This strategy is reasonable if deadlocks occur rarely and the cost of deadlock detection and prevention at a system level is relatively high.
- The ostrich algorithm supposes that it is the responsibility of individual applications to implement their own solution if they need one.
Deadlock Handling: Deadlock Prevention
- Deadlock prevention statically makes deadlocks structurally impossible.
- That means an algorithm is developed so that if all transactions follow the algorithm, deadlocks are impossible.
- Deadlock prevention is possible but very difficult to implement.
Deadlock Handling: Deadlock Avoidance
- Deadlock avoidance avoids deadlocks by allocating resources carefully.
- Note that this is different from deadlock prevention
- Deadlock prevention adds an algorithm for transactions to follow, and that might result in significant overheads and some transactions may abort to prevent a deadlock occurring.
- Deadlock avoidance does not touch transactions — it focuses on the distribution of resources to avoid deadlocks.
Deadlock Detection Methods
- Two types of deadlock detection
- Centralized deadlock detection
- Distributed deadlock detection
Centralized Deadlock Detection
- Each machine maintains the resource graph for its own processes and resources.
- The resource graph
- Circle is process, square is resource
- Arrow point from process to resource, mean process is waiting for this resource
- Arrow point from resource to process , mean process is holding this resource
- Illustration: an example of resource graph
- A reading: Kinesia Online Course - Advanced Operating Systems
- A centralized coordinator maintain the resource graph for the entire system.
- When the coordinator detect a cycle, it kills off one process to break the deadlock.
- In updating the coordinator’s graph, messages have to be passed.
- Method 1) Whenever an arc is added or deleted from the resource graph, a message have to be sent to the coordinator.
- Method 2) Periodically, every process can send a list of arcs added and deleted since previous update.
- Method 3) Coordinator ask for information when it needs it.
Deadlock Prevention Methods
- A method that might work is to order the resources and require processes to acquire them in strictly increasing order. This approach means that a process can never hold a high resource and ask for a low one, thus making cycles impossible.
- With global timing and transactions in distributed systems, two other methods are possible -- both based on the idea of assigning each transaction a global timestamp at the moment it starts.
- When one process is about to block waiting for a resource that another process is using, a check is made to see which has a larger timestamp.
- We can then allow the wait only if the waiting process has a lower timestamp.
- The timestamp is always increasing if we follow any chain of waiting processes, so cycles are impossible --- we can used decreasing order if we like.
- It is wiser to give priority to old processes because
- they have run longer so the system have larger investment on these processes.
- they are likely to hold more resources.
- A young process that is killed off will eventually age until it is the oldest one in the system, and that eliminates starvation.
Wait-die
- As we have pointed out before, killing a transaction is relatively harmless, since by definition it can be restarted safely later.
- Wait-die:
- If an old process wants a resource held by a young process, the old one will wait.
- If a young process wants a resource held by an old process, the young process will be killed.
- Observation: The young process, after being killed, will then start up again, and be killed again. This cycle may go on many times before the old one release the resource.
Wound-Wait
- Once we are assuming the existence of transactions, we can do something that had previously been forbidden: take resources away from running processes.
- When a conflict arises, instead of killing the process making the request, we can kill the resource owner. Without transactions, killing a process might have severe consequences. With transactions, these effects will vanish magically when the transaction dies.
- Wound-wait: (we allow preemption & ancestor worship)
- If an old process wants a resource held by a young process, the old one will preempt the young process.
- If a young process wants a resource held by an old process, the young process will wait.
Class Exercises
Wait-die and wound-wait are two practical algorithms for distributed deadlock prevention
- Describe the common idea shared by these two algorithms.
- State one advantage and one disadvantage of wound-wait over wait-die.
- A process with transaction timestamp 64 needs a resource held by a process with transaction timestamp 82. Explain what happens (1) when wait-die algorithm is used; and (2) when wound-wait algorithm is used.
- Why the methods wait-die and wound-wait cannot apply in the deadlock prevention if processes are allowed to hold more than one resource? Give one example to explain your reason.
Thanks for Reading
If you would rather like to have this lecture note in printed format, please click the print action link in the top right corner.
If you find any problem in this lecture note, please feel free to reach Steven by steven@findaway.hk