Deadlock

[Figure] Both processes need resources to continue execution. P1 requires additional resource R1 and holds resource R2; P2 requires additional resource R2 and holds R1. Neither process can continue.

[Figure] Four processes (blue lines) compete for one resource (grey circle), following a right-before-left policy. A deadlock occurs when all processes lock the resource simultaneously (black lines). The deadlock can be resolved by breaking the symmetry.

In concurrent computing, a deadlock is a state in which each member of a group is waiting for another member, including itself, to take action, such as sending a message or, more commonly, releasing a lock. [1] Deadlock is a common problem in multiprocessing systems, parallel computing, and distributed systems, where software and hardware locks are used to arbitrate shared resources and implement process synchronization. [2]

In an operating system, a deadlock occurs when a process or thread enters a waiting state because a requested system resource is held by another waiting process, which in turn is waiting for another resource held by yet another waiting process. If a process remains indefinitely unable to change its state because the resources it has requested are being used by another waiting process, then the system is said to be in a deadlock. [3]
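
For illustration, here is a minimal sketch in Python (not taken from the cited sources; the thread and lock names are arbitrary, and the sleep only makes the problematic interleaving likely): each thread holds one lock while requesting the other, so both block forever.

    import threading
    import time

    lock_a = threading.Lock()
    lock_b = threading.Lock()

    def worker_1():
        with lock_a:              # hold lock_a ...
            time.sleep(0.1)       # give the other thread time to grab lock_b
            with lock_b:          # ... and wait for lock_b, which worker_2 holds
                pass

    def worker_2():
        with lock_b:              # hold lock_b ...
            time.sleep(0.1)       # give the other thread time to grab lock_a
            with lock_a:          # ... and wait for lock_a, which worker_1 holds
                pass

    t1 = threading.Thread(target=worker_1)
    t2 = threading.Thread(target=worker_2)
    t1.start()
    t2.start()
    t1.join()                     # almost certainly never returns: each thread waits on the other's lock
    t2.join()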

In a communications system, deadlocks occur mainly due to lost or corrupt signals rather than resource contention. [4]

[Figure] Two processes competing for two resources in opposite order.
  1. A single process goes through.
  2. The later process has to wait.
  3. A deadlock occurs when the first process locks the first resource at the same time as the second process locks the second resource.
  4. The deadlock can be resolved by cancelling and restarting the first process.

Necessary conditions

A deadlock situation on a resource can arise if and only if all of the following conditions hold simultaneously in a system: [5]

  1. Mutual exclusion: At least one resource must be held in a non-shareable mode. Otherwise, the processes would not be prevented from using the resource when necessary. Only one process can use the resource at any given instant of time. [6]
  2. Hold and wait or resource holding: a process is currently holding at least one resource and requesting additional resources which are being held by other processes.
  3. No preemption: a resource can be released only voluntarily by the process holding it.
  4. Circular wait: each process must be waiting for a resource which is being held by another process, which in turn is waiting for the first process to release the resource. In general, there is a set of waiting processes, P = {P1, P2, …, PN}, such that P1 is waiting for a resource held by P2, P2 is waiting for a resource held by P3 and so on until PN is waiting for a resource held by P1. [3] [7]

These four conditions are known as the Coffman conditions from their first description in a 1971 article by Edward G. Coffman, Jr. [7]

While these conditions are sufficient to produce a deadlock on single-instance resource systems, they only indicate the possibility of deadlock on systems having multiple instances of resources. [8]

Deadlock handling

Most current operating systems cannot prevent deadlocks. [9] When a deadlock occurs, different operating systems respond to it in different, non-standard ways. Most approaches work by preventing one of the four Coffman conditions from occurring, especially the fourth one. [10] The major approaches are as follows.

Ignoring deadlock

In this approach, it is assumed that a deadlock will never occur. This is also an application of the ostrich algorithm. [10] [11] This approach was initially used by MINIX and UNIX. [7] It is used when the time intervals between occurrences of deadlocks are large and the data loss incurred each time is tolerable.

Detection

Under deadlock detection, deadlocks are allowed to occur. The state of the system is then examined to detect that a deadlock has occurred, and it is subsequently corrected. An algorithm is employed that tracks resource allocation and process states and rolls back and restarts one or more of the processes in order to remove the detected deadlock. Detecting a deadlock that has already occurred is easily possible, since the resources that each process has locked and/or currently requested are known to the resource scheduler of the operating system. [11]
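
In the simplest single-instance-resource setting, detection amounts to finding a cycle in the wait-for graph. A minimal sketch in Python follows (the graph contents and process names are hypothetical, for illustration only):

    # Hypothetical wait-for graph: each process maps to the set of processes
    # whose resources it is waiting for.
    wait_for = {
        "P1": {"P2"},
        "P2": {"P3"},
        "P3": {"P1"},      # P1 -> P2 -> P3 -> P1 is a cycle, so these three are deadlocked
        "P4": set(),       # P4 is not waiting on anyone
    }

    def find_deadlock(graph):
        """Return the set of processes that lie on some wait-for cycle."""
        def on_cycle(start):
            seen, stack = set(), list(graph.get(start, ()))
            while stack:
                p = stack.pop()
                if p == start:
                    return True          # reached the starting process again: a cycle exists
                if p not in seen:
                    seen.add(p)
                    stack.extend(graph.get(p, ()))
            return False
        return {p for p in graph if on_cycle(p)}

    print(find_deadlock(wait_for))       # {'P1', 'P2', 'P3'}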

After a deadlock is detected, it can be corrected by using one of the following methods: [12]

  1. Process termination: one or more processes involved in the deadlock may be aborted. One could choose to abort all competing processes involved in the deadlock. This ensures that the deadlock is resolved with certainty and speed,[citation needed] but the expense is high, as partial computations will be lost. Alternatively, one could abort one process at a time until the deadlock is resolved (a minimal sketch of this one-at-a-time strategy follows this list). This approach has high overhead, because after each abort an algorithm must determine whether the system is still in deadlock.[citation needed] Several factors must be considered when choosing a candidate for termination, such as the priority and age of the process.[citation needed]
  2. Resource preemption: resources allocated to various processes may be successively preempted and allocated to other processes until the deadlock is broken. [13]
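
Continuing the hypothetical wait-for graph and the find_deadlock helper from the detection sketch above, an abort-one-at-a-time recovery pass might look as follows (the victim-selection rule here is only a stand-in; real systems weigh factors such as priority, age, and work already done):

    def recover_by_termination(graph):
        """Abort one deadlocked process at a time until no wait-for cycle remains."""
        aborted = []
        while True:
            deadlocked = find_deadlock(graph)   # re-check after every abort (the costly part)
            if not deadlocked:
                return aborted
            victim = min(deadlocked)            # stand-in policy: pick the lexicographically first name
            aborted.append(victim)
            graph.pop(victim, None)             # the aborted victim releases its resources, so ...
            for waits in graph.values():
                waits.discard(victim)           # ... no remaining process waits on it

    print(recover_by_termination({p: set(w) for p, w in wait_for.items()}))   # ['P1']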

Prevention

[Figure] (A) Two processes competing for one resource, following a first-come, first-served policy. (B) Deadlock occurs when both processes lock the resource simultaneously. (C) The deadlock can be resolved by breaking the symmetry of the locks. (D) The deadlock can be prevented by breaking the symmetry of the locking mechanism.

Deadlock prevention works by preventing one of the four Coffman conditions from occurring.
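
For example, the circular-wait condition can be denied by fixing one global order in which locks are acquired. A Python sketch, reworking the two-lock example from the introduction (the ordering chosen is arbitrary, but every thread must follow the same one):

    import threading
    import time

    lock_a = threading.Lock()
    lock_b = threading.Lock()

    def worker(name):
        # Every thread follows the same global order (lock_a, then lock_b),
        # so a circular wait can never form.
        with lock_a:
            time.sleep(0.1)
            with lock_b:
                print(name, "finished")

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("w1", "w2")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()              # always terminates: whichever thread holds lock_a can also get lock_b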

Livelock

A livelock is similar to a deadlock, except that the states of the processes involved in the livelock constantly change with regard to one another, while none of them progresses.

The term was coined by Edward A. Ashcroft in a 1975 paper [15] in connection with an examination of airline booking systems. [16] Livelock is a special case of resource starvation; the general definition only states that a specific process is not progressing. [17]

Livelock is a risk with some algorithms that detect and recover from deadlock. If more than one process takes action, the deadlock detection algorithm can be repeatedly triggered. This can be avoided by ensuring that only one process (chosen arbitrarily or by priority) takes action. [18]
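
A deterministic sketch of the idea in Python (the process and resource names are hypothetical): two processes both need R1 and R2, act in lock step, and politely release everything whenever the next resource they need is already taken. Their states change on every step, yet neither ever obtains both resources.

    need = {"P1": ["R1", "R2"], "P2": ["R2", "R1"]}
    held = {"P1": [], "P2": []}

    def owner(resource, snapshot):
        return next((p for p, rs in snapshot.items() if resource in rs), None)

    for step in range(6):                                    # bounded only so the demo terminates
        snapshot = {p: list(rs) for p, rs in held.items()}   # both processes decide on the same view
        for p in need:
            wanted = next(r for r in need[p] if r not in snapshot[p])
            if owner(wanted, snapshot) is None:
                held[p] = snapshot[p] + [wanted]             # grab the next free resource
            else:
                held[p] = []                                 # back off: release everything and retry later
        print(step, held)
    # The output alternates between {'P1': ['R1'], 'P2': ['R2']} and {'P1': [], 'P2': []} forever.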

Distributed deadlock

Distributed deadlocks can occur in distributed systems when distributed transactions or concurrency control is being used.

Distributed deadlocks can be detected either by constructing a global wait-for graph from local wait-for graphs at a deadlock detector or by a distributed algorithm like edge chasing.
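
A minimal sketch of the centralized (global wait-for graph) approach in Python (the site and process names are hypothetical): each site reports its local wait-for edges, and a coordinator merges them and checks the combined graph for a cycle. Note that no single local graph contains a cycle, but the global one does.

    local_wait_for = {
        "siteA": {"P1": {"P2"}},          # on site A, P1 waits for P2
        "siteB": {"P2": {"P3"}},
        "siteC": {"P3": {"P1"}},          # globally: P1 -> P2 -> P3 -> P1
    }

    def merge(local_graphs):
        merged = {}
        for graph in local_graphs.values():
            for proc, waits in graph.items():
                merged.setdefault(proc, set()).update(waits)
        return merged

    def has_cycle(graph):
        """True if some process can reach itself along wait-for edges."""
        def reaches_itself(start):
            seen, stack = set(), list(graph.get(start, ()))
            while stack:
                p = stack.pop()
                if p == start:
                    return True
                if p not in seen:
                    seen.add(p)
                    stack.extend(graph.get(p, ()))
            return False
        return any(reaches_itself(p) for p in graph)

    print(has_cycle(merge(local_wait_for)))   # True: the deadlock is only visible globally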

Phantom deadlocks are deadlocks that are falsely detected in a distributed system due to system internal delays but do not actually exist.

For example, if a process releases a resource R1 and issues a request for R2, and the first message is lost or delayed, a coordinator (deadlock detector) could falsely conclude that a deadlock exists, since requesting R2 while still holding R1 would constitute one.

See also

  * Banker's algorithm
  * Chandy–Misra–Haas algorithm
  * Concurrent computing
  * Deadlock prevention algorithms
  * Dining philosophers problem
  * Distributed operating system
  * Eventual consistency
  * Hang (computing)
  * Lease (computer science)
  * Lock (computer science)
  * Lock convoy
  * Mutual exclusion
  * Non-blocking algorithm
  * Ostrich algorithm
  * Rate-monotonic scheduling
  * Resource contention
  * Self-stabilization
  * Semaphore (programming)
  * Synchronization (computer science)
  * Wait-for graph

References

  1. Coulouris, George (2012). Distributed Systems: Concepts and Design. Pearson. p. 716. ISBN 978-0-273-76059-7.
  2. Padua, David (2011). Encyclopedia of Parallel Computing. Springer. p. 524. ISBN 9780387097657. Retrieved 28 January 2012.
  3. Silberschatz, Abraham (2006). Operating System Principles (7th ed.). Wiley-India. p. 237. ISBN 9788126509621. Retrieved 29 January 2012.
  4. Schneider, G. Michael (2009). Invitation to Computer Science. Cengage Learning. p. 271. ISBN 978-0324788594. Retrieved 28 January 2012.
  5. Silberschatz, Abraham (2006). Operating System Principles (7th ed.). Wiley-India. p. 239. ISBN 9788126509621. Retrieved 29 January 2012.
  6. "ECS 150 Spring 1999: Four Necessary and Sufficient Conditions for Deadlock". nob.cs.ucdavis.edu. Archived from the original on 29 April 2018. Retrieved 29 April 2018.
  7. Shibu, K. (2009). Intro to Embedded Systems (1st ed.). Tata McGraw-Hill Education. p. 446. ISBN 9780070145894. Retrieved 28 January 2012.
  8. "Operating Systems: Deadlocks". www.cs.uic.edu. Retrieved 25 April 2020. "If a resource category contains more than one instance, then the presence of a cycle in the resource-allocation graph indicates the possibility of a deadlock, but does not guarantee one."
  9. Silberschatz, Abraham (2006). Operating System Principles (7th ed.). Wiley-India. p. 237. ISBN 9788126509621. Retrieved 29 January 2012.
  10. Stuart, Brian L. (2008). Principles of Operating Systems (1st ed.). Cengage Learning. p. 446. ISBN 9781418837693. Retrieved 28 January 2012.
  11. Tanenbaum, Andrew S. (1995). Distributed Operating Systems (1st ed.). Pearson Education. p. 117. ISBN 9788177581799. Retrieved 28 January 2012.
  12. "Deadlock Detection And Recovery in OS". PREP INSTA. Retrieved 23 May 2020.
  13. "IBM Knowledge Center". www.ibm.com. Archived from the original on 19 March 2017. Retrieved 29 April 2018.
  14. Silberschatz, Abraham (2006). Operating System Principles (7th ed.). Wiley-India. p. 244. ISBN 9788126509621. Retrieved 29 January 2012.
  15. Ashcroft, E. A. (1975). "Proving assertions about parallel programs". Journal of Computer and System Sciences. 10: 110–135. doi:10.1016/S0022-0000(75)80018-3.
  16. Kwong, Y. S. (1979). "On the absence of livelocks in parallel programs". Semantics of Concurrent Computation. Lecture Notes in Computer Science. 70. pp. 172–190. doi:10.1007/BFb0022469. ISBN 3-540-09511-X.
  17. Anderson, James H.; Kim, Yong-Jik (2001). "Shared-memory mutual exclusion: Major research trends since 1986". Archived from the original on 25 May 2006.
  18. Zöbel, Dieter (October 1983). "The Deadlock problem: a classifying bibliography". ACM SIGOPS Operating Systems Review. 17 (4): 6–15. doi:10.1145/850752.850753. ISSN 0163-5980.

Further reading