Concurrency (computer science)

Last updated

Concurrency refers to the ability of a system to execute multiple tasks through simultaneous execution or time-sharing (context switching), sharing resources and managing interactions. Concurrency improves responsiveness, throughput, and scalability in modern computing, including: [1] [2] [3] [4] [5]

Contents

Concurrency is a broader concept that encompasses several related ideas, including: [1] [2] [3] [4] [5]

Issues

Because computations in a concurrent system can interact with each other while being executed, the number of possible execution paths in the system can be extremely large, and the resulting outcome can be indeterminate. Concurrent use of shared resources can be a source of indeterminacy leading to issues such as deadlocks, and resource starvation. [7]

Design of concurrent systems often entails finding reliable techniques for coordinating their execution, data exchange, memory allocation, and execution scheduling to minimize response time and maximise throughput. [8]

Theory

Concurrency theory has been an active field of research in theoretical computer science. One of the first proposals was Carl Adam Petri's seminal work on Petri nets in the early 1960s. In the years since, a wide variety of formalisms have been developed for modeling and reasoning about concurrency.

Models

A number of formalisms for modeling and understanding concurrent systems have been developed, including: [9]

Some of these models of concurrency are primarily intended to support reasoning and specification, while others can be used through the entire development cycle, including design, implementation, proof, testing and simulation of concurrent systems. Some of these are based on message passing, while others have different mechanisms for concurrency.

The proliferation of different models of concurrency has motivated some researchers to develop ways to unify these different theoretical models. For example, Lee and Sangiovanni-Vincentelli have demonstrated that a so-called "tagged-signal" model can be used to provide a common framework for defining the denotational semantics of a variety of different models of concurrency, [11] while Nielsen, Sassone, and Winskel have demonstrated that category theory can be used to provide a similar unified understanding of different models. [12]

The Concurrency Representation Theorem in the actor model provides a fairly general way to represent concurrent systems that are closed in the sense that they do not receive communications from outside. (Other concurrency systems, e.g., process calculi can be modeled in the actor model using a two-phase commit protocol. [13] ) The mathematical denotation denoted by a closed system S is constructed increasingly better approximations from an initial behavior called S using a behavior approximating function progressionS to construct a denotation (meaning ) for S as follows: [14]

DenoteS ≡ ⊔i∈ωprogressionSi(⊥S)

In this way, S can be mathematically characterized in terms of all its possible behaviors.

Logics

Various types of temporal logic [15] can be used to help reason about concurrent systems. Some of these logics, such as linear temporal logic and computation tree logic, allow assertions to be made about the sequences of states that a concurrent system can pass through. Others, such as action computational tree logic, Hennessy–Milner logic, and Lamport's temporal logic of actions, build their assertions from sequences of actions (changes in state). The principal application of these logics is in writing specifications for concurrent systems. [7]

Practice

Concurrent programming encompasses programming languages and algorithms used to implement concurrent systems. Concurrent programming is usually considered[ by whom? ] to be more general than parallel programming because it can involve arbitrary and dynamic patterns of communication and interaction, whereas parallel systems generally[ according to whom? ] have a predefined and well-structured communications pattern. The base goals of concurrent programming include correctness, performance and robustness. Concurrent systems such as Operating systems and Database management systems are generally designed[ by whom? ] to operate indefinitely, including automatic recovery from failure, and not terminate unexpectedly (see Concurrency control). Some[ example needed ] concurrent systems implement a form of transparent concurrency, in which concurrent computational entities may compete for and share a single resource, but the complexities of this competition and sharing are shielded from the programmer.

Because they use shared resources, concurrent systems in general[ according to whom? ] require the inclusion of some[ example needed ] kind of arbiter somewhere in their implementation (often in the underlying hardware), to control access to those resources. The use of arbiters introduces the possibility of indeterminacy in concurrent computation which has major implications for practice including correctness and performance. For example, arbitration introduces unbounded nondeterminism which raises issues with model checking because it causes explosion in the state space and can even cause models to have an infinite number of states.

Some concurrent programming models include coprocesses and deterministic concurrency. In these models, threads of control explicitly yield their timeslices, either to the system or to another process.

See also

Related Research Articles

<span class="mw-page-title-main">Process (computing)</span> Particular execution of a computer program

In computing, a process is the instance of a computer program that is being executed by one or many threads. There are many different process models, some of which are light weight, but almost all processes are rooted in an operating system (OS) process which comprises the program code, assigned system resources, physical and logical access permissions, and data structures to initiate, control and coordinate execution activity. Depending on the OS, a process may be made up of multiple threads of execution that execute instructions concurrently.

<span class="mw-page-title-main">Thread (computing)</span> Smallest sequence of programmed instructions that can be managed independently by a scheduler

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.

<span class="mw-page-title-main">Superscalar processor</span> CPU that implements instruction-level parallelism within a single processor

A superscalar processor is a CPU that implements a form of parallelism called instruction-level parallelism within a single processor. In contrast to a scalar processor, which can execute at most one single instruction per clock cycle, a superscalar processor can execute or start executing more than one instruction during a clock cycle by simultaneously dispatching multiple instructions to different execution units on the processor. It therefore allows more throughput than would otherwise be possible at a given clock rate. Each execution unit is not a separate processor, but an execution resource within a single CPU such as an arithmetic logic unit.

In computer science, denotational semantics is an approach of formalizing the meanings of programming languages by constructing mathematical objects that describe the meanings of expressions from the languages. Other approaches providing formal semantics of programming languages include axiomatic semantics and operational semantics.

<span class="mw-page-title-main">Parallel computing</span> Programming paradigm in which many processes are executed simultaneously

Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

Computer science is the study of the theoretical foundations of information and computation and their implementation and application in computer systems. One well known subject classification system for computer science is the ACM Computing Classification System devised by the Association for Computing Machinery.

In programming language theory, semantics is the rigorous mathematical study of the meaning of programming languages. Semantics assigns computational meaning to valid strings in a programming language syntax. It is closely related to, and often crosses over with, the semantics of mathematical proofs.

In computer science, a parallel random-access machine is a shared-memory abstract machine. As its name indicates, the PRAM is intended as the parallel-computing analogy to the random-access machine (RAM). In the same way that the RAM is used by sequential-algorithm designers to model algorithmic performance, the PRAM is used by parallel-algorithm designers to model parallel algorithmic performance. Similar to the way in which the RAM model neglects practical issues, such as access time to cache memory versus main memory, the PRAM model neglects such issues as synchronization and communication, but provides any (problem-size-dependent) number of processors. Algorithm cost, for instance, is estimated using two parameters O(time) and O(time × processor_number).

The actor model in computer science is a mathematical model of concurrent computation that treats an actor as the basic building block of concurrent computation. In response to a message it receives, an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify their own private state, but can only affect each other indirectly through messaging.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a programming language, as an extension to an existing languages.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science, the Actor model, first published in 1973, is a mathematical model of concurrent computation.

In computer science, unbounded nondeterminism or unbounded indeterminacy is a property of concurrency by which the delay in servicing a request can become unbounded as a result of arbitration of contention for shared resources while still guaranteeing that the request will eventually be serviced. Unbounded nondeterminism became an important issue in the development of the denotational semantics of concurrency, and later became part of research into the theoretical concept of hypercomputation.

Indeterminacy in concurrent computation is concerned with the effects of indeterminacy in concurrent computation. Computation is an area in which indeterminacy is becoming increasingly important because of the massive increase in concurrency due to networking and the advent of many-core computer architectures. These computer systems make use of arbiters which gives rise to indeterminacy.

<span class="mw-page-title-main">Data parallelism</span> Parallelization across multiple processors in parallel computing environments

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures like arrays and matrices by working on each element in parallel. It contrasts to task parallelism as another form of parallelism.

Task parallelism is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors. In contrast to data parallelism which involves running the same task on different components of data, task parallelism is distinguished by running many different tasks at the same time on the same data. A common type of task parallelism is pipelining, which consists of moving a single set of data through a series of separate tasks where each task can execute independently of the others.

<span class="mw-page-title-main">Parallel Extensions</span>

Parallel Extensions was the development name for a managed concurrency library developed by a collaboration between Microsoft Research and the CLR team at Microsoft. The library was released in version 4.0 of the .NET Framework. It is composed of two parts: Parallel LINQ (PLINQ) and Task Parallel Library (TPL). It also consists of a set of coordination data structures (CDS) – sets of data structures used to synchronize and co-ordinate the execution of concurrent tasks.

Explicit Multi-Threading (XMT) is a computer science paradigm for building and programming parallel computers designed around the parallel random-access machine (PRAM) parallel computational model. A more direct explanation of XMT starts with the rudimentary abstraction that made serial computing simple: that any single instruction available for execution in a serial program executes immediately. A consequence of this abstraction is a step-by-step (inductive) explication of the instruction available next for execution. The rudimentary parallel abstraction behind XMT, dubbed Immediate Concurrent Execution (ICE) in Vishkin (2011), is that indefinitely many instructions available for concurrent execution execute immediately. A consequence of ICE is a step-by-step (inductive) explication of the instructions available next for concurrent execution. Moving beyond the serial von Neumann computer, the aspiration of XMT is that computer science will again be able to augment mathematical induction with a simple one-line computing abstraction.

Join-patterns provides a way to write concurrent, parallel and distributed computer programs by message passing. Compared to the use of threads and locks, this is a high level programming model using communication constructs model to abstract the complexity of concurrent environment and to allow scalability. Its focus is on the execution of a chord between messages atomically consumed from a group of channels.

References

  1. 1 2 Operating System Concepts. Wiley. 29 July 2008. ISBN   978-0470128725.
  2. 1 2 Computer Organization and Design: The Hardware/Software Interface. The Morgan Kaufmann Series in Computer Architecture and Design. Morgan Kaufmann. 2012. ISBN   978-0123747501.
  3. 1 2 Distributed Systems: Concepts and Design. Pearson. 2012. ISBN   978-0132143011.
  4. 1 2 Quinn, Michael Jay (1994). Parallel Computing: Theory and Practice. McGraw-Hill. ISBN   978-0070512948.
  5. 1 2 Parallel and Distributed Computing Handbook. ISBN   978-0070730205.
  6. Parallel and Concurrent Programming in Haskell. O'Reilly Media. 2013. ISBN   9781449335922.
  7. 1 2 Cleaveland, Rance; Scott Smolka (December 1996). "Strategic Directions in Concurrency Research". ACM Computing Surveys. 28 (4): 607. doi: 10.1145/242223.242252 . S2CID   13264261.
  8. Campbell, Colin; Johnson, Ralph; Miller, Ade; Toub, Stephen (August 2010). Parallel Programming with Microsoft .NET. Microsoft Press. ISBN   978-0-7356-5159-3.
  9. Filman, Robert; Daniel Friedman (1984). Coordinated Computing - Tools and Techniques for Distributed Software . McGraw-Hill. ISBN   978-0-07-022439-1.
  10. Keller, Jörg; Christoph Keßler; Jesper Träff (2001). Practical PRAM Programming. John Wiley and Sons.
  11. Lee, Edward; Alberto Sangiovanni-Vincentelli (December 1998). "A Framework for Comparing Models of Computation" (PDF). IEEE Transactions on CAD . 17 (12): 1217–1229. doi:10.1109/43.736561.
  12. Mogens Nielsen; Vladimiro Sassone; Glynn Winskel (1993). "Relationships Between Models of Concurrency". REX School/Symposium.
  13. Frederick Knabe. A Distributed Protocol for Channel-Based Communication with Choice PARLE 1992.
  14. William Clinger (June 1981). "Foundations of Actor Semantics". Mathematics Doctoral Dissertation. MIT. hdl:1721.1/6935.{{cite journal}}: Cite journal requires |journal= (help)
  15. Roscoe, Colin (2001). Modal and Temporal Properties of Processes. Springer. ISBN   978-0-387-98717-0.

Further reading