Reo Coordination Language

Last updated
Reo circuit: Alternator Alternator circuit.png
Reo circuit: Alternator

Reo [1] [2] is a domain-specific language for programming and analyzing coordination protocols that compose individual processes into full systems, broadly construed. Examples of classes of systems that can be composed with Reo include component-based systems, service-oriented systems, multithreading systems, biological systems, and cryptographic protocols. Reo has a graphical syntax in which every Reo program, called a connector or circuit, is a labeled directed hypergraph. Such a graph represents the data-flow among the processes in the system. Reo has formal semantics, which stand at the basis of its various formal verification techniques and compilation tools.

Contents

Definitions

In Reo, a concurrent system consists of a set of components which are glued together by a circuit that enables flow of data between components. Components can perform I/O operations on the boundary nodes of the circuit to which they are connected. There are two kinds of I/O operations: put-requests dispatch data items to a node, and get-requests fetch data items from a node. All I/O operations are blocking, which means that a component can proceed only after its pending I/O operation has been successfully processed.

The figure on the top-right shows an example of a producers-consumer system with three components: two producers on the left and one consumer on the right. The circuit in the middle defines the protocol, which states that the producers should send data synchronously, while the consumer receives those data in alternating order.

Formally, the structure of a circuit is defined as follows:

Definition 1. A circuit is a triple where:

  1. N is a set of nodes;
  2. is a set of boundary nodes;
  3. is a set of channels;
  4. assigns a types to every channel.

such that , for all . If is a channel, then I is called the set of input nodes of c and O is called the set of output nodes of c.

The dynamics of a circuit resemble the flow of signals through an electronic circuit.

Nodes have fixed merger-replicator behavior: the data of one of the incoming channels is propagated to all outgoing channels, without storing or altering the data (i.e., replicator behavior). If multiple incoming channels can provide data, the node makes a nondeterministic choice among them (i.e., merger behavior). Nodes with only incoming or outgoing channels are called sink nodes or source nodes, respectively; nodes with both incoming and outgoing channels are called mixed nodes.

In contrast to nodes, channels have user-defined behavior represented by their type. This means that channels may store or alter data items that flow through them. Although every channel connects exactly two nodes, these nodes need not to be input and output. For instance, the vertical channel in the figure on the top-right has two inputs and no outputs. The channel type defines the behavior of the channel with respect to data. Below is a list of common types:

Software engineering properties

Exogeneity

One way to classify coordination languages is in terms of their locus: locus of coordination refers to where coordination activity takes place, classifying coordination models and languages as endogenous or exogenous. [3] Endogenous models and languages, such as Linda, provide primitives that must be incorporated within a computation for its coordination. In applications that use such models, primitives that affect the coordination of each module are inside the module itself. In contrast, Reo is an exogenous language that provides primitives that support coordination of entities from without. In applications that use exogenous models, primitives that affect the coordination of each module are outside the module itself.

Endogenous models are sometimes more natural for a given application. However, they generally lead to an intermixing of coordination primitives with computation code, which entangles the semantics of computation with coordination protocols. This intermixing tends to scatter communication/coordination primitives throughout the source code, making the cooperation model and the coordination protocol of an application nebulous and implicit: generally, there is no piece of source code identifiable as the cooperation model or the coordination protocol of an application, that can be designed, developed, debugged, maintained, and reused, in isolation from the rest of the application code. On the other hand, exogenous models encourage development of coordination modules separately and independently of the computation modules they are supposed to coordinate. Consequently, the result of the substantial effort invested in the design and development of the coordination component of an application can manifest itself as tangible "pure coordinator modules" which are easier to understand, and can also be reused in other applications.

Compositionality / reusability

Reo circuits are compositional. This means that one can construct complex circuits by reusing simpler circuits. To be more explicit, two circuits can be glued together on their boundary nodes to form a new joint circuit. Unlike many other models of concurrency (e.g., pi-calculus), synchrony is preserved under composition. This means that if we compose a circuit with synchronous flow between nodes A and B with another circuit with synchronous flow between nodes B and C, the joint circuit has synchronous flow between nodes A and C. In other words, the composition of two synchronous circuits yields a synchronous circuit.

Semantics

The semantics of a Reo circuit is a formal description of its behavior. Various semantics for Reo exist. [4]

Historically the first semantics of Reo was based on the coalgebraic notion of streams (i.e., infinite sequences). [5] This semantics is based on the concept of a timed data stream, which is a pair consisting of a stream of data items and a stream of monotonically increasing time stamps (real numbers). By associating every node with such a timed data stream, the behavior of a channel can be modeled as a relation on the streams on the connected nodes.

Later, an automaton-based semantics was developed, which is called constraint automata. [6] A constraint automaton is a labeled transition system, where transition labels consist of a synchronization constraint and a data constraint. A synchronization constraint specifies which nodes synchronize in the execution step modeled by the transition, and a data constraint specifies which data items flow on these nodes.

One limitation of constraint automata (and timed data streams) is that they cannot directly model context-sensitive behavior, where the behavior of a channel depends on the (un)availability of a pending I/O operation. For example, using constraint automata, it is impossible to directly model the behavior of a LossySync, which should lose data only if its output node has no pending get-request. To solve this problem, another semantics of Reo has been developed, called connector coloring. [7]

Other semantics for Reo make it possible to model timed [8] or probabilistic [9] behavior.

Implementations

The Extensible Coordination Tools (ECT) are a set of plug-ins for Eclipse that constitute an integrated development environment (IDE) for Reo. The ECT consists of a graphical editor for drawing circuits and an animation engine for animating data-flow through circuits. For code generation, the ECT contains a Reo-to-Java compiler, which generates code for circuits based on their constraint automaton semantics. In particular, on input of a Reo circuit, it produces a Java class, which simulates the constraint automaton that models the circuit. For verification, the ECT contains a tool that translates Reo circuits to process definitions in mCRL2. Users can subsequently use mCRL2 for model checking against mu-calculus property specifications. (Alternatively, the Vereofy model checker also supports verification of Reo circuits.)

Another implementation of Reo is developed in the Scala programming language and executes circuits in a distributed fashion. [10]

Related Research Articles

Finite-state machine Mathematical model of computation

A finite-state machine (FSM) or finite-state automaton, finite automaton, or simply a state machine, is a mathematical model of computation. It is an abstract machine that can be in exactly one of a finite number of states at any given time. The FSM can change from one state to another in response to some inputs; the change from one state to another is called a transition. An FSM is defined by a list of its states, its initial state, and the inputs that trigger each transition. Finite-state machines are of two types—deterministic finite-state machines and non-deterministic finite-state machines. A deterministic finite-state machine can be constructed equivalent to any non-deterministic one.

In computer science, communicating sequential processes (CSP) is a formal language for describing patterns of interaction in concurrent systems. It is a member of the family of mathematical theories of concurrency known as process algebras, or process calculi, based on message passing via channels. CSP was highly influential in the design of the occam programming language and also influenced the design of programming languages such as Limbo, RaftLib, Go, Crystal, and Clojure's core.async.

In theoretical computer science, the π-calculus is a process calculus. The π-calculus allows channel names to be communicated along the channels themselves, and in this way it is able to describe concurrent computations whose network configuration may change during the computation.

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random bits. Formally, the algorithm's performance will be a random variable determined by the random bits; thus either the running time, or the output are random variables.

Secure multi-party computation is a subfield of cryptography with the goal of creating methods for parties to jointly compute a function over their inputs while keeping those inputs private. Unlike traditional cryptographic tasks, where cryptography assures security and integrity of communication or storage and the adversary is outside the system of participants, the cryptography in this model protects participants' privacy from each other.

In computer science, the process calculi are a diverse family of related approaches for formally modelling concurrent systems. Process calculi provide a tool for the high-level description of interactions, communications, and synchronizations between a collection of independent agents or processes. They also provide algebraic laws that allow process descriptions to be manipulated and analyzed, and permit formal reasoning about equivalences between processes. Leading examples of process calculi include CSP, CCS, ACP, and LOTOS. More recent additions to the family include the π-calculus, the ambient calculus, PEPA, the fusion calculus and the join-calculus.

Concurrency (computer science) Ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome

In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. This allows for parallel execution of the concurrent units, which can significantly improve overall speed of the execution in multi-processor and multi-core systems. In more technical terms, concurrency refers to the decomposability property of a program, algorithm, or problem into order-independent or partially-ordered components or units.

In computer science, Linda is a model of coordination and communication among several parallel processes operating upon objects stored in and retrieved from shared, virtual, associative memory. It was developed by Sudhir Ahuja at AT&T Bell Laboratories in collaboration with David Gelernter and Nicholas Carriero at Yale University in 1986.

In digital electronics, an asynchronous circuit, or self-timed circuit, is a sequential digital logic circuit which is not governed by a clock circuit or global clock signal. Instead it often uses signals that indicate completion of instructions and operations, specified by simple data transfer protocols. This type of circuit is contrasted with synchronous circuits, in which changes to the signal values in the circuit are triggered by repetitive pulses called a clock signal. Most digital devices today use synchronous circuits. However asynchronous circuits have the potential to be faster, and may also have advantages in lower power consumption, lower electromagnetic interference, and better modularity in large systems. Asynchronous circuits are an active area of research in digital logic design.

The actor model in computer science is a mathematical model of concurrent computation that treats actor as the universal primitive of concurrent computation. In response to a message it receives, an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify their own private state, but can only affect each other indirectly through messaging.

A synchronous programming language is a computer programming language optimized for programming reactive systems. Computer systems can be sorted in three main classes: (1) transformational systems that take some inputs, process them, deliver their outputs, and terminate their execution; a typical example is a compiler; (2) interactive systems that interact continuously with their environment, at their own speed; a typical example is the web; and (3) reactive systems that interact continuously with their environment, at a speed imposed by the environment; a typical example is the automatic flight control system of modern airplanes. Reactive systems must therefore react to stimuli from the environment within strict time bounds. For this reason they are often also called real-time systems, and are found often in embedded systems.

Kahn process networks

Kahn process networks is a distributed model of computation where a group of deterministic sequential processes are communicating through unbounded FIFO channels. The resulting process network exhibits deterministic behavior that does not depend on the various computation or communication delays. The model was originally developed for modeling distributed systems but has proven its convenience for modeling signal processing systems. As such, KPNs have found many applications in modeling embedded systems, high-performance computing systems, and other computational tasks. KPNs were first introduced by Gilles Kahn.

The primary focus of this article is asynchronous control in digital electronic systems. In a synchronous system, operations are coordinated by one, or more, centralized clock signals. An asynchronous system, in contrast, has no global clock. Asynchronous systems do not depend on strict arrival times of signals or messages for reliable operation. Coordination is achieved using event-driven architecture triggered by network packet arrival, changes (transitions) of signals, handshake protocols, and other methods.

A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.

Boolean circuit model of computation

In computational complexity theory and circuit complexity, a Boolean circuit is a mathematical model for combinational digital logic circuits. A formal language can be decided by a family of Boolean circuits, one circuit for each possible input length. Boolean circuits are also used as a formal model for combinational logic in digital electronics.

Reactive programming was first developed by Glenn Wadden in 1986 as a programming language (VTScript) in the Supervisory Control and Data Acquisition (SCADA) industry.

A signal-flow graph or signal-flowgraph (SFG), invented by Claude Shannon, but often called a Mason graph after Samuel Jefferson Mason who coined the term, is a specialized flow graph, a directed graph in which nodes represent system variables, and branches represent functional connections between pairs of nodes. Thus, signal-flow graph theory builds on that of directed graphs, which includes as well that of oriented graphs. This mathematical theory of digraphs exists, of course, quite apart from its applications.

In computer science, the ambient calculus is a process calculus devised by Luca Cardelli and Andrew D. Gordon in 1998, and used to describe and theorise about concurrent systems that include mobility. Here mobility means both computation carried out on mobile devices, and mobile computation. The ambient calculus provides a unified framework for modeling both kinds of mobility. It is used to model interactions in such concurrent systems as the Internet.

In computer science, Constraint automata are a formalism to describe the behavior and possible data flow in coordination models. It was introduced by Arbab et al. as a variation of finite automata for model checking Reo connector circuits.

MLDesigner is an integrated modeling and simulation tool for the design and analysis of complex embedded and networked systems. MLDesigner speeds up modeling, simulation and analysis of discrete event, discrete time and continuous time systems concerning architecture, function and performance. The tools is based on ideas of the "Ptolemy Project", done at the University if California Berkeley. MLDesigner is developed by MLDesign Technologies Inc. Palo Alto, CA, USA in collaboration with Mission Level Design GmbH, Ilmenau, Germany.

References

  1. Farhad Arbab: Reo: a channel-based coordination model for component composition. Mathematical Structures in Computer Science 14(3):329--366, 2004.
  2. Farhad Arbab: Puff, The Magic Protocol. In Gul Agha, Olivier Danvy, Jose Meseguer, editors, Talcott Festschrift, volume 7000 of LNCS, pages 169-206. Springer, 2011.
  3. Farhad Arbab: Composition of Interacting Computations. In Dina Goldin, Scott Smolka, and Peter Wegner, editors, Interactive Computation, pages 277-321. Springer, 2006.
  4. Sung-Shik Jongmans and Farhad Arbab: Overview of Thirty Semantic Formalisms for Reo. Scientific Annals of Computer Science 22(1):201-251, 2012.
  5. Farhad Arbab and Jan Rutten: A Coinductive Calculus of Component Connectors. In Martin Wirsing, Dirk Pattinson, and Rolf Hennicker, editors, Proceedings of WADT 2002, volume 2755 of LNCS, pages 34--55. Springer, 2003.
  6. Christel Baier, Marjan Sirjani, Farhad Arbab, and Jan Rutten: Modeling component connectors in Reo by constraint automata. Science of Computer Programming 61(2):75-113, 2006.
  7. Dave Clarke and David Costa and Farhad Arbab: Connector colouring I: Synchronisation and context dependency. Science of Computer Programming 66(3):205-225, 2007.
  8. Farhad Arbab, Christel Baier, Frank de Boer, and Jan Rutten: Models and temporal logical specifications for timed component connectors. Software and Systems Modeling 6(1):59-82, 2007.
  9. Christel Baier: Probabilistic Models for Reo Connector Circuits. Journal of Universal Computer Science 11(10):1718-1748, 2005.
  10. José Proença: Synchronous Coordination of Distributed Components. PhD thesis, Leiden University, 2011.