Message passing

Last updated

In computer science, message passing is a technique for invoking behavior (i.e., running a program) on a computer. The invoking program sends a message to a process (which may be an actor or object) and relies on that process and its supporting infrastructure to then select and run some appropriate code. Message passing differs from conventional programming where a process, subroutine, or function is directly invoked by name. Message passing is key to some models of concurrency and object-oriented programming.

Contents

Message passing is ubiquitous in modern computer software.[ citation needed ] It is used as a way for the objects that make up a program to work with each other and as a means for objects and systems running on different computers (e.g., the Internet) to interact. Message passing may be implemented by various mechanisms, including channels.

Overview

Message passing is a technique for invoking behavior (i.e., running a program) on a computer. In contrast to the traditional technique of calling a program by name, message passing uses an object model to distinguish the general function from the specific implementations. The invoking program sends a message and relies on the object to select and execute the appropriate code. The justifications for using an intermediate layer essentially falls into two categories: encapsulation and distribution.

Encapsulation is the idea that software objects should be able to invoke services on other objects without knowing or caring about how those services are implemented. Encapsulation can reduce the amount of coding logic and make systems more maintainable. E.g., rather than having IF-THEN statements that determine which subroutine or function to call a developer can just send a message to the object and the object will select the appropriate code based on its type.

One of the first examples of how this can be used was in the domain of computer graphics. There are various complexities involved in manipulating graphic objects. For example, simply using the right formula to compute the area of an enclosed shape will vary depending on if the shape is a triangle, rectangle, ellipse, or circle. In traditional computer programming this would result in long IF-THEN statements testing what sort of object the shape was and calling the appropriate code. The object-oriented way to handle this is to define a class called Shape with subclasses such as Rectangle and Ellipse (which in turn have subclasses Square and Circle) and then to simply send a message to any Shape asking it to compute its area. Each Shape object will then invoke the subclass's method with the formula appropriate for that kind of object. [1]

Distributed message passing provides developers with a layer of the architecture that provides common services to build systems made up of sub-systems that run on disparate computers in different locations and at different times. When a distributed object is sending a message, the messaging layer can take care of issues such as:

Synchronous versus asynchronous message passing

Synchronous message passing

Synchronous message passing occurs between objects that are running at the same time. It is used by object-oriented programming languages such as Java and Smalltalk.

Synchronous messaging is analogous to a synchronous function call; just as the function caller waits until the function completes, the sending process waits until the receiving process completes. This can make synchronous communication unworkable for some applications. For example, large, distributed systems may not perform well enough to be usable. Such large, distributed systems may need to operate while some of their subsystems are down for maintenance, etc.

Imagine a busy business office having 100 desktop computers that send emails to each other using synchronous message passing exclusively. One worker turning off their computer can cause the other 99 computers to freeze until the worker turns their computer back on to process a single email.

Asynchronous message passing

With asynchronous message passing the receiving object can be down or busy when the requesting object sends the message. Continuing the function call analogy, it is like a function call that returns immediately, without waiting for the called function to complete. Messages are sent to a queue where they are stored until the receiving process requests them. The receiving process processes its messages and sends results to a queue for pickup by the original process (or some designated next process). [3]

Asynchronous messaging requires additional capabilities for storing and retransmitting data for systems that may not run concurrently, and are generally handled by an intermediary level of software (often called middleware); a common type being Message-oriented middleware (MOM).

The buffer required in asynchronous communication can cause problems when it is full. A decision has to be made whether to block the sender or whether to discard future messages. A blocked sender may lead to deadlock. If messages are dropped, communication is no longer reliable.

Hybrids

Synchronous communication can be built on top of asynchronous communication by using a Synchronizer. For example, the α-Synchronizer works by ensuring that the sender always waits for an acknowledgement message from the receiver. The sender only sends the next message after the acknowledgement has been received. On the other hand, asynchronous communication can also be built on top of synchronous communication. For example, modern microkernels generally only provide a synchronous messaging primitive[ citation needed ] and asynchronous messaging can be implemented on top by using helper threads.

Distributed objects

Message-passing systems use either distributed or local objects. With distributed objects the sender and receiver may be on different computers, running different operating systems, using different programming languages, etc. In this case the bus layer takes care of details about converting data from one system to another, sending and receiving data across the network, etc. The Remote Procedure Call (RPC) protocol in Unix was an early example of this. Note that with this type of message passing it is not a requirement that sender nor receiver use object-oriented programming. Procedural language systems can be wrapped and treated as large grained objects capable of sending and receiving messages. [4]

Examples of systems that support distributed objects are: Emerald, ONC RPC, CORBA, Java RMI, DCOM, SOAP, .NET Remoting, CTOS, QNX Neutrino RTOS, OpenBinder and D-Bus. Distributed object systems have been called "shared nothing" systems because the message passing abstraction hides underlying state changes that may be used in the implementation of sending messages.


Distributed, or asynchronous, message-passing has additional overhead compared to calling a procedure. In message-passing, arguments must be copied to the new message. Some arguments can contain megabytes of data, all of which must be copied and transmitted to the receiving object.

Traditional procedure calls differ from message-passing in terms of memory usage, transfer time and locality. Arguments are passed to the receiver typically by general purpose registers requiring no additional storage nor transfer time, or in a parameter list containing the arguments' addresses (a few bits). Address-passing is not possible for distributed systems since the systems use separate address spaces.

Web browsers and web servers are examples of processes that communicate by message-passing. A URL is an example of referencing a resource without exposing process internals.

A subroutine call or method invocation will not exit until the invoked computation has terminated. Asynchronous message-passing, by contrast, can result in a response arriving a significant time after the request message was sent.

A message-handler will, in general, process messages from more than one sender. This means its state can change for reasons unrelated to the behavior of a single sender or client process. This is in contrast to the typical behavior of an object upon which methods are being invoked: the latter is expected to remain in the same state between method invocations. In other words, the message-handler behaves analogously to a volatile object.

Mathematical models

The prominent mathematical models of message passing are the Actor model and Pi calculus. [5] [6] In mathematical terms a message is the single means to pass control to an object. If the object responds to the message, it has a method for that message.

Alan Kay has argued that message passing is more important than objects in OOP, and that objects themselves are often over-emphasized. The live distributed objects programming model builds upon this observation; it uses the concept of a distributed data flow to characterize the behavior of a complex distributed system in terms of message patterns, using high-level, functional-style specifications. [7]

Examples

See also

Related Research Articles

Microkernel Kernel that provides fewer services than a traditional kernel

In computer science, a microkernel is the near-minimum amount of software that can provide the mechanisms needed to implement an operating system (OS). These mechanisms include low-level address space management, thread management, and inter-process communication (IPC).

In distributed computing, a remote procedure call (RPC) is when a computer program causes a procedure (subroutine) to execute in a different address space, which is coded as if it were a normal (local) procedure call, without the programmer explicitly coding the details for the remote interaction. That is, the programmer writes essentially the same code whether the subroutine is local to the executing program, or remote. This is a form of client–server interaction, typically implemented via a request–response message-passing system. In the object-oriented programming paradigm, RPCs are represented by remote method invocation (RMI). The RPC model implies a level of location transparency, namely that calling procedures are largely the same whether they are local or remote, but usually they are not identical, so local calls can be distinguished from remote calls. Remote calls are usually orders of magnitude slower and less reliable than local calls, so distinguishing them is important.

The Common Object Request Broker Architecture (CORBA) is a standard defined by the Object Management Group (OMG) designed to facilitate the communication of systems that are deployed on diverse platforms. CORBA enables collaboration between systems on different operating systems, programming languages, and computing hardware. CORBA uses an object-oriented model although the systems that use the CORBA do not have to be object-oriented. CORBA is an example of the distributed object paradigm.

Inter-process communication How computer operating systems enable data sharing

In computer science, inter-process communication or interprocess communication (IPC) refers specifically to the mechanisms an operating system provides to allow the processes to manage shared data. Typically, applications can use IPC, categorized as clients and servers, where the client requests data and the server responds to client requests. Many applications are both clients and servers, as commonly seen in distributed computing.

In computer science, an object can be a variable, a data structure, a function, or a method. As regions of memory, they contain value and are referenced by identifiers.

In computer science, message queues and mailboxes are software-engineering components typically used for inter-process communication (IPC), or for inter-thread communication within the same process. They use a queue for messaging – the passing of control or of content. Group communication systems provide similar kinds of functionality.

Message-oriented middleware (MOM) is software or hardware infrastructure supporting sending and receiving messages between distributed systems. MOM allows application modules to be distributed over heterogeneous platforms and reduces the complexity of developing applications that span multiple operating systems and network protocols. The middleware creates a distributed communications layer that insulates the application developer from the details of the various operating systems and network interfaces. APIs that extend across diverse platforms and networks are typically provided by MOM.

The actor model in computer science is a mathematical model of concurrent computation that treats actor as the universal primitive of concurrent computation. In response to a message it receives, an actor can: make local decisions, create more actors, send more messages, and determine how to respond to the next message received. Actors may modify their own private state, but can only affect each other indirectly through messaging.

In computer science, the Actor model and process calculi are two closely related approaches to the modelling of concurrent digital computation. See Actor model and process calculi history.

In computer science, future, promise, delay, and deferred refer to constructs used for synchronizing program execution in some concurrent programming languages. They describe an object that acts as a proxy for a result that is initially unknown, usually because the computation of its value is not yet complete.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a sequential language, as an extension to an existing language, or as an entirely new language.

Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.

In computer science, the Actor model, first published in 1973, is a mathematical model of concurrent computation.

The actor model and process calculi share an interesting history and co-evolution.

A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.

An Active message is a messaging object capable of performing processing on its own. It is a lightweight messaging protocol used to optimize network communications with an emphasis on reducing latency by removing software overheads associated with buffering and providing applications with direct user-level access to the network hardware. This contrasts with traditional computer-based messaging systems in which messages are passive entities with no processing power.

In computing, a channel is a model for interprocess communication and synchronization via message passing. A message may be sent over a channel, and another process or thread is able to receive messages sent over a channel it has a reference to, as a stream. Different implementations of channels may be buffered or not, and either synchronous or asynchronous.

TNSDL stands for TeleNokia Specification and Description Language. TNSDL is based on the ITU-T SDL-88 language. It is used exclusively at Nokia Networks, primarily for developing applications for telephone exchanges.

Synchronous Interprocess Messaging Project for LINUX (SIMPL) is a free and open-source project that allows QNX-style synchronous message passing by adding a Linux library using user space techniques like shared memory and Unix pipes to implement SendMssg/ReceiveMssg/ReplyMssg inter-process messaging mechanisms.

Join-patterns provides a way to write concurrent, parallel and distributed computer programs by message passing. Compared to the use of threads and locks, this is a high level programming model using communication constructs model to abstract the complexity of concurrent environment and to allow scalability. Its focus is on the execution of a chord between messages atomically consumed from a group of channels.

References

  1. Goldberg, Adele; David Robson (1989). Smalltalk-80 The Language. Addison Wesley. pp. 5–16. ISBN   0-201-13688-0.
  2. Orfali, Robert (1996). The Essential Client/Server Survival Guide. New York: Wiley Computer Publishing. pp.  1–22. ISBN   0-471-15325-7.
  3. Orfali, Robert (1996). The Essential Client/Server Survival Guide. New York: Wiley Computer Publishing. pp.  95–133. ISBN   0-471-15325-7.
  4. Orfali, Robert (1996). The Essential Client/Server Survival Guide. New York: Wiley Computer Publishing. pp.  375–397. ISBN   0-471-15325-7.
  5. Milner, Robin (Jan 1993). "Elements of interaction: Turing award lecture". Communications of the ACM. 36 (1): 78–89. doi: 10.1145/151233.151240 .
  6. Carl Hewitt; Peter Bishop; Richard Steiger (1973). "A Universal Modular Actor Formalism for Artificial Intelligence". IJCAI.{{cite journal}}: Cite journal requires |journal= (help)
  7. Kay, Alan. "prototypes vs classes was: Re: Sun's HotSpot". lists.squeakfoundation.org. Retrieved 2 January 2014.
  8. "Using Message Passing to Transfer Data Between Threads - The Rust Programming Language". doc.rust-lang.org.

Further reading