Reverse computation

Reverse computation is a software application of the concept of reversible computing.

Because it offers a possible solution to the heat problem faced by chip manufacturers, reversible computing has been studied extensively in the area of computer architecture. The promise of reversible computing is that heat dissipation in reversible architectures would remain minimal even for very large numbers of transistors. [1] [2] Rather than creating entropy (and thus heat) through destructive operations, a reversible architecture conserves energy by performing other operations that preserve the system state. [3] [4]

The concept of reverse computation is somewhat simpler than reversible computing in that reverse computation is only required to restore the equivalent state of a software application, rather than support the reversibility of the set of all possible instructions. Reversible computing concepts have been successfully applied as reverse computation in software application areas such as database design, [5] checkpointing and debugging, [6] and code differentiation. [7] [8]

Reverse Computation for Parallel Discrete Event Simulation

List of operations that are reverse computable and their costs.

Based on the successful application of reverse computation concepts in other software domains, Chris Carothers, Kalyan Perumalla and Richard Fujimoto [9] suggested applying reverse computation to reduce state-saving overheads in parallel discrete event simulation (PDES). They defined an approach based on reverse event code (which can be automatically generated), and demonstrated performance advantages of this approach over traditional state saving for fine-grained applications (those with a small amount of computation per event). The key property that reverse computation exploits is that a majority of the operations that modify the state variables are "constructive" in nature: the undo operation for them requires no history, only the most current values of the variables. For example, the operators ++, --, +=, -=, *= and /= belong to this category, although *= and /= require special treatment for multiply or divide by zero and for overflow/underflow conditions. More complex operations, such as circular shift (swap being a special case) and certain classes of random number generation, also belong here.
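
As an illustration, the following C sketch (the types, field names, and handler signatures are hypothetical, not code from the paper or from any particular simulator) shows how a reverse event handler undoes a forward handler built only from constructive operations by applying the inverse operations in the opposite order, with no state saving:

    /* Minimal sketch of reverse computation for "constructive" operations.
       All names are illustrative. */
    typedef struct {
        long packets_seen; /* counter updated with ++ */
        long bytes_queued; /* accumulator updated with += */
    } lp_state;

    typedef struct {
        long packet_bytes; /* data carried by the event */
    } event_msg;

    /* Forward handler: only constructive updates, so no history is kept. */
    void forward_handler(lp_state *s, const event_msg *e)
    {
        s->packets_seen++;
        s->bytes_queued += e->packet_bytes;
    }

    /* Reverse handler: the inverse operations, applied in reverse order,
       restore the prior state using only the current values and the event. */
    void reverse_handler(lp_state *s, const event_msg *e)
    {
        s->bytes_queued -= e->packet_bytes;
        s->packets_seen--;
    }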

Operations of the form a = b, as well as modulo and bitwise computations that result in the loss of data, are termed destructive. Typically these operations can only be restored using conventional state-saving techniques. However, many destructive operations are a consequence of the arrival of data contained within the event being processed. For example, in the work of Yaun, Carothers et al. on large-scale TCP simulation, [10] the last-sent time records the time stamp of the last packet forwarded by a router logical process. Overwriting this value is made reversible by swapping the state variable with the corresponding field of the incoming event, as sketched below.
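
A minimal C sketch of this swap technique (the names are hypothetical; the actual TCP model's data structures are not shown here): the destructive assignment to the last-sent time is replaced by a swap with a field of the event, so the overwritten value is preserved in the event, and the identical swap undoes the operation on rollback.

    typedef struct {
        double last_sent; /* time stamp of the last packet forwarded */
    } router_state;

    typedef struct {
        double ts; /* on arrival: the packet's time stamp;
                      after processing: the overwritten last-sent time */
    } packet_event;

    /* Forward: a swap instead of the destructive "s->last_sent = e->ts".
       The old last-sent time now travels with the processed event. */
    void forward_packet(router_state *s, packet_event *e)
    {
        double tmp = s->last_sent;
        s->last_sent = e->ts;
        e->ts = tmp;
    }

    /* Reverse: the same swap restores both values exactly. */
    void reverse_packet(router_state *s, packet_event *e)
    {
        double tmp = s->last_sent;
        s->last_sent = e->ts;
        e->ts = tmp;
    }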

History of Reverse Computation as applied to Parallel Discrete Event Simulation

Taxonomy of digital simulation.

In 1985 Jefferson introduced the optimistic synchronization protocol used in parallel discrete event simulations, known as Time Warp. [11] To date, the technique of reverse computation has been applied in software only to optimistically synchronized parallel discrete event simulation.

In 1999, Michael Frank completed his doctorate at the Massachusetts Institute of Technology. His thesis focused on reverse computation at the hardware level, but included descriptions of both an instruction set architecture and a high-level programming language (R) for a processor based on reverse computation. [12] [notes 1]

In 1998 Carothers and Perumalla, as part of their graduate studies under Richard Fujimoto, published a paper at the Principles of Advanced and Distributed Simulation workshop [13] introducing the technique of reverse computation (RC) as an alternative rollback mechanism in optimistically synchronized parallel discrete event simulations (Time Warp). In 1998, Carothers became an associate professor at Rensselaer Polytechnic Institute. Working with graduate students David Bauer and Shawn Pearce, Carothers integrated the Georgia Tech Time Warp design into Rensselaer's Optimistic Simulation System (ROSS), which supported only reverse computation as its rollback mechanism. Carothers also constructed RC models for BitTorrent at General Electric, as well as models of numerous network protocols with students (BGP4, TCP Tahoe, Multicast), and created a course on parallel and distributed simulation in which students were required to construct RC models in ROSS.

Around the same time, Perumalla graduated from Georgia Tech and went to work at Oak Ridge National Laboratory (ORNL). There he built the uSik simulator, a PDES engine with a combined optimistic/conservative protocol. The system was capable of dynamically determining the best protocol for LPs and remapping them during execution in response to model dynamics. In 2007 Perumalla tested uSik on Blue Gene/L and found that, while scalability was limited to 8K processors for the pure Time Warp implementation, the conservative implementation scaled to all 16K available processors. Benchmarking was performed using PHOLD with a constrained remote event rate of 10%, where the timestamp of each event was drawn from an exponential distribution with a mean of 1.0, plus an additional lookahead of 1.0 (as sketched below). This was the first implementation of PDES on Blue Gene using reverse computation.
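
As a concrete reading of that benchmark configuration, the C sketch below computes a new event's time stamp as just described. The function names are illustrative, not the uSik or PHOLD source, and RNG seeding and the surrounding simulator loop are omitted.

    #include <math.h>
    #include <stdlib.h>

    /* Exp(mean) sample via inverse-transform sampling.
       drand48() returns a value in [0, 1), so u lies in (0, 1]
       and log(u) is finite. */
    static double exp_sample(double mean)
    {
        double u = 1.0 - drand48();
        return -mean * log(u);
    }

    /* PHOLD-style time stamp rule described above: current simulation
       time plus an exponential increment (mean 1.0) plus a fixed
       lookahead of 1.0. */
    double next_event_timestamp(double now)
    {
        const double mean = 1.0;
        const double lookahead = 1.0;
        return now + exp_sample(mean) + lookahead;
    }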

From 1998 to 2005 Bauer performed graduate work at RPI under Carothers, focusing solely on reverse computation. He developed the first PDES system based solely on reverse computation, Rensselaer's Optimistic Simulation System (ROSS), [14] for combined shared- and distributed-memory systems. From 2006 to 2009 Bauer worked under E.H. Page at the MITRE Corporation, and in collaboration with Carothers and Pearce pushed the ROSS simulator to the 131,072-processor Blue Gene/P (Intrepid). This implementation was stable for remote event rates of 100% (every event sent over the network). During his time at RPI and MITRE, Bauer developed the network simulation system ROSS.Net, [15] which supports semi-automated experiment design for black-box optimization of network protocol models executing in ROSS. A primary goal of the system was to optimize multiple network protocol models for execution in ROSS. For example, creating an LP layering structure that eliminates events passed between network protocol LPs on the same simulated machine optimizes simulation of TCP/IP network nodes by eliminating zero-offset timestamps between the TCP and IP protocols. Bauer also constructed RC agent-based models for social contact networks, used to study the effects of infectious diseases (in particular pandemic influenza), that scale to hundreds of millions of agents, as well as RC models for mobile ad hoc networks implementing mobility (proximity detection) and highly accurate physical-layer electromagnetic wave propagation (the Transmission Line Matrix model). [16]

There has also been a recent push by the PDES community into the realm of continuous simulation. For example, Fujimoto and Perumalla, working with Tang et al., [17] implemented an RC particle-in-cell model and demonstrated excellent speedup over continuous simulation for models of light as a particle. Bauer and Page demonstrated excellent speedup for an RC Transmission Line Matrix model (P.B. Johns, 1971), modeling light as a wave at microwave frequencies. Bauer also created an RC variant of the SEIR model that yields enormous improvement over continuous models of infectious disease spread. In addition, the RC SEIR model can simulate multiple diseases efficiently, whereas the continuous model grows exponentially with the number of possible combinations of diseases in the population.

Notes

  1. Dr. Frank maintains two websites of his publications on reverse computation, one covering work up to 2004 and one covering later work.

References

  1. Landauer, Rolf (July 1961). "Irreversibility and heat generation in the computing process". IBM Journal of Research and Development. 5 (3): 183–191. CiteSeerX 10.1.1.68.7646. doi:10.1147/rd.53.0183.
  2. Von Neumann, John (1966). Theory of Self-Reproducing Automata. University of Illinois Press. p. 388. Retrieved 2009-04-06.
  3. Bennett, Charles H. (1982). "The thermodynamics of computation—a review" (PDF). International Journal of Theoretical Physics. 21 (12): 905–940. Bibcode:1982IJTP...21..905B. CiteSeerX 10.1.1.655.5610. doi:10.1007/BF02084158. S2CID 17471991. Retrieved 2009-04-06.
  4. Frank, Michael P. (June 1999). Reversibility for Efficient Computing, Ph.D. Thesis (PDF). Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science. Retrieved 2009-04-06.
  5. Leeman Jr., George B. (1986). "A formal approach to undo operations in programming languages". ACM Transactions on Programming Languages and Systems. 8 (1): 50–87. doi:10.1145/5001.5005.
  6. Biswas, Bitan; Mall, R. (1999). "Reverse execution of programs". ACM SIGPLAN Notices. 34 (4): 61–69. doi:10.1145/312009.312079. S2CID 11685971.
  7. Griewank, Andreas; Juedes, David; Utke, Jean (1996). "Algorithm 755: ADOL-C: a package for the automatic differentiation of algorithms written in C/C++". ACM Transactions on Mathematical Software. 22 (2): 131–167. doi:10.1145/229473.229474. S2CID 7339428.
  8. Grimm, J.; Pottier, L.; Rostaing-Schmidt, N. (1996). "Optimal time and minimum space-time product for reversing a certain class of programs" (PDF). Technical Report.
  9. Carothers, Christopher D.; Perumalla, Kalyan S.; Fujimoto, Richard M. (1999). "Efficient optimistic parallel simulations using reverse computation" (PDF). ACM Transactions on Modeling and Computer Simulation. 9 (3): 224–253. CiteSeerX 10.1.1.113.1702. doi:10.1145/347823.347828. S2CID 969021. Archived from the original (PDF) on 2011-07-17. Retrieved 2009-04-06.
  10. Yaun, Garrett; Carothers, Christopher D.; Kalyanaraman, Shivkumar (2003). "Large-scale TCP models using optimistic parallel simulation". Seventeenth Workshop on Parallel and Distributed Simulation (PADS 2003), Proceedings. pp. 153–162. CiteSeerX 10.1.1.115.1320. doi:10.1109/PADS.2003.1207431. ISBN 978-0-7695-1970-8. S2CID 6196101.
  11. Jefferson, David R. (1985). "Virtual Time" (PDF). ACM Transactions on Programming Languages and Systems. 7 (3): 404–425. doi:10.1145/3916.3988. S2CID 2654080. Retrieved 2009-04-06.
  12. Vieri, C.; Ammer, M.J.; Frank, M.; Margolus, N.; Knight, T. (June 1998). "A fully reversible asymptotically zero energy microprocessor" (PDF). Power Driven Microarchitecture Workshop: 138–142.
  13. Principles of Advanced and Distributed Simulation Workshop, now the ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (PADS).
  14. Carothers, Christopher D.; Bauer, D. W.; Pearce, Shawn O. (2002). "ROSS: A high-performance, low-memory, modular Time Warp system". Journal of Parallel and Distributed Computing. 62 (11): 1648–1669. CiteSeerX 10.1.1.78.3105. doi:10.1016/S0743-7315(02)00004-7.
  15. Bauer, David W.; Yaun, Garrett; Carothers, Christopher D.; Yuksel, Murat; Kalyanaraman, Shivkumar (2003). "ROSS.Net: Optimistic parallel simulation framework for large-scale Internet models". Proceedings of the 2003 Winter Simulation Conference. Vol. 1. pp. 703–711. doi:10.1109/WSC.2003.1261486. ISBN 978-0-7803-8131-5. S2CID 2735827.
  16. Bauer Jr., David W.; Page, Ernest H. (2007). "Optimistic parallel discrete event simulation of the event-based transmission line matrix method". Proceedings of the 39th Conference on Winter Simulation: 40 Years! The Best is Yet to Come: 676–684. CiteSeerX 10.1.1.132.307.
  17. Tang, Y.; Perumalla, K. S.; Fujimoto, R. M.; Karimabadi, H.; Driscoll, J.; Omelchenko, Y. (2005). "Optimistic Parallel Discrete Event Simulations of Physical Systems Using Reverse Computation" (PDF). Workshop on Principles of Advanced and Distributed Simulation (PADS'05). pp. 26–35. CiteSeerX 10.1.1.110.5893. doi:10.1109/PADS.2005.16. ISBN 978-0-7695-2383-5. S2CID 802601. Retrieved 2009-04-06.