| Original author(s) | Oak Ridge National Laboratory |
|---|---|
| Developer(s) | University of Tennessee |
| Initial release | 1989 |
| Stable release | 3.4.6 / February 2, 2009 [1] |
| Written in | C |
| Operating system | Windows and Unix |
| License | BSD, GPL |
| Website | https://www.epm.ornl.gov/pvm/pvm_home.html |
Parallel Virtual Machine (PVM) is a software tool for parallel networking of computers. It is designed to allow a network of heterogeneous Unix and/or Windows machines to be used as a single distributed parallel processor. [2] Thus large computational problems can be solved more cost effectively by using the aggregate power and memory of many computers. The software is very portable; the source code, available free through netlib, has been compiled on everything from laptops to Crays. [3]
PVM enables users to exploit their existing computer hardware to solve much larger problems at little additional cost. PVM has been used as an educational tool to teach parallel programming, but has also been used to solve important practical problems. [3] It was developed by the University of Tennessee, Oak Ridge National Laboratory and Emory University. The first version was written at ORNL in 1989, and after being rewritten by the University of Tennessee, version 2 was released in March 1991. Version 3 was released in March 1993, and added fault tolerance and better portability.
PVM was a step towards modern trends in distributed processing and grid computing but has, since the mid-1990s, largely been supplanted by the much more successful MPI standard for message passing on parallel machines. PVM is free software, released under both the BSD License and the GNU General Public License.
PVM is a software system that enables a collection of heterogeneous computers to be used as a coherent and flexible concurrent computational resource, or a "parallel virtual machine".
The individual computers may be shared-memory or local-memory multiprocessors, vector supercomputers, specialized graphics engines, or scalar workstations and PCs, that may be interconnected by a variety of networks, such as Ethernet or FDDI.
PVM consists of a run-time environment and library for message passing, task and resource management, and fault notification. While PVM will not automatically make a commercial software package run faster, it does provide a powerful set of functions for manually parallelizing an existing source program, or for writing new parallel/distributed programs.
The PVM software must be specifically installed on every machine that is to be used in a given "virtual machine". There is no "automatic" installation of executables onto remote machines in PVM, although simply copying the pvm3/lib and pvm3/bin directories to another similar machine (and setting $PVM_ROOT and $PVM_ARCH) is sufficient for running PVM programs. Compiling or building PVM programs requires the full PVM installation.
User programs written in C, C++, or Fortran can access PVM through provided library routines.
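A minimal master/worker sketch in C gives a sense of how these routines are used. It assumes a standard PVM 3 installation (the pvm3.h header and libpvm3 on the library path) and that the executable, here named "hello", is installed where PVM can spawn it; the name and message tag are illustrative.

```c
/* Sketch of a PVM "hello" program: the parent spawns one copy of
 * itself, and the child sends a greeting string back to its parent.
 * Assumes a working PVM 3 installation; compile with something like:
 *   cc hello.c -o hello -I$PVM_ROOT/include -L$PVM_ROOT/lib/$PVM_ARCH -lpvm3 */
#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    int mytid = pvm_mytid();      /* enroll this process in PVM        */
    int ptid  = pvm_parent();     /* tid of the task that spawned us   */

    if (ptid == PvmNoParent) {
        /* Parent: spawn one child task running this same executable. */
        int child;
        char buf[64];

        pvm_spawn("hello", NULL, PvmTaskDefault, "", 1, &child);

        pvm_recv(child, 1);       /* block until a tag-1 message arrives */
        pvm_upkstr(buf);          /* unpack the string from the buffer   */
        printf("parent %x received: %s\n", mytid, buf);
    } else {
        /* Child: pack a greeting and send it to the parent with tag 1. */
        char msg[64];
        sprintf(msg, "hello from task %x", mytid);

        pvm_initsend(PvmDataDefault);
        pvm_pkstr(msg);
        pvm_send(ptid, 1);
    }

    pvm_exit();                   /* leave the virtual machine */
    return 0;
}
```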
PVM also supports broadcasting (pvm_bcast), which sends a message to all processes in a group, and multicasting (pvm_mcast), which sends to a specific list of processes.
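A small sketch of the group calls, assuming PVM's group library is available and a group server is running; the group name "workers", the message tags, and the payload are placeholders.

```c
/* Sketch of group broadcast and multicast with PVM. The task joins a
 * hypothetical group "workers", broadcasts an integer to every member,
 * then multicasts the same value to an explicit list of task ids. */
#include "pvm3.h"

void notify_workers(int *tids, int ntids)
{
    int value = 42;

    pvm_joingroup("workers");            /* become a member of the group */

    pvm_initsend(PvmDataDefault);
    pvm_pkint(&value, 1, 1);
    pvm_bcast("workers", 7);             /* tag 7, all group members      */

    pvm_initsend(PvmDataDefault);
    pvm_pkint(&value, 1, 1);
    pvm_mcast(tids, ntids, 8);           /* tag 8, only the listed tasks  */

    pvm_lvgroup("workers");              /* leave the group when done     */
}
```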
Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different networked computers.
A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a high-performance parallel computing cluster from inexpensive personal computer hardware.
Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.
The Message Passing Interface (MPI) is a standardized and portable message-passing system designed to function on parallel computing architectures. The MPI standard defines the syntax and semantics of library routines that are useful to a wide range of users writing portable message-passing programs in C, C++, and Fortran. There are several open-source MPI implementations, which fostered the development of a parallel software industry and encouraged the development of portable, scalable large-scale parallel applications.
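For comparison with the PVM calls above, a minimal point-to-point exchange in MPI looks like the following sketch; it uses only core calls from the standard (MPI_Init, MPI_Comm_rank, MPI_Send, MPI_Recv, MPI_Finalize) and is generic rather than specific to any one implementation.

```c
/* Minimal MPI point-to-point example: rank 0 sends an integer to
 * rank 1. Run with at least two processes, e.g. "mpirun -np 2 ./a.out". */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, value;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);     /* to rank 1, tag 0 */
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                             /* from rank 0      */
        printf("rank 1 received %d\n", value);
    }

    MPI_Finalize();
    return 0;
}
```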
OpenMP is an application programming interface (API) that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran, on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
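A brief illustration of OpenMP's directive style in C: a parallel loop with a reduction. The loop body is arbitrary; the point is that the pragma, not explicit message passing, expresses the parallelism.

```c
/* Sketch of an OpenMP parallel-for reduction.
 * Compile with an OpenMP-capable compiler, e.g. "cc -fopenmp sum.c". */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    const int n = 1000000;
    double sum = 0.0;

    /* Iterations are divided among the threads; each thread keeps a
     * private partial sum that is combined when the loop finishes. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (i + 1);

    printf("harmonic sum over %d terms: %f\n", n, sum);
    return 0;
}
```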
Checkpointing is a technique that provides fault tolerance for computing systems. It consists of saving a snapshot of the application's state so that the application can restart from that point in case of failure. This is particularly important for long-running applications executed on failure-prone computing systems.
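A toy application-level checkpointing sketch in C; the file name, state layout, and checkpoint interval are all illustrative assumptions, not a standard mechanism.

```c
/* Toy checkpointing: periodically write the program state to a file
 * so a restarted run resumes from the last snapshot instead of from
 * the beginning. */
#include <stdio.h>

struct state { long iteration; double accumulator; };

static int load_checkpoint(struct state *s)
{
    FILE *f = fopen("checkpoint.dat", "rb");
    if (!f) return 0;                       /* no checkpoint: fresh start */
    int ok = fread(s, sizeof *s, 1, f) == 1;
    fclose(f);
    return ok;
}

static void save_checkpoint(const struct state *s)
{
    FILE *f = fopen("checkpoint.dat", "wb");
    if (!f) return;
    fwrite(s, sizeof *s, 1, f);
    fclose(f);
}

int main(void)
{
    struct state s = {0, 0.0};
    load_checkpoint(&s);                    /* resume if a snapshot exists */

    for (; s.iteration < 1000000; s.iteration++) {
        s.accumulator += 1.0;               /* stand-in for real work      */
        if (s.iteration % 100000 == 0)
            save_checkpoint(&s);            /* periodic snapshot           */
    }
    printf("done: %f\n", s.accumulator);
    return 0;
}
```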
Jack Joseph Dongarra is an American computer scientist and mathematician. He is a University Distinguished Professor Emeritus of Computer Science in the Electrical Engineering and Computer Science Department at the University of Tennessee. He is a Distinguished Research Staff member in the Computer Science and Mathematics Division at Oak Ridge National Laboratory, holds a Turing Fellowship in the School of Mathematics at the University of Manchester, and is an adjunct professor in the Computer Science Department at Rice University. He served as a faculty fellow at the Texas A&M University Institute for Advanced Study (2014–2018). Dongarra is the founding director of the Innovative Computing Laboratory at the University of Tennessee. He was the recipient of the Turing Award in 2021.
In computer science, Linda is a coordination model that aids communication in parallel computing environments. Developed by David Gelernter, it is meant to be used alongside a full-fledged computation language like Fortran or C where Linda's role is to "create computational activities and to support communication among them".
In computing, single program, multiple data (SPMD) is a term that has been used to refer to computational models for exploiting parallelism whereby multiple processors cooperate in the execution of a program in order to obtain results faster.
In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality (how well a range of different problems can be expressed for a variety of different architectures) and its performance (how efficiently the compiled programs can execute). The implementation of a parallel programming model can take the form of a library invoked from a programming language or an extension to an existing language.
Concurrent computing is a form of computing in which several computations are executed concurrently—during overlapping time periods—instead of sequentially—with one completing before the next starts.
Parallel Multithreaded Machine (PM2) is software for parallel networking of computers.
The ScaLAPACK library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block cyclic decomposition.
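As an illustration of that layout, the sketch below computes which process in a pr × pc grid owns global element (i, j) under a two-dimensional block-cyclic distribution with mb × nb blocks; the function names and 0-based indexing are assumptions for illustration, not part of the ScaLAPACK API.

```c
/* Sketch: owner of global element (i, j) in a 2-D block-cyclic
 * distribution over a pr x pc process grid with mb x nb blocks.
 * All indices are 0-based. */
#include <stdio.h>

typedef struct { int prow, pcol; } owner_t;

owner_t block_cyclic_owner(int i, int j, int mb, int nb, int pr, int pc)
{
    owner_t o;
    o.prow = (i / mb) % pr;   /* block row, wrapped around the grid rows    */
    o.pcol = (j / nb) % pc;   /* block column, wrapped around the grid cols */
    return o;
}

int main(void)
{
    /* Element (130, 70) of a matrix split into 32x32 blocks on a 2x3 grid. */
    owner_t o = block_cyclic_owner(130, 70, 32, 32, 2, 3);
    printf("owned by process (%d, %d)\n", o.prow, o.pcol);
    return 0;
}
```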
A GPU cluster is a computer cluster in which each node is equipped with a graphics processing unit (GPU). By harnessing the computational power of modern GPUs via general-purpose computing on graphics processing units (GPGPU), very fast calculations can be performed with a GPU cluster.
Global Arrays, or GA, is a library developed by scientists at Pacific Northwest National Laboratory for parallel computing. GA provides a friendly API for shared-memory-style programming with multidimensional arrays on distributed-memory computers. The GA library is a predecessor to the GAS (global address space) languages currently being developed for high-performance computing.
A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software. The newest manifestation of cluster computing is cloud computing.
ISP is a tool for the formal verification of MPI programs developed within the School of Computing at the University of Utah. Like model checkers, such as SPIN, ISP verifies the complete state space of a system for a set of safety properties. However, unlike model checkers, ISP performs code level verification. This means that the tool verifies all relevant interleavings of a concurrent program by replaying the actual program code without building verification models. This idea was pioneered in a number of tools, notably by Godefroid, in his VeriSoft tool. Other recent tools of this genre include the Java Pathfinder, Microsoft's CHESS tool, and MODIST. Relevant interleavings are computed using a customized dynamic partial order reduction algorithm called POE.
Róbert Lovas is a Hungarian computer scientist at SZTAKI, Budapest, Hungary.
Tachyon is a parallel/multiprocessor ray tracing software. It is a parallel ray tracing library for use on distributed memory parallel computers, shared memory computers, and clusters of workstations. Tachyon implements rendering features such as ambient occlusion lighting, depth-of-field focal blur, shadows, reflections, and others. It was originally developed for the Intel iPSC/860 by John Stone for his M.S. thesis at University of Missouri-Rolla. Tachyon subsequently became a more functional and complete ray tracing engine, and it is now incorporated into a number of other open source software packages such as VMD, and SageMath. Tachyon is released under a permissive license.
Message passing is an inherent element of all computer clusters. All computer clusters, ranging from homemade Beowulfs to some of the fastest supercomputers in the world, rely on message passing to coordinate the activities of the many nodes they encompass. Message passing in computer clusters built with commodity servers and switches is used by virtually every internet service.