Nimrod (distributed computing)

Nimrod is a tool for the parametrization of serial programs to create and execute embarrassingly parallel programs over a computational grid. It is a co-allocating, scheduling, and brokering service.[1] Nimrod was one of the first tools to make use of heterogeneous resources in a grid for a single computation,[2] and an early example of using a market economy to perform grid scheduling.[3] This market-based approach enables Nimrod to offer a guaranteed completion time even though the underlying grid services are only best-effort.[4]
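
The structure Nimrod automates can be shown in miniature. The following Python sketch is hypothetical throughout: "./model" and its two parameters stand in for any unmodified serial program, and a local thread pool stands in for the grid resources Nimrod would broker. Each combination of parameter values becomes one independent run, which is what makes the workload embarrassingly parallel.

    # Hypothetical parameter sweep in the style Nimrod automates.
    # "./model" is a placeholder for an unmodified serial program.
    import itertools
    import subprocess
    from concurrent.futures import ThreadPoolExecutor

    temperatures = [280, 290, 300]   # first swept parameter
    pressures = [1.0, 2.0]           # second swept parameter

    def run_job(temp, pressure):
        # Runs are independent: no communication between jobs.
        result = subprocess.run(["./model", str(temp), str(pressure)],
                                capture_output=True, text=True)
        return temp, pressure, result.returncode

    with ThreadPoolExecutor(max_workers=6) as pool:
        for temp, pressure, rc in pool.map(lambda a: run_job(*a),
                                           itertools.product(temperatures, pressures)):
            print(f"T={temp} P={pressure} exit={rc}")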

The tool was created as a research project funded by the Distributed Systems Technology Centre. The principal investigator is Professor David Abramson of Monash University.

Related Research Articles

A distributed system is a system whose components are located on different networked computers, which communicate and coordinate their actions by passing messages to one another. Distributed computing is a field of computer science that studies distributed systems.

Supercomputer: Type of extremely powerful computer

A supercomputer is a computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instructions per second (MIPS). Since 2017, supercomputers have existed that can perform over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS). For comparison, a desktop computer has performance in the range of hundreds of gigaFLOPS (10^11) to tens of teraFLOPS (10^13). Since November 2017, all of the world's fastest 500 supercomputers run on Linux-based operating systems. Additional research is being conducted in the United States, the European Union, Taiwan, Japan, and China to build faster, more powerful and technologically superior exascale supercomputers.
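
As a rough worked comparison using the figures above, the arithmetic below (a sketch, with a fixed hypothetical workload of 10^20 floating-point operations) shows why the several-orders-of-magnitude gap matters:

    # Time to finish a fixed workload at the performance levels quoted above.
    workload = 10.0 ** 20        # total floating-point operations (hypothetical)

    desktop = 1e11               # ~hundreds of gigaFLOPS
    supercomputer = 1e17         # ~100 petaFLOPS

    print(workload / desktop / 86400, "days on a desktop")                  # ~11574 days
    print(workload / supercomputer / 60, "minutes on a 100 PFLOPS system")  # ~16.7 minutes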

Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.

The Globus Toolkit is an open-source toolkit for grid computing developed and provided by the Globus Alliance. On 25 May 2017 it was announced that the open source support for the project would be discontinued in January 2018, due to a lack of financial support for that work. The Globus service continues to be available to the research community under a freemium approach, designed to sustain the software, with most features freely available but some restricted to subscribers.

MOSIX is a proprietary distributed operating system. Although early versions were based on older UNIX systems, since 1999 it has focused on Linux clusters and grids. In a MOSIX cluster/grid there is no need to modify or link applications with any library, to copy files or log in to remote nodes, or even to assign processes to different nodes – it is all done automatically, as in an SMP.

General-purpose computing on graphics processing units is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing.

Ian Tremere Foster is a New Zealand-American computer scientist. He is a distinguished fellow, senior scientist, and director of the Data Science and Learning division at Argonne National Laboratory, and a professor in the department of computer science at the University of Chicago.

Charlie Catlett: American computer scientist

Charlie Catlett is a senior computer scientist at Argonne National Laboratory and a visiting senior fellow at the Mansueto Institute for Urban Innovation at the University of Chicago. From 2020 to 2022 he was a senior research scientist at the University of Illinois Discovery Partners Institute. He was previously a senior computer scientist at Argonne National Laboratory and a senior fellow in the Computation Institute, a joint institute of Argonne National Laboratory and The University of Chicago, and a senior fellow at the University of Chicago's Harris School of Public Policy.

Within cluster and parallel computing, a cluster manager is usually backend graphical user interface (GUI) or command-line interface (CLI) software that runs on a set of cluster nodes that it manages. The cluster manager works together with a cluster management agent. These agents run on each node of the cluster to manage and configure services, a set of services, or the complete cluster server itself. In some cases the cluster manager is mostly used to dispatch work for the cluster to perform; a subset of the cluster manager can then be a remote desktop application used not for configuration but simply to send work to the cluster and collect the results, as sketched below. In other cases the cluster is oriented more toward availability and load balancing than toward computation or specific services.
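
A minimal sketch of that dispatch-and-collect role, with a multiprocessing queue standing in for whatever protocol links the manager to its per-node agents (the squaring task is a placeholder for real work):

    # Manager dispatches tasks; one agent per "node" pulls work until told to stop.
    from multiprocessing import Process, Queue

    def agent(tasks, results):
        # Stand-in for a cluster management agent running on one node.
        while (task := tasks.get()) is not None:
            results.put((task, task ** 2))   # placeholder for real work

    if __name__ == "__main__":
        tasks, results = Queue(), Queue()
        agents = [Process(target=agent, args=(tasks, results)) for _ in range(3)]
        for a in agents:
            a.start()
        for t in range(9):       # the manager's "send work" step
            tasks.put(t)
        for _ in agents:         # one shutdown sentinel per agent
            tasks.put(None)
        print(sorted(results.get() for _ in range(9)))   # collect the results
        for a in agents:
            a.join()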

Keith Marzullo is the inventor of Marzullo's algorithm, which is part of the basis of the Network Time Protocol and the Windows Time Service. On August 1, 2016 he became the Dean of the University of Maryland College of Information Studies after serving as the Director of the NITRD National Coordination Office. Prior to this he was a Professor in the Department of Computer Science and Engineering at University of California, San Diego. In 2011 he was inducted as a Fellow of the Association for Computing Machinery.

Computer cluster: Set of computers configured in a distributed computing system

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.
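
The simplest such pattern is the "farm" (parallel map) skeleton: the programmer writes only a sequential worker function, and the skeleton owns all coordination. A minimal sketch in Python:

    # A farm skeleton: apply `worker` to each input in parallel, results in order.
    from multiprocessing import Pool

    def farm(worker, inputs, workers=4):
        with Pool(workers) as pool:
            return pool.map(worker, inputs)

    def square(x):               # the only code the programmer supplies
        return x * x

    if __name__ == "__main__":
        print(farm(square, range(10)))   # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]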

Ignacio Martín Llorente is an entrepreneur, researcher and educator in the field of cloud and distributed computing. He is the director of OpenNebula, a visiting scholar at Harvard University and a full professor at Complutense University.

Data-intensive computing is a class of parallel computing applications which use a data-parallel approach to process large volumes of data, typically terabytes or petabytes in size, commonly referred to as big data. Computing applications which devote most of their execution time to computational requirements are deemed compute-intensive, whereas those which require large volumes of data and devote most of their processing time to I/O and manipulation of data are deemed data-intensive.
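
A small illustration of the data-intensive side: the sketch below streams a file far larger than memory in fixed-size chunks, so nearly all of its running time goes to I/O rather than arithmetic ("huge.log" is a placeholder path).

    # Count lines in a file too large to load at once; I/O dominates, compute is trivial.
    def count_lines_in_chunks(path, chunk_size=64 * 1024 * 1024):
        total = 0
        with open(path, "rb") as f:
            while chunk := f.read(chunk_size):   # time is spent here (I/O)
                total += chunk.count(b"\n")      # cheap per-chunk compute
        return total

    # print(count_lines_in_chunks("huge.log"))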

Quasi-opportunistic supercomputing: Computational paradigm for supercomputing

Quasi-opportunistic supercomputing is a computational paradigm for supercomputing on a large number of geographically dispersed computers. Quasi-opportunistic supercomputing aims to provide a higher quality of service than opportunistic resource sharing.

Tachyon (software)

Tachyon is a parallel/multiprocessor ray tracing software. It is a parallel ray tracing library for use on distributed memory parallel computers, shared memory computers, and clusters of workstations. Tachyon implements rendering features such as ambient occlusion lighting, depth-of-field focal blur, shadows, reflections, and others. It was originally developed for the Intel iPSC/860 by John Stone for his M.S. thesis at the University of Missouri-Rolla. Tachyon subsequently became a more functional and complete ray tracing engine, and it is now incorporated into a number of other open source software packages such as VMD and SageMath. Tachyon is released under a permissive license.

Supercomputer architecture: Aspect of supercomputer

Approaches to supercomputer architecture have taken dramatic turns since the earliest systems were introduced in the 1960s. Early supercomputer architectures pioneered by Seymour Cray relied on compact innovative designs and local parallelism to achieve superior computational peak performance. However, in time the demand for increased computational power ushered in the age of massively parallel systems.

Professor David Abramson FIEEE FACM FTSE FACS has been Director of the Research Computing Centre at the University of Queensland, Australia, since 2012. He has been involved in computer architecture and high performance computing research since 1979.

Ishfaq Ahmad (computer scientist): Computer scientist and university professor

Ishfaq Ahmad is a computer scientist, IEEE Fellow and Professor of Computer Science and Engineering at the University of Texas at Arlington (UTA). He is the Director of the Center For Advanced Computing Systems (CACS) and has previously directed IRIS at UTA. He is widely recognized for his contributions to scheduling techniques in parallel and distributed computing systems, and video coding.

Katarzyna Keahey is a Senior Computer Scientist at Argonne National Laboratory and the Consortium for Advanced Science and Engineering (CASE) of the University of Chicago. She is a Principal Investigator (PI) of the Chameleon project, which provides an innovative experimentation platform for computer science systems experiments. She created Nimbus, one of the first open source implementations of Infrastructure-as-a-Service (IaaS), and co-founded the SoftwareX journal, which publishes software as a scientific instrument.

References

  1. Foster, Ian; Zhao, Yong; Raicu, Ioan (2008). "Cloud Computing and Grid Computing 360-Degree Compared". 2008 Grid Computing Environments Workshop. pp. 1–10. arXiv:0901.0131. doi:10.1109/GCE.2008.4738445. ISBN 978-1-4244-2860-1. S2CID 3187572.
  2. Abramson, D.; Foster, I.; Giddy, J.; Lewis, A.; Sosic, R.; Sutherst, R.; White, N. (February 1997). "The Nimrod Computational Workbench: A Case Study in Desktop Metacomputing" (PDF). Proceedings of the Australian Computer Science Conference (ACSC 97).
  3. Abramson, D.; Giddy, J.; Kotler, L. (May 2000). "High Performance Parametric Modeling with Nimrod/G: Killer Application for the Global Grid?" (PDF). Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2000). USA: IEEE Computer Society Press. pp. 520–528.
  4. Buyya, R.; Abramson, D.; Giddy, J. (May 2000). "Nimrod/G: An Architecture of a Resource Management and Scheduling System in a Global Computational Grid" (PDF). Proceedings of HPC Asia 2000. USA: IEEE Computer Society Press. pp. 283–289.