DRMAA

Distributed Resource Management Application API (DRMAA) is a high-level Open Grid Forum (OGF) API specification for submitting jobs to, and controlling them on, a distributed resource management (DRM) system, such as a cluster or grid computing infrastructure. The scope of the API covers all the high-level functionality that applications need to submit, control, and monitor jobs on execution resources in the DRM system.
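The submit, control, and monitor pattern described above can be illustrated with a minimal Python sketch. It is not a DRMAA binding: `JobTemplate`, `LocalSession`, and every method name are hypothetical, and plain local processes stand in for a real DRM system.

```python
import subprocess
import sys
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class JobTemplate:
    """Hypothetical job description: a command and its arguments."""
    command: str
    args: List[str] = field(default_factory=list)


class LocalSession:
    """Toy stand-in for a DRM session; jobs are plain local processes."""

    def __init__(self) -> None:
        self._jobs: Dict[str, subprocess.Popen] = {}
        self._next_id = 1

    def run_job(self, template: JobTemplate) -> str:
        """Submit: start the job and return an opaque job identifier."""
        job_id = str(self._next_id)
        self._next_id += 1
        self._jobs[job_id] = subprocess.Popen([template.command, *template.args])
        return job_id

    def job_status(self, job_id: str) -> str:
        """Monitor: report whether the job is running, done, or failed."""
        code = self._jobs[job_id].poll()
        if code is None:
            return "running"
        return "done" if code == 0 else "failed"

    def terminate(self, job_id: str) -> None:
        """Control: ask the job to stop."""
        self._jobs[job_id].terminate()

    def wait(self, job_id: str, timeout: float = 30.0) -> int:
        """Block until the job finishes and return its exit code."""
        return self._jobs[job_id].wait(timeout=timeout)


session = LocalSession()

# Submit a short job, wait for it, then query its final state.
quick = session.run_job(JobTemplate(sys.executable, ["-c", "print('hello from job')"]))
exit_code = session.wait(quick)
final_status = session.job_status(quick)

# Submit a long job, observe it while it runs, then terminate it.
slow = session.run_job(JobTemplate(sys.executable, ["-c", "import time; time.sleep(60)"]))
status_while_running = session.job_status(slow)
session.terminate(slow)
session.wait(slow)
status_after_terminate = session.job_status(slow)
```

The point of the sketch is that the application only ever handles a job description and an opaque job identifier. Under that assumption, the same calling code could in principle be pointed at a different backend without changes.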

In 2007, DRMAA became one of the first two specifications (the other being GridRPC) to reach full recommendation status in the OGF. [1]

In 2012, the second version of the DRMAA standard (DRMAA2) was published in GFD 194 as an abstract interface definition language (IDL) specification that defines the semantics of the functions. [2] DRMAA2 specifies more than twice as many calls as DRMAA. It covers cluster monitoring, has a notion of queues and machines, and introduces a multiple job-session concept within a single application for better job workflow management. Later in 2012, the C API was specified as the first language binding in GFD 198. [3]

Development model

The development of this API was done through the Global Grid Forum, in the model of IETF standard development, and it was originally co-authored by:

This specification was first proposed at Global Grid Forum 3 (GGF3) [4] in Frascati, Italy, but gained most of its momentum at Global Grid Forum 4 in Toronto, Ontario. Its stated objective was to make it easier for application builders, portal builders, and independent software vendors (ISVs) to interface applications directly with existing DRM systems. Because the API was co-authored by participants from a wide selection of companies, spanning both industry and education, its development resulted in an open standard that was quickly well received by a wide audience.

Significance

Without DRMAA, no standard model existed for submitting jobs to the component regions of a grid, assuming each region ran its own local DRM system. The first version of the DRMAA API was implemented in Sun's Grid Engine and in the University of Wisconsin–Madison's Condor.

References

  1. "DRMAA and GridRPC Documents Achieve "Grid Recommendation" Status". Open Grid Forum. 2008-01-07.
  2. "Distributed Resource Management Application API Version 2" (PDF). Open Grid Forum. 2012-02-01.
  3. "Distributed Resource Management Application API Version 2 - C Language Binding" (PDF). Open Grid Forum. 2012-12-01.
  4. "GGF3 - The Third Global Grid Forum". October 7-11, 2001.