GLite

Last updated
gLite
Developer(s) EGEE
Stable release
3.2 / 23 March 2009
Operating system Scientific Linux 3, 4 ,5
Type Grid computing
License EGEE Collaboration 2004
Website glite.cern.ch

gLite (pronounced "gee-lite") is a middleware computer software project for grid computing used by the CERN LHC experiments and other scientific domains. It was implemented by collaborative efforts of more than 80 people in 12 different academic and industrial research centers in Europe. gLite provides a framework for building applications tapping into distributed computing and storage resources across the Internet. The gLite services were adopted by more than 250 computing centres, and used by more than 15000 researchers in Europe and around the world.

Contents

History

After prototyping phases in 2004 and 2005, convergence with the LHC Computing Grid (LCG-2) distribution was reached in May 2006, when gLite 3.0 was released, and became the official middle-ware of the Enabling Grids for E-sciencE (EGEE) project which ended in 2010.

Development of the gLite middle-ware was then taken over by the European Middleware Initiative, and is now maintained as part of the EMI software stack.

The distributed computing infrastructure built by EGEE is now supported by the European Grid Infrastructure. It runs the Grid middle-ware produced by the "European Middleware Initiative", many components of which came from the gLite middle-ware.

Middle-ware description

Security

The gLite user community is grouped into Virtual Organisations (VOs). [1] A user must join a VO that is supported by the infrastructure running gLite to be authenticated and authorized to using grid resources.

The Grid Security Infrastructure (GSI) in WLCG/EGEE enables secure authentication and communication over an open network. [2] GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol, with extensions for single sign-on and delegation.

To authenticate oneself, a user needs to have a digital X.509 certificate issued by a Certification Authority (CA) trusted by the infrastructure running the middle-ware.

The authorization of a user on a specific grid resource can be done in two different ways. The first is simpler, and relies on the grid-mapfile mechanism. The second way relies on the Virtual Organisation Membership Service (VOMS) and the LCAS/LCMAPS mechanism, which allow for a more detailed definition of user privileges.

User interface

The access point to the gLite Grid is the User Interface (UI). This can be any machine where users have a personal account and where their user certificate is installed. From a UI, a user can be authenticated and authorized to use the WLCG/EGEE resources, and can access the functionalities offered by the Information, Workload and Data management systems. It provides CLI tools to perform some basic Grid operations:

Computing element

A Computing Element (CE), in Grid terminology, is some set of computing resources localized at a site (i.e. a cluster, a computing farm). A CE includes a Grid Gate (GG), which acts as a generic interface to the cluster; a Local Resource Management System (LRMS) (sometimes called batch system), and the cluster itself, a collection of Worker Nodes (WNs), the nodes where the jobs are run.

There are two CE implementations in gLite 3.1: the LCG CE, developed by EDG and used in LCG-22, and the gLite CE, developed by EGEE. Sites can choose what to install, and some of them provide both types. The GG is responsible for accepting jobs and dispatching them for execution on the WNs via the LRMS.

In gLite 3.1 supported LRMS types were OpenPBS/PBSPro, Platform LSF, Maui/Torque, BQS and Condor, and Sun Grid Engine. [3]

Storage element

A Storage Element (SE) provides uniform access to data storage resources. The Storage Element may control simple disk servers, large disk arrays or tape-based Mass Storage Systems (MSS). Most WLCG/EGEE sites provide at least one SE.

Storage Elements can support different data access protocols and interfaces. Simply speaking, GSIFTP (a GSI-secure FTP) is the protocol for whole-file transfers, while local and remote file access is performed using RFIO or gsidcap.

Most storage resources are managed by a Storage Resource Manager (SRM), a middle-ware service providing capabilities like transparent file migration from disk to tape, file pinning, space reservation, etc. However, different SEs may support different versions of the SRM protocol and the capabilities can vary.

There is a number of SRM implementations in use, with varying capabilities. The Disk Pool Manager (DPM) is used for fairly small SEs with disk-based storage only, while CASTOR is designed to manage large-scale MSS, with front-end disks and back-end tape storage. dCache is targeted at both MSS and large-scale disk array storage systems. Other SRM implementations are in development, and the SRM protocol specification itself is also evolving.

Classic SEs, which do not have an SRM interface, provide a simple disk-based storage model. They are in the process of being phased out.[ when? ]

Information service

The Information Service (IS) provides information about the WLCG/EGEE Grid resources and their status. This information is essential for the operation of the whole Grid, as it is via the IS that resources are discovered. The published information is also used for monitoring and accounting purposes.

Much of the data published to the IS conforms to the GLUE Schema, [4] which defines a common conceptual data model to be used for Grid resource monitoring and discovery.

The Information System that is used in gLite 3.1 inherits its main concepts from the Globus Monitoring and Discovery Service (MDS). [5] However, the GRIS and GIIS in MDS has been replaced by the Berkeley Database Information Index (BDII) which is essentially an OpenLDAP server that is updated by an external process.

Workload management

The purpose of the Workload Management System (WMS) [6] is to accept user jobs, to assign them to the most appropriate Computing Element, to record their status and retrieve their output. The Resource Broker (RB) is the machine where the WMS services run.

Jobs to be submitted are described using the Job Description Language (JDL), which specifies, for example, which executable to run and its parameters, files to be moved to and from the Worker Node on which the job is run, input Grid files needed, and any requirements on the CE and the Worker Node.

The choice of CE to which the job is sent is made in a process called match-making, which first selects, among all available CEs, those which fulfill the requirements expressed by the user and which are close to specified input Grid files. It then chooses the CE with the highest rank, a quantity derived from the CE status information which expresses the goodness of a CE (typically a function of the numbers of running and queued jobs).

The RB locates the Grid input files specified in the job description using a service called the Data Location Interface (DLI), which provides a generic interface to a file catalogue. In this way, the Resource Broker can talk to file catalogs other than LFC (provided that they have a DLI interface).

The most recent implementation of the WMS from EGEE allows not only the submission of single jobs, but also collections of jobs (possibly with dependencies between them) in a much more efficient way then the old LCG-2 WMS, and has many other new options.

Finally, the Logging and Bookkeeping service (LB) [7] tracks jobs managed by the WMS. It collects events from many WMS components and records the status and history of the job.

Related Research Articles

Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.

Storage Resource Broker (SRB) is data grid management computer software used in computational science research projects. SRB is a logical distributed file system based on a client-server architecture which presents users with a single global logical namespace or file hierarchy. Essentially, the software enables a user to use a single mechanism to work with multiple data sources.

UNICORE is a grid computing technology for resources such as supercomputers or cluster systems and information stored in databases. UNICORE was developed in two projects funded by the German ministry for education and research (BMBF). In European-funded projects UNICORE evolved to a middleware system used at several supercomputer centers. UNICORE served as a basis in other research projects. The UNICORE technology is open source under BSD licence and available at SourceForge.

The Storage Resource Management (SRM) technology was initiated by the Scientific Data Management Group at Lawrence Berkeley National Laboratory (LBNL) and developed in response to the growing needs of managing large datasets on a variety of storage systems.

<span class="mw-page-title-main">European Grid Infrastructure</span> Effort to provide access to high-throughput computing resources across Europe

European Grid Infrastructure (EGI) is a series of efforts to provide access to high-throughput computing resources across Europe using grid computing techniques. The EGI links centres in different European countries to support international research in many scientific disciplines. Following a series of research projects such as DataGrid and Enabling Grids for E-sciencE, the EGI Foundation was formed in 2010 to sustain the services of EGI.

The D-Grid Initiative was a government project to fund computer infrastructure for education and research (e-Science) in Germany. It uses the term grid computing. D-Grid started September 1, 2005 with six community projects and an integration project (DGI) as well as several partner projects.

Wireless grids are wireless computer networks consisting of different types of electronic devices with the ability to share their resources with any other device in the network in an ad hoc manner. A definition of the wireless grid can be given as: "Ad hoc, distributed resource-sharing networks between heterogeneous wireless devices" The following key characteristics further clarify this concept:

<span class="mw-page-title-main">Open Grid Forum</span> Computing standards organization

The Open Grid Forum (OGF) is a community of users, developers, and vendors for standardization of grid computing. It was formed in 2006 in a merger of the Global Grid Forum and the Enterprise Grid Alliance. The OGF models its process on the Internet Engineering Task Force (IETF), and produces documents with many acronyms such as OGSA, OGSI, and JSDL.

The Nordic Data Grid Facility, or NDGF, is a common e-Science infrastructure provided by the Nordic countries for scientific computing and data storage. It is the first and so far only internationally distributed WLCG Tier1 center, providing computing and storage services to experiments at CERN.

The INFN Grid project was an initiative of the Istituto Nazionale di Fisica Nucleare (INFN) —Italy's National Institute for Nuclear Physics—for grid computing. It was intended to develop and deploy grid middleware services to allow INFN's users to transparently and securely share the computing and storage resources together with applications and technical facilities for scientific collaborations.

<span class="mw-page-title-main">Computer cluster</span> Set of computers configured in a distributed computing system

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

<span class="mw-page-title-main">GLite-AMGA</span>

The ARDA Metadata Grid Application (AMGA) is a general purpose metadata catalogue and part of the European Middleware Initiative middleware distribution. It was originally developed by the EGEE project as part of its gLite middleware, when it became clear that many Grid applications needed metadata information on files and to organize a work-flow. AMGA is now developed and supported by the European Middleware Initiative.

<span class="mw-page-title-main">P-GRADE Portal</span> Grid computing software

The P-GRADE Grid Portal was software for web portals to manage the life-cycle of executing a parallel application in grid computing. It was developed by the MTA SZTAKI Laboratory of Parallel and Distributed Systems (LPDS) at the Hungarian Academy of Sciences, Hungary, from around 2005 through 2010.

<span class="mw-page-title-main">MTA SZTAKI Laboratory of Parallel and Distributed Systems</span> Hungarian research laboratory

The Laboratory of Parallel and Distributed Systems (LPDS), as a department of MTA SZTAKI, is a research laboratory in distributed grid and cloud technologies. LPDS is a founding member of the Hungarian Grid Competence Centre, the Hungarian National Grid Initiative, and the Hungarian OpenNebula Community and also coordinates several European grid/cloud projects.

gUSE Grid computing framework

The Grid and Cloud User Support Environment (gUSE), also known as WS-PGRADE /gUSE, is an open source science gateway framework that enables users to access grid and cloud infrastructures. gUSE is developed by the Laboratory of Parallel and Distributed Systems (LPDS) at Institute for Computer Science and Control (SZTAKI) of the Hungarian Academy of Sciences.

<span class="mw-page-title-main">OpenNebula</span> Cloud computing platform for managing heterogeneous distributed data center infrastructures

OpenNebula is a hyper-converged infrastructure platform for managing heterogeneous distributed data center infrastructures. The OpenNebula platform manages a data center's virtual infrastructure to build private, public and hybrid implementations of Infrastructure as a Service. The two primary uses of the OpenNebula platform are data center virtualization and cloud deployments based on the KVM hypervisor, LXD/LXC system containers, and AWS Firecracker microVMs. The platform is also capable of offering the cloud infrastructure necessary to operate a cloud on top of existing VMware infrastructure. In early June 2020, OpenNebula announced the release of a new Enterprise Edition for corporate users, along with a Community Edition. OpenNebula CE is free and open-source software, released under the Apache License version 2. OpenNebula CE comes with free access to maintenance releases but with upgrades to new minor/major versions only available for users with non-commercial deployments or with significant contributions to the OpenNebula Community. OpenNebula EE is distributed under a closed-source license and requires a commercial Subscription.

The SHIWA project within grid computing was a project led by the LPDS of MTA Computer and Automation Research Institute. The project coordinator was Prof. Dr. Peter Kacsuk. It started on 1 July 2010 and lasted two years. SHIWA was supported by a grant from the European Commission's FP7 INFRASTRUCTURES-2010-2 call under grant agreement n°261585.

<span class="mw-page-title-main">DIET</span>

DIET is a software for grid-computing. As middleware, DIET sits between the operating system and the application software. DIET was created in 2000. It was designed for high-performance computing. It is currently developed by INRIA, École Normale Supérieure de Lyon, CNRS, Claude Bernard University Lyon 1, SysFera. It is open-source software released under the CeCILL license.

The Generic Grid-Grid (3G) Bridge is an open-source core job bridging component between different grid infrastructures. Its development started in 2008 within the CancerGrid and EDGeS projects. The aim was to create a generic bridge component that can be used in different grid interoperability scenarios. The 3G Bridge used within the EDGeS project that provides the core component of the Service Grid - Desktop Grid interoperability solution. 3G Bridge helps to connect user communities of different grid systems. For example, communities working on parameter sweep problems and using service grid infrastructures can migrate their applications to the more adequate desktop grid platform using the 3G Bridge technology, resulting in an accelerated research.

<span class="mw-page-title-main">European Middleware Initiative</span>

The European Middleware Initiative (EMI) is a computer software platform for high performance distributed computing. It is developed and distributed directly by the EMI project. It is the base for other grid middleware distributions used by scientific research communities and distributed computing infrastructures all over the world especially in Europe, South America and Asia. EMI supports broad scientific experiments and initiatives, such as the Worldwide LHC Computing Grid.

References

  1. Foster, Kesselman, Tuecke, The Anatomy of the Grid: Enabling Scalable Virtual Organizations Archived 2009-03-10 at the Wayback Machine , Int. J. High Performance Computing Applicat., 2001
  2. The Globus Toolkit 4.0, Overview of the Grid Security Infrastructure Archived 2008-04-20 at the Wayback Machine
  3. CESGA Experience with the Grid Engine batch system
  4. OGF MDS 2.2 Features Archived 2012-12-13 at the Wayback Machine in the Globus Toolkit 2.2 Release
  5. GLUE Working Group (GLUE)
  6. F Pacini, EGEE User's Guide, WMS Service, DATAMAT, 2005
  7. EGEE User's Guide, Service Logging and Bookkeeping (L&B), CESNET, 2005