Advanced Resource Connector (ARC)

Developer(s): NorduGrid, NeIC, EU projects
Initial release: 13 April 2004
Stable release: 6.19 / 10 April 2024
Repository: github.com/nordugrid/arc
Written in: C++, PHP, Perl, Python, Shell
Operating system: Linux, Microsoft Windows, Mac OS X
Available in: English, Russian, Swedish
Type: Grid computing
License: Apache License 2.0 [1]
Website: www.nordugrid.org

Advanced Resource Connector (ARC) is a grid computing middleware introduced by NorduGrid. It provides a common interface for submission of computational tasks to different distributed computing systems and thus can enable grid infrastructures of varying size and complexity. The set of services and utilities providing the interface is known as ARC Computing Element (ARC-CE). [2] ARC-CE functionality includes data staging and caching, developed in order to support data-intensive distributed computing. [3] ARC is an open source software distributed under the Apache License 2.0. [1]
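
As an illustration of the submission interface, the sketch below submits a trivial job through the ARC command-line client from Python. It is a minimal sketch under stated assumptions: an installed ARC client, a valid proxy certificate, and a placeholder computing element name (ce.example.org); the xRSL job description is a toy example.

    # Minimal sketch: submit a toy job via the ARC client tools.
    # Assumptions: ARC client installed, valid proxy (arcproxy) in place,
    # and "ce.example.org" replaced by a real computing element.
    import subprocess
    import tempfile

    # xRSL, ARC's native job description language.
    xrsl = (
        '&(executable="/bin/echo")'
        '(arguments="hello grid")'
        '(stdout="stdout.txt")'
        '(jobName="arc-demo")'
    )

    with tempfile.NamedTemporaryFile("w", suffix=".xrsl", delete=False) as f:
        f.write(xrsl)
        job_file = f.name

    # arcsub submits the description to the chosen computing element;
    # arcstat and arcget would later query status and fetch the output.
    result = subprocess.run(
        ["arcsub", "-c", "ce.example.org", job_file],
        capture_output=True, text=True,
    )
    print(result.stdout or result.stderr)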

History

ARC appeared (and is still often referred to) as the NorduGrid middleware, originally proposed as an architecture on top of the Globus Toolkit [4] optimized for the needs of High-Energy Physics computing for the Large Hadron Collider experiments. [5] The first deployment of ARC on the NorduGrid testbed took place in summer 2002, and by 2003 it was used to support complex computations. [6]

The first stable release of ARC (version 0.4) came out in April 2004 under the GNU General Public License. [7] The name "Advanced Resource Connector" was introduced for this release to distinguish the middleware from the infrastructure. In the same year, the Swedish national Grid project Swegrid became the first large cross-discipline infrastructure to be based on ARC. [8]

In 2005, NorduGrid was formally established as a collaboration to support and coordinate ARC development. [9] In 2006 two closely related projects were launched: the Nordic Data Grid Facility, deploying a pan-Nordic e-Science infrastructure based on ARC, and KnowARC, focused on transforming ARC into a next generation Grid middleware. [10]

ARC v0.6 was released in May 2007, becoming the second stable release. [11] Its key feature was the introduction of a client library enabling easy development of higher-level applications. It was also the first ARC release to make use of open standards, as it included support for JSDL. Later that year, the first technology preview of the next-generation ARC middleware was made available, though it was not distributed with ARC itself. [12] The new approach involved switching to a Web service based architecture and, in general, a very substantial refactoring of the core code. [13]

In 2008, the NorduGrid consortium adopted the Apache License for all ARC components. [14]

The last stable release in the 0.x line was ARC v0.8, shipped in September 2009. [15] It eventually included a preview version of the new execution service, A-REX, and several other components, such as Chelonia, ISIS, Charon and the arcjobtool GUI. [16]

In parallel to ARC v0.8, the EU KnowARC project released the conceptual ARC NOX suite in November 2009, a complete Grid solution fully based on Web service technologies. [17] The name NOX refers to the release date: November of the Year of the Ox. [17]

In May 2011, NorduGrid released ARC v11.05, adopting an Ubuntu-like versioning scheme. This release marked the complete transition from the old execution service to A-REX and accompanying services. For backwards compatibility with existing infrastructures, the old interfaces for the execution service and the information system were retained. [18]

ARC 6 was released in May 2019. [19] While retaining the same interfaces, it features a completely redesigned configuration and a new management tool. [20]

Source code

ARC is free software available from the NorduGrid public repository, both as source code and as binary packages for a variety of Linux systems, as well as on GitHub. [21] The open-source development of the ARC middleware is coordinated by the NorduGrid collaboration. Contributions to the software, documentation and dissemination activities come from the community and from various projects, such as the EU KnowARC and EMI projects, NDGF, NeIC and various national infrastructure and research projects.

Versioning


Between 2011 and 2018 ARC used an Ubuntu-like versioning scheme for bundled releases consisting of individual components. Individual components have their own versioning, corresponding to code tags. [22] The version of the core ARC packages is often used instead of the formal release number in everyday communication. Starting with ARC 6 (2019), the release version number coincides with that of the code tag.
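
The two schemes can be contrasted with a small illustrative sketch (Python). The cut-off and version strings below are examples chosen for illustration; they are not part of any ARC tooling.

    # Illustrative only: distinguish the Ubuntu-like bundled release
    # numbers used in 2011-2018 ("YY.MM") from the component-tag
    # numbering used from ARC 6 onwards (e.g. "6.19").
    def describe_release(version: str) -> str:
        major, minor = version.split(".")
        if int(major) >= 11:  # bundled releases started at 11.05
            return f"bundled release of 20{major}-{minor}"
        return f"component-tag release {version} (ARC 6 era)"

    print(describe_release("11.05"))  # bundled release of 2011-05
    print(describe_release("6.19"))   # component-tag release 6.19 (ARC 6 era)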

Standards and interoperability

ARC implements several Open Grid Forum standards, in particular JSDL, GLUE2, BES, UR/RUS and StAR. [23]
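
As an illustration of one of these standards, the following sketch builds a minimal JSDL job description in Python, using the namespaces published with JSDL 1.0. The job itself (executable and argument) is a toy example, and the sketch is not ARC code.

    # Minimal JSDL document built with the standard library.
    import xml.etree.ElementTree as ET

    JSDL = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
    POSIX = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"

    job = ET.Element(f"{{{JSDL}}}JobDefinition")
    desc = ET.SubElement(job, f"{{{JSDL}}}JobDescription")
    ident = ET.SubElement(desc, f"{{{JSDL}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL}}}JobName").text = "jsdl-demo"
    app = ET.SubElement(desc, f"{{{JSDL}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX}}}Executable").text = "/bin/echo"
    ET.SubElement(posix, f"{{{POSIX}}}Argument").text = "hello grid"

    print(ET.tostring(job, encoding="unicode"))

ARC introduced JSDL support with v0.6, alongside its native xRSL job description language.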

ARC in various projects and initiatives

European Middleware Initiative

In 2010-2013, several key ARC components, most notably HED, A-REX, and the client tools and libraries, were included in the European Middleware Initiative (EMI) software stack. Through EMI, ARC became part of the Unified Middleware Distribution (UMD) of the European Grid Infrastructure (EGI).

Nordic DataGrid Facility and NeIC

ARC is the basis of the computing infrastructure of the Nordic Data Grid Facility (NDGF), which constitutes a Tier1 center of the WLCG. In 2006-2010 NDGF actively contributed to ARC development, and since 2010 it has provided ARC deployment expertise within EGI. In 2012, NDGF became part of the Nordic e-Infrastructure Collaboration as the Nordic Tier-1 (NT1) project. [24]

KnowARC project

Grid-enabled Know-how Sharing Technology Based on ARC Services and Open Standards (KnowARC) was a Sixth Framework Programme Specific Targeted Research Project, funded under Priority IST-2005-2.5.4 "Advanced Grid Technologies, Systems and Services" from June 2006 to November 2009. [25] [26] In many ways it was the project that shaped ARC. Its main goal was to base ARC on open community standards, and among its key results was the creation of the standardized Hosting Environment for ARC services (HED).

Apart from its main aim of further developing ARC, [13] it contributed to the development of standards, [27] and increased Grid and ARC usage in medicine and bioinformatics. [28] [29]

In July 2009, KnowARC announced that it had contributed to the integration of Grid technologies into official Linux repositories by adding Globus Toolkit components to the Fedora and Debian repositories. [30]

Related Research Articles

Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.

Storage Resource Broker (SRB) is data grid management computer software used in computational science research projects. SRB is a logical distributed file system based on a client-server architecture which presents users with a single global logical namespace or file hierarchy. Essentially, the software enables a user to use a single mechanism to work with multiple data sources.

UNICORE (UNiform Interface to COmputing REsources) is a grid computing technology for resources such as supercomputers or cluster systems and information stored in databases. UNICORE was developed in two projects funded by the German ministry for education and research (BMBF). In European-funded projects, UNICORE evolved into a middleware system used at several supercomputer centers. UNICORE has served as a basis for other research projects. The UNICORE technology is open source under the BSD licence and available on SourceForge.

The cancer Biomedical Informatics Grid (caBIG) was a US government program to develop an open-source, open access information network called caGrid for secure data exchange on cancer research. The initiative was developed by the National Cancer Institute and was maintained by the Center for Biomedical Informatics and Information Technology (CBIIT) and program managed by Booz Allen Hamilton. In 2011 a report on caBIG raised significant questions about effectiveness and oversight, and its budget and scope were significantly trimmed. In May 2012, the National Cancer Informatics Program (NCIP) was created as caBIG's successor program.

EPCC, formerly the Edinburgh Parallel Computing Centre, is a supercomputing centre based at the University of Edinburgh. Since its foundation in 1990, its stated mission has been to accelerate the effective exploitation of novel computing throughout industry, academia and commerce.

NorduGrid is a collaboration aiming at the development, maintenance and support of the free Grid middleware known as the Advanced Resource Connector (ARC).

GridWay is an open-source meta-scheduling technology that enables large-scale, secure, reliable and efficient sharing of computing resources, managed by different distributed resource management systems (DRMS), such as SGE, HTCondor, PBS or LSF, within a single organization or scattered across several administrative domains. To this end, GridWay supports several Grid middlewares.

European Grid Infrastructure (EGI) is a series of efforts to provide access to high-throughput computing resources across Europe using grid computing techniques. The EGI links centres in different European countries to support international research in many scientific disciplines. Following a series of research projects such as DataGrid and Enabling Grids for E-sciencE, the EGI Foundation was formed in 2010 to sustain the services of EGI.

The D-Grid Initiative was a government project to fund computer infrastructure for education and research (e-Science) in Germany. It uses the term grid computing. D-Grid started September 1, 2005 with six community projects and an integration project (DGI) as well as several partner projects.

The Simple API for Grid Applications (SAGA) is a family of related standards specified by the Open Grid Forum to define an application programming interface (API) for common distributed computing functionality.

GARUDA (Global Access to Resource Using Distributed Architecture) is India's Grid Computing initiative connecting 17 cities across the country. The 45 participating institutes in this nationwide project include all the IITs and C-DAC centers and other major institutes in India.

The Nordic Data Grid Facility, or NDGF, is a common e-Science infrastructure provided by the Nordic countries for scientific computing and data storage. It is the first and so far only internationally distributed WLCG Tier1 center, providing computing and storage services to experiments at CERN.

The INFN Grid project was an initiative of the Istituto Nazionale di Fisica Nucleare (INFN), Italy's National Institute for Nuclear Physics, for grid computing. It was intended to develop and deploy grid middleware services to allow INFN's users to transparently and securely share the computing and storage resources together with applications and technical facilities for scientific collaborations.

gLite is a middleware computer software project for grid computing used by the CERN LHC experiments and other scientific domains. It was implemented by collaborative efforts of more than 80 people in 12 different academic and industrial research centers in Europe. gLite provides a framework for building applications tapping into distributed computing and storage resources across the Internet. The gLite services were adopted by more than 250 computing centres, and used by more than 15,000 researchers in Europe and around the world.

The P-GRADE Grid Portal was software for web portals to manage the life-cycle of executing a parallel application in grid computing. It was developed by the MTA SZTAKI Laboratory of Parallel and Distributed Systems (LPDS) at the Hungarian Academy of Sciences, Hungary, from around 2005 through 2010.

The SHIWA project within grid computing was a project led by the LPDS of MTA Computer and Automation Research Institute. The project coordinator was Prof. Dr. Peter Kacsuk. It started on 1 July 2010 and lasted two years. SHIWA was supported by a grant from the European Commission's FP7 INFRASTRUCTURES-2010-2 call under grant agreement n°261585.

GridRPC, in distributed computing, is Remote Procedure Call over a grid. This paradigm was proposed by the GridRPC working group of the Open Grid Forum (OGF), and an API has been defined in order for clients to access remote servers as simply as a function call. It is used among numerous Grid middleware for its simplicity of implementation, and was standardized by the OGF in 2007. For interoperability between the different existing middleware, the API was followed by a document describing good use and behavior of the different GridRPC API implementations. Work has since been conducted on GridRPC Data Management, which was standardized in 2011.

The European Middleware Initiative (EMI) is a computer software platform for high performance distributed computing. It is developed and distributed directly by the EMI project. It is the base for other grid middleware distributions used by scientific research communities and distributed computing infrastructures all over the world especially in Europe, South America and Asia. EMI supports broad scientific experiments and initiatives, such as the Worldwide LHC Computing Grid.

A data grid is an architecture or set of services that gives individuals or groups of users the ability to access, modify and transfer extremely large amounts of geographically distributed data for research purposes. Data grids make this possible through a host of middleware applications and services that pull together data and resources from multiple administrative domains and then present it to users upon request. The data in a data grid can be located at a single site or multiple sites where each site can be its own administrative domain governed by a set of security restrictions as to who may access the data. Likewise, multiple replicas of the data may be distributed throughout the grid outside their original administrative domain and the security restrictions placed on the original data for who may access it must be equally applied to the replicas. Specifically developed data grid middleware is what handles the integration between users and the data they request by controlling access while making it available as efficiently as possible.

DaviX is an open-source client for WebDAV and Amazon S3 available for Microsoft Windows, Apple macOS and Linux. DaviX is written in C++ and provides several command-line tools and a C++ shared library.

References

  1. NorduGrid Downloads
  2. "ARC Computing Element System Administrator Guide" (PDF). NorduGrid. 25 June 2015. Retrieved 26 June 2015.
  3. Ellert, Mattias; et al. (February 2007). "Advanced Resource Connector middleware for lightweight computational Grids". Future Generation Computer Systems. 23 (2): 219–240. doi:10.1016/j.future.2006.05.008.
  4. Ellert, Mattias; Konstantinov, Aleksandr; Kónya, Balázs; Smirnova, Oxana; Wäänänen, Anders (2003). "The NorduGrid project: using Globus toolkit for building GRID infrastructure". Nuclear Instruments and Methods in Physics Research A. 502 (2–3): 407–410. Bibcode:2003NIMPA.502..407E. doi:10.1016/S0168-9002(03)00453-4.
  5. Wäänänen, Anders; Ellert, Mattias; Konstantinov, Aleksandr; Kónya, Balázs (2002). "An Overview of an Architecture Proposal for a High Energy Physics Grid". In Fagerholm, Juha; Haataja, Juha; Järvinen, Jari; Lyly, Mikko; Råback, Peter; Savolainen, Ville (eds.). Lecture Notes in Computer Science. Vol. 2367. Springer. pp. 76–86. doi:10.1007/3-540-48051-X_9. ISBN 978-3-540-43786-4.
  6. Eerola, Paula; et al. (2003). "Atlas Data-Challenge 1 on NorduGrid". Proceedings of 2003 Conference for Computing in High Energy and Nuclear Physics. arXiv:physics/0306013. Bibcode:2003physics...6013E.
  7. ARC 0.4 Release Notes
  8. "SweGrid gets set for future challenges". CERN Courier. 2004.
  9. NorduGrid Web site
  10. "Grid-enabled know-how sharing technology based on ARC services and open standards".
  11. ARC 0.6 Release Notes
  12. "KnowARC report D5.1-2_07" (PDF). Archived from the original (PDF) on 2010-11-08. Retrieved 2009-08-22.
  13. Smirnova, Oxana; et al. (2009). "ARC middleware: evolution towards standards-based interoperability" (PDF). Proceedings of the 17th International Conference on Computing in High Energy and Nuclear Physics.
  14. "NorduGrid ARC License".
  15. ARC 0.8 Release Notes
  16. ARC 0.8.2 Release Notes
  17. ARC NOX Release Notes
  18. ARC 11.05 Release Notes
  19. ARC 6 Release Notes
  20. ARC 6 Documentation
  21. "NorduGrid ARC". GitHub.
  22. ARC releases table
  23. W. Qiang (31 October 2012). Transparent use of open standards in the EMI component ecosystem (Report). CERN.
  24. NeIC Web site
  25. KnowARC fact-sheet, EU IST database
  26. Hämmerle, Hannelore; Crémel, Nicole (November 2006). "KnowARC project gets going". CERN Courier. 46 (11). Geneva, Switzerland: 12.
  27. Field, Laurence; Andreozzi, Sergio; Kónya, Balázs (2008). "Grid Information System Interoperability: The Need for a Common Information Model". 2008 IEEE Fourth International Conference on eScience. pp. 501–507. doi:10.1109/eScience.2008.159. ISBN 978-1-4244-3380-3. S2CID 11545984.
  28. Zhou, Xin; et al. (2009). "An Easy Setup for Parallel Medical Image Processing: Using Taverna and ARC". Studies in Health Technology and Informatics. 147 (Healthgrid Research, Innovation and Business Case): 41–50. doi:10.3233/978-1-60750-027-8-41. PMID 19593043.
  29. Krabbenhöft, Hajo; Möller, Steffen; Bayer, Daniel (2008). "Integrating ARC grid middleware with Taverna workflows". Bioinformatics. 24 (9): 1221–1222. doi:10.1093/bioinformatics/btn095. PMID 18353787.
  30. "KnowARC Project Brings Grids to Debian". HPC Wire. July 9, 2009. Archived from the original on September 5, 2009.

Further reading