SHIWA project


The SHIWA project (Sharing Interoperable Workflows for large-scale scientific simulations on Available DCIs) was a grid computing project led by the Laboratory of Parallel and Distributed Systems (LPDS) of the MTA Computer and Automation Research Institute. The project coordinator was Prof. Dr. Peter Kacsuk. The project started on 1 July 2010 and ran for two years. SHIWA was supported by a grant from the European Commission's FP7 INFRASTRUCTURES-2010-2 call under grant agreement n° 261585.


The SHIWA project developed and deployed the SHIWA Simulation Platform (SSP) to enable infrastructure and workflow interoperability at two levels: coarse-grained interoperability, in which one workflow system embeds and executes workflows of another system, and fine-grained interoperability, in which workflows are translated into a common intermediate representation.

After the project ended, the SHIWA technology was taken over by the ER-flow support action project to ensure sustainability and to extend the user community.

Background and motivations

Scientists of all disciplines have invested tremendous effort in the exploitation of Distributed Computing Infrastructures (DCIs) for their ability to support compute-intensive in-silico experiments and virtual organisations. Many DCIs with large user communities emerged during the preceding decade, such as the Distributed European Infrastructure for Supercomputing Applications (DEISA) [Niederberger and Mextorf 2005], the EGEE Grid (Enabling Grids for E-sciencE) [EGEE n.d.], the German D-Grid initiative [Gentzsch 2006], the UK National Grid Service (NGS) [NGS n.d.] and the North American TeraGrid [TeraGrid n.d.]. They are based on different middleware stacks that provide an abstraction layer between computing resources and applications. For example, NGS and TeraGrid are built on the Globus Toolkit [Foster 2006], EGEE on gLite [gLite n.d.], DEISA relies on both the Globus Toolkit and Unicore [Erwin and Snelling 2002], while D-Grid runs gLite, the Globus Toolkit and Unicore. In Europe, this momentum culminated in 2010 with the emergence of the European Grid Initiative (EGI), which federates the major European organisations related to distributed computing and the National Grid Initiatives (NGIs). In its effort to create the next generation of pan-European DCI, EGI faces unprecedented challenges related to the heterogeneity of national grid infrastructures, resources and operating middleware.

Production DCIs are commonly built on a large number of components, such as data resources, metadata catalogues, authentication and authorisation methods, and software repositories. Managing the execution of applications on DCIs is consequently a complex task, and solutions developed for one particular grid are difficult to port to other infrastructures. In order to shield this complexity from researchers and to facilitate the design of in-silico experiments, workflow systems are widely used as a virtualisation layer on top of the underlying infrastructures. They have become essential for integrating expertise about both the application (user domain) and the DCI (infrastructure domain) in order to optimise and support research in the scientific computing community.

In the current multi-DCI panorama, users need to access different infrastructures in order to enlarge the variety of usable resources, as well as to share and reuse domain-specific resources. Interoperability among DCIs is, however, hardly achieved at the middleware level. SHIWA considers the EGI production infrastructure a major DCI of great interest for European scientists to design and simulate experiments in-silico. It directly addresses the challenges related to (i) the design of scientific experiments through the description of simulation workflows and (ii) the middleware heterogeneities encountered among the many existing DCIs, through workflow interoperability techniques.

Concepts and project objectives

SHIWA aimed to improve the experience of Virtual Research Communities that rely heavily on DCIs for their scientific experimentation. With the recent multiplication of efforts dedicated to e-infrastructures, scientific simulation can now benefit from the availability of massive computing and data storage facilities to sustain multi-disciplinary scientific challenges. As a side effect, a variety of non-interoperable technologies coexist to enable the exploitation of computing infrastructures for in-silico experiments. In Europe, this momentum is culminating with the emergence of the EGI, which federates the major European organisations related to distributed computing and the NGIs. Consequently, European research on simulation is hampered by several interoperability issues that reduce its efficiency by limiting knowledge and expertise sharing among scientific communities. SHIWA was designed as a user-centred project aiming to lower barriers among scientific communities by providing services that tackle interoperability issues. In particular, SHIWA's work programme focused on improving the efficiency of workflow-based in-silico experiments by targeting three objectives.

Workflow interoperability

Workflow interoperability enables the execution of workflows of different workflow systems that may span multiple heterogeneous infrastructures (DCIs). It can facilitate application migration as infrastructures, services and workflow systems evolve. Workflow interoperability also allows workflow sharing, which supports and fosters the adoption of common research methodologies, improves the efficiency and reliability of research through their reuse, increases the lifetime of workflows and reduces the development time for new workflows. Interoperability among workflow systems not only permits the development and enactment of large-scale, comprehensive workflows, but also reduces the existing gap between different DCIs and consequently promotes cooperation among the research communities exploiting them. As workflow systems enable researchers to build comprehensive workflow applications for DCIs, the project consortium identified workflow interoperability as the most promising approach to bridge the existing gaps among DCIs. Workflow and DCI interoperability is of paramount importance for advancing the quality and impact of scientific applications that target DCIs, as it enables advanced features that were previously not available.
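To make the coarse-grained approach concrete, the following minimal sketch (in Python, purely illustrative) models a host workflow in which an entire workflow of another engine appears as a single node. The names HostWorkflow and ForeignEngineTask and their methods are hypothetical and do not correspond to actual SHIWA components; a real platform would stage data, invoke the foreign engine's submission service on the target DCI and collect the results.

```python
# Illustrative sketch only: HostWorkflow, ForeignEngineTask and their methods are
# hypothetical names, not part of the SHIWA code base. The point is the structure
# of coarse-grained interoperability: a host workflow system treats a whole
# workflow of another ("foreign") engine as one task of its own graph.

from dataclasses import dataclass, field


@dataclass
class ForeignEngineTask:
    """A node in the host workflow that delegates execution to another engine."""
    engine: str            # e.g. "Taverna", "MOTEUR", "ASKALON"
    workflow_ref: str      # identifier of the foreign workflow in a repository
    inputs: dict = field(default_factory=dict)

    def run(self) -> dict:
        # A real implementation would stage inputs, call the foreign engine's
        # submission service on the target DCI, poll for completion and fetch
        # the outputs. Here we only model the control flow.
        print(f"Submitting {self.workflow_ref} to {self.engine} ...")
        return {"status": "done", "outputs": {}}


@dataclass
class HostWorkflow:
    """A trivially sequential host workflow whose nodes may be foreign workflows."""
    tasks: list

    def execute(self) -> list:
        return [task.run() for task in self.tasks]


if __name__ == "__main__":
    wf = HostWorkflow(tasks=[
        ForeignEngineTask(engine="Taverna", workflow_ref="preprocess-v1"),
        ForeignEngineTask(engine="MOTEUR", workflow_ref="simulate-v2"),
    ])
    wf.execute()
```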

SHIWA developed workflow interoperability solutions for several workflow systems, namely ASKALON [Fahringer et al. 2005], MOTEUR [Glatard et al. 2008], Pegasus [Deelman 2005], P-GRADE [Kacsuk et al. 2003], Galaxy, GWES, Kepler, LONI Pipeline, Taverna, ProActive and Triana [Majithia et al. 2004]. In so doing, it provided access to grids built on gLite and Globus middleware in order to create production-level services for running workflow-based large-scale simulations. The project used existing grid middleware interoperability solutions enabling access to gLite- and Globus-based grids such as the Austrian Grid, D-Grid, EGEE and NGS. The project consortium also considered support for the EMI-supported NorduGrid Advanced Resource Connector (ARC) [Ellert 2007] and Unicore.

Project Partners

Subcontractors

Related Research Articles

Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.

UNICORE (UNiform Interface to COmputing REsources) is a grid computing technology for resources such as supercomputers or cluster systems and information stored in databases. UNICORE was developed in two projects funded by the German ministry for education and research (BMBF). In European-funded projects UNICORE evolved to a middleware system used at several supercomputer centers. UNICORE served as a basis in other research projects. The UNICORE technology is open source under BSD licence and available at SourceForge.

The cancer Biomedical Informatics Grid (caBIG) was a US government program to develop an open-source, open access information network called caGrid for secure data exchange on cancer research. The initiative was developed by the National Cancer Institute and was maintained by the Center for Biomedical Informatics and Information Technology (CBIIT) and program managed by Booz Allen Hamilton. In 2011 a report on caBIG raised significant questions about effectiveness and oversight, and its budget and scope were significantly trimmed. In May 2012, the National Cancer Informatics Program (NCIP) was created as caBIG's successor program.

Advanced Resource Connector

Advanced Resource Connector (ARC) is a grid computing middleware introduced by NorduGrid. It provides a common interface for submission of computational tasks to different distributed computing systems and thus can enable grid infrastructures of varying size and complexity. The set of services and utilities providing the interface is known as ARC Computing Element (ARC-CE). ARC-CE functionality includes data staging and caching, developed in order to support data-intensive distributed computing. ARC is an open source software distributed under the Apache License 2.0.

European Grid Infrastructure

European Grid Infrastructure (EGI) is a series of efforts to provide access to high-throughput computing resources across Europe using grid computing techniques. The EGI links centres in different European countries to support international research in many scientific disciplines. Following a series of research projects such as DataGrid and Enabling Grids for E-sciencE, the EGI Foundation was formed in 2010 to sustain the services of EGI.

The D-Grid Initiative was a government project to fund computing infrastructure for education and research (e-Science) in Germany, based on grid computing. D-Grid started on September 1, 2005 with six community projects and an integration project (DGI), as well as several partner projects.

Open Grid Forum

The Open Grid Forum (OGF) is a community of users, developers, and vendors for standardization of grid computing. It was formed in 2006 in a merger of the Global Grid Forum and the Enterprise Grid Alliance. The OGF models its process on the Internet Engineering Task Force (IETF), and produces standards documents such as OGSA, OGSI, and JSDL.

The Simple API for Grid Applications (SAGA) is a family of related standards specified by the Open Grid Forum to define an application programming interface (API) for common distributed computing functionality.
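As an illustration of what such an API looks like in practice, the sketch below uses RADICAL-SAGA, a Python implementation of the SAGA standard, to describe and run a trivial local job. The endpoint URL and job parameters are placeholders; a grid deployment would use a middleware-specific endpoint instead.

```python
# Minimal sketch of submitting a job through a SAGA-style API, using the
# RADICAL-SAGA Python implementation of the OGF standard (pip install radical.saga).
# The "fork://localhost" endpoint simply runs the job on the local machine; in a
# grid setting it would be replaced by a middleware-specific URL.
import radical.saga as rs


def main():
    service = rs.job.Service("fork://localhost")   # job submission endpoint

    jd = rs.job.Description()                      # middleware-neutral job description
    jd.executable = "/bin/date"
    jd.output = "job.out"

    job = service.create_job(jd)
    job.run()
    job.wait()                                     # block until the job finishes
    print("final state:", job.state)


if __name__ == "__main__":
    main()
```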

The Nordic Data Grid Facility, or NDGF, is a common e-Science infrastructure provided by the Nordic countries for scientific computing and data storage. It is the first and so far only internationally distributed WLCG Tier1 center, providing computing and storage services to experiments at CERN.

The INFN Grid project was an initiative of the Istituto Nazionale di Fisica Nucleare (INFN) —Italy's National Institute for Nuclear Physics—for grid computing. It was intended to develop and deploy grid middleware services to allow INFN's users to transparently and securely share the computing and storage resources together with applications and technical facilities for scientific collaborations.

gLite

gLite is a middleware computer software project for grid computing used by the CERN LHC experiments and other scientific domains. It was implemented by collaborative efforts of more than 80 people in 12 different academic and industrial research centers in Europe. gLite provides a framework for building applications tapping into distributed computing and storage resources across the Internet. The gLite services were adopted by more than 250 computing centres, and used by more than 15000 researchers in Europe and around the world.

P-GRADE Portal

The P-GRADE Grid Portal was software for web portals to manage the life-cycle of executing a parallel application in grid computing. It was developed by the MTA SZTAKI Laboratory of Parallel and Distributed Systems (LPDS) at the Hungarian Academy of Sciences, Hungary, from around 2005 through 2010.

MTA SZTAKI Laboratory of Parallel and Distributed Systems

The Laboratory of Parallel and Distributed Systems (LPDS), as a department of MTA SZTAKI, is a research laboratory in distributed grid and cloud technologies. LPDS is a founding member of the Hungarian Grid Competence Centre, the Hungarian National Grid Initiative, and the Hungarian OpenNebula Community, and also coordinates several European grid/cloud projects.

gUSE

The Grid and Cloud User Support Environment (gUSE), also known as WS-PGRADE /gUSE, is an open source science gateway framework that enables users to access grid and cloud infrastructures. gUSE is developed by the Laboratory of Parallel and Distributed Systems (LPDS) at Institute for Computer Science and Control (SZTAKI) of the Hungarian Academy of Sciences.

Péter Kacsuk

Péter Kacsuk is a Hungarian computer scientist at MTA-SZTAKI, Budapest, Hungary.

SLinCA@Home

SLinCA@Home was a research project that uses Internet-connected computers to do research in fields such as physics and materials science.

The Polish Grid Infrastructure (PL-Grid) is a nationwide computing infrastructure built in 2009-2011 under the scientific project PL-Grid - Polish Infrastructure for Supporting Computational Science in the European Research Space. Its purpose was to enable scientific research based on advanced computer simulations and large-scale computations using computer clusters, and to provide convenient access to computing resources for research teams, including those outside the communities in which the High Performance Computing centres operate.

The Generic Grid-Grid (3G) Bridge is an open-source core job bridging component between different grid infrastructures. Its development started in 2008 within the CancerGrid and EDGeS projects, with the aim of creating a generic bridge component that can be used in different grid interoperability scenarios. The 3G Bridge is used within the EDGeS project, where it provides the core component of the Service Grid - Desktop Grid interoperability solution. The 3G Bridge helps to connect user communities of different grid systems: for example, communities working on parameter sweep problems on service grid infrastructures can migrate their applications to the more suitable desktop grid platform using the 3G Bridge technology, resulting in accelerated research.
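The bridging idea can be sketched as a component that accepts jobs in one middleware-neutral form and routes them to per-infrastructure submitter plugins. The following Python sketch is purely illustrative: the Job, Submitter and Bridge classes are hypothetical and do not reflect the real 3G Bridge interfaces.

```python
# Illustrative sketch of a grid-to-grid bridge: jobs arrive in one neutral format
# and are dispatched to per-infrastructure submitter plugins. Class names are
# hypothetical and do not correspond to the real 3G Bridge code base.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Job:
    executable: str
    arguments: List[str] = field(default_factory=list)
    target_grid: str = "desktop-grid"     # e.g. "service-grid" or "desktop-grid"


class Submitter:
    """Base class for infrastructure-specific submission plugins."""
    def submit(self, job: Job) -> str:
        raise NotImplementedError


class DesktopGridSubmitter(Submitter):
    def submit(self, job: Job) -> str:
        return f"queued {job.executable} as a desktop-grid work unit"


class ServiceGridSubmitter(Submitter):
    def submit(self, job: Job) -> str:
        return f"submitted {job.executable} to a service-grid computing element"


class Bridge:
    """Routes neutral job records to the submitter registered for each target grid."""
    def __init__(self) -> None:
        self._submitters: Dict[str, Submitter] = {}

    def register(self, grid: str, submitter: Submitter) -> None:
        self._submitters[grid] = submitter

    def handle(self, job: Job) -> str:
        return self._submitters[job.target_grid].submit(job)


if __name__ == "__main__":
    bridge = Bridge()
    bridge.register("desktop-grid", DesktopGridSubmitter())
    bridge.register("service-grid", ServiceGridSubmitter())
    print(bridge.handle(Job(executable="/bin/simulate", target_grid="desktop-grid")))
```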

European Middleware Initiative

The European Middleware Initiative (EMI) is a computer software platform for high performance distributed computing. It is developed and distributed directly by the EMI project. It is the base for other grid middleware distributions used by scientific research communities and distributed computing infrastructures all over the world especially in Europe, South America and Asia. EMI supports broad scientific experiments and initiatives, such as the Worldwide LHC Computing Grid.

References

Official webpage