Pegasus (workflow management)

Pegasus
Developer(s): University of Southern California, Information Sciences Institute, University of Wisconsin-Madison
Stable release: 5.0 Beta1 / July 27, 2020
Written in: Java, Python, C
Operating system: macOS, Linux
Type: Workflow management system
License: Apache License 2.0
Website: pegasus.isi.edu

Pegasus is an open-source workflow management system. [1] [2] [3] It provides the necessary abstractions for scientists to create scientific workflows [4] and allows for transparent execution of these workflows on a range of computing platforms including high performance computing clusters, clouds, and national cyberinfrastructure. [5] [6] In Pegasus, workflows are described abstractly as directed acyclic graphs (DAGs) using a provided API for Jupyter Notebooks, Python, R, or Java. [7] During execution, Pegasus translates the constructed abstract workflow into an executable workflow [8] [9] which is executed and managed by HTCondor. [10] [11]
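The idea of describing a workflow abstractly as a DAG and then deriving something executable from it can be illustrated with a minimal sketch. It deliberately does not use the Pegasus API: it relies only on Python's standard `graphlib` module, and the job names (`preprocess`, `analyze_a`, `analyze_b`, `merge`) and helper functions are hypothetical, chosen purely for illustration. It shows only how dependencies in a DAG constrain the order in which jobs may run.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each job maps to the set of jobs it depends on.
abstract_workflow = {
    "preprocess": set(),
    "analyze_a": {"preprocess"},
    "analyze_b": {"preprocess"},
    "merge": {"analyze_a", "analyze_b"},
}


def execution_order(workflow):
    """Return one valid sequential order for the jobs.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(workflow).static_order())


def ready_batches(workflow):
    """Group jobs into batches; jobs in the same batch have no mutual dependencies."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose dependencies are all done
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished so successors become ready
    return batches


if __name__ == "__main__":
    print(execution_order(abstract_workflow))
    print(ready_batches(abstract_workflow))
```

In this sketch the two analysis jobs land in the same batch because neither depends on the other. A real workflow management system does considerably more than ordering jobs, so this should be read as an illustration of the DAG model only.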

Pegasus is used in a number of disciplines, including astronomy, gravitational-wave physics, bioinformatics, earthquake engineering, and helioseismology. [12] Notably, the LIGO Scientific Collaboration used it in the analysis that produced the first direct detection of a gravitational wave. [8] [13] [14]

Areas of application

Application examples: [5] [6]

History

The development of Pegasus started in 2001.

See also

Related Research Articles

LIGO: Gravitational wave detector

The Laser Interferometer Gravitational-Wave Observatory (LIGO) is a large-scale physics experiment and observatory designed to detect cosmic gravitational waves and to develop gravitational-wave observations as an astronomical tool. Two large observatories were built in the United States with the aim of detecting gravitational waves by laser interferometry. These observatories use mirrors spaced four kilometers apart which are capable of detecting a change of less than one ten-thousandth the charge diameter of a proton.

Kip Thorne: American physicist (born 1940)

Kip Stephen Thorne is an American theoretical physicist known for his contributions in gravitational physics and astrophysics.

Einstein@Home: BOINC volunteer computing project that analyzes data from LIGO to detect gravitational waves

Einstein@Home is a volunteer computing project that searches for signals from spinning neutron stars in data from gravitational-wave detectors, from large radio telescopes, and from a gamma-ray telescope. Neutron stars are detected by their pulsed radio and gamma-ray emission as radio and/or gamma-ray pulsars. They also might be observable as continuous gravitational wave sources if they are rapidly spinning and non-axisymmetrically deformed. The project was officially launched on 19 February 2005 as part of the American Physical Society's contribution to the World Year of Physics 2005 event.

GEO600: Gravitational wave detector in Germany

GEO600 is a gravitational wave detector located near Sarstedt, a town 20 km to the south of Hanover, Germany. It is designed and operated by scientists from the Max Planck Institute for Gravitational Physics, Max Planck Institute of Quantum Optics and the Leibniz Universität Hannover, along with University of Glasgow, University of Birmingham and Cardiff University in the United Kingdom, and is funded by the Max Planck Society and the Science and Technology Facilities Council (STFC). GEO600 is capable of detecting gravitational waves in the frequency range 50 Hz to 1.5 kHz, and is part of a worldwide network of gravitational wave detectors. This instrument, and its sister interferometric detectors, when operational, are some of the most sensitive gravitational wave detectors ever designed. They are designed to detect relative changes in distance of the order of 10⁻²¹, about the size of a single atom compared to the distance from the Sun to the Earth. Construction on the project began in 1995.

Vasant G. Honavar is an Indian-American computer scientist, professor, and researcher in artificial intelligence, machine learning, big data, data science, causal inference, knowledge representation, bioinformatics, and health informatics.

Virgo interferometer: Gravitational wave detector in Santo Stefano a Macerata, Tuscany, Italy

The Virgo interferometer is a large Michelson interferometer designed to detect gravitational waves predicted by general relativity. It is located in Santo Stefano a Macerata, near the city of Pisa, Italy. The instrument's two arms are three kilometres long, housing its mirrors and instrumentation inside an ultra-high vacuum.

Gravitational wave: Propagating spacetime ripple

Gravitational waves are waves of the intensity of gravity that are generated by the accelerated masses of binary stars and other motions of gravitating masses, and propagate as waves outward from their source at the speed of light. They were first proposed by Oliver Heaviside in 1893 and then later by Henri Poincaré in 1905 as the gravitational equivalent of electromagnetic waves.

Carole Goble: British computer scientist

Carole Anne Goble is a British academic who is Professor of Computer Science at the University of Manchester. She is principal investigator (PI) of the myGrid, BioCatalogue and myExperiment projects and co-leads the Information Management Group (IMG) with Norman Paton.

The myGrid consortium produces and uses a suite of tools designed to “help e-Scientists get on with science and get on with scientists”. The tools support the creation of e-laboratories and have been used in domains as diverse as systems biology, social science, music, astronomy, multimedia and chemistry.

Apache Taverna

Apache Taverna was an open source software tool for designing and executing workflows, initially created by the myGrid project under the name Taverna Workbench, then a project under the Apache incubator. Taverna allowed users to integrate many different software components, including WSDL SOAP or REST Web services, such as those provided by the National Center for Biotechnology Information, the European Bioinformatics Institute, the DNA Databank of Japan (DDBJ), SoapLab, BioMOBY and EMBOSS. The set of available services was not finite and users could import new service descriptions into the Taverna Workbench.

Gravitational-wave observatory: Device used to measure gravitational waves

A gravitational-wave detector is any device designed to measure tiny distortions of spacetime called gravitational waves. Since the 1960s, various kinds of gravitational-wave detectors have been built and constantly improved. The present-day generation of laser interferometers has reached the necessary sensitivity to detect gravitational waves from astronomical sources, thus forming the primary tool of gravitational-wave astronomy.

Galaxy (computational biology)

Galaxy is a scientific workflow, data integration, and data and analysis persistence and publishing platform that aims to make computational biology accessible to research scientists who do not have computer programming or systems administration experience. Although it was initially developed for genomics research, it is largely domain agnostic and is now used as a general bioinformatics workflow management system.

Kepler is a free software system for designing, executing, reusing, evolving, archiving, and sharing scientific workflows. Kepler's facilities provide process and data monitoring, provenance information, and high-speed data movement. Workflows in general, and scientific workflows in particular, are directed graphs where the nodes represent discrete computational components, and the edges represent paths along which data and results can flow between components. In Kepler, the nodes are called 'Actors' and the edges are called 'channels'. Kepler includes a graphical user interface for composing workflows in a desktop environment, a runtime engine for executing workflows within the GUI and independently from a command-line, and a distributed computing option that allows workflow tasks to be distributed among compute nodes in a computer cluster or computing grid. The Kepler system principally targets the use of a workflow metaphor for organizing computational tasks that are directed towards particular scientific analysis and modeling goals. Thus, Kepler scientific workflows generally model the flow of data from one step to another in a series of computations that achieve some scientific goal.

A scientific workflow system is a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or workflow, in a scientific application.

The SHIWA project within grid computing was a project led by the LPDS of MTA Computer and Automation Research Institute. The project coordinator was Prof. Dr. Peter Kacsuk. It started on 1 July 2010 and lasted two years. SHIWA was supported by a grant from the European Commission's FP7 INFRASTRUCTURES-2010-2 call under grant agreement n°261585.

A bioinformatics workflow management system is a specialized form of workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, that relate to bioinformatics.

GW170817: Gravitational-wave signal detected in 2017

GW 170817 was a gravitational wave (GW) signal observed by the LIGO and Virgo detectors on 17 August 2017, originating from the shell elliptical galaxy NGC 4993. The signal was produced by the last minutes of a binary pair of neutron stars' inspiral process, ending with a merger. It is the first GW observation that has been confirmed by non-gravitational means. Unlike the five previous GW detections, which were of merging black holes not expected to produce a detectable electromagnetic signal, the aftermath of this merger was also seen by 70 observatories on 7 continents and in space, across the electromagnetic spectrum, marking a significant breakthrough for multi-messenger astronomy. The discovery and subsequent observations of GW 170817 were given the Breakthrough of the Year award for 2017 by the journal Science.

Michela Taufer is an Italian-American computer scientist and holds the Jack Dongarra Professorship in High Performance Computing within the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville. She is an ACM Distinguished Scientist and an IEEE Senior Member. In 2021, together with a team at Lawrence Livermore National Laboratory, she earned an R&D 100 Award for the Flux workload management software framework in the Software/Services category.

Ewa Deelman: American computer scientist

Ewa Deelman is an American computer scientist specializing in distributed computing and cloud computing for applications in scientific computing. Her contributions include leading the design of the Pegasus scientific workflow management system, used by the LIGO scientific collaboration to detect gravitational waves from binary black holes. She is a research professor of computer science in the USC Viterbi School of Engineering, and a principal scientist at the Information Sciences Institute, both part of the University of Southern California.

References

  1. E. Deelman, K. Vahi, G. Juve, M. Rynge, S. Callaghan, P. J. Maechling, R. Mayani, W. Chen, R. Ferreira da Silva, M. Livny, and K. Wenger, "Pegasus, a workflow management system for science automation", Future Generation Computer Systems; 46, pp. 17-35 (2015)
  2. E.A. Huerta, R. Haas, E. Fajardo, D.S. Katz, S. Anderson, P. Couvares, J. Willis, T. Bouvet, J. Enos, W.T.C. Kramer, H.W. Leong, and D. Wheeler, "BOSS-LDG: A Novel Computational Framework That Brings Together Blue Waters, Open Science Grid, Shifter and the LIGO Data Grid to Accelerate Gravitational Wave Discovery", 2017 IEEE 13th International Conference on e-Science (e-Science); pp. 335-344 (2017)
  3. B. Riedel, B. Bauermeister, L. Bryant, J. Conrad, P. de Perio, R. W. Gardner, L. Grandi, F. Lombardi, A. Rizzo, G. Sartorelli, M. Selvi, E. Shockley, J. Stephen, S. Thapa, and C. Tunnell, "Distributed Data and Job Management for the XENON1T Experiment", PEARC '18: Proceedings of the Practice and Experience on Advanced Research Computing; 9, pp. 1-8 (2018)
  4. G. Amalarethinam, T. Lucia, A. Beena, “Scheduling Framework for Regular Scientific Workflows in Cloud”, International Journal of Applied Engineering Research ; 10, no. 82 (2015)
  5. E. Deelman, G. Singh, M. Su, J. Blythe, Y. Gil, C. Kesselman, G. Mehta, K. Vahi, B. G. Berriman, J. Good, A. Laity, J. C. Jacob, and D. S. Katz, “Pegasus: a Framework for Mapping Complex Scientific Workflows onto Distributed Systems”, Scientific Programming; 13, pp. 19 (2005)
  6. The Scientific Workflow Integrity with Pegasus (SWIP), by Center for Applied Cybersecurity Research; published 16 September 2016; retrieved 1 May 2020
  7. D. Weitzel, B. Bockelman, D. Brown, P. Couvares, F. Würthwein, and E.F. Hernandez, “Data Access for LIGO on the OSG”, Proceedings of the Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and Impact - PEARC17; 24, no. 1-6 (2017)
  8. "Testing LIGO's Sensitivity". Research.gov. September 1, 2007. Retrieved April 30, 2020.
  9. Duncan Brown and Ewa Deelman, "Looking for gravitational waves: A computing perspective", at Science Node; published June 8, 2011; retrieved April 30, 2020
  10. $1M NSF award goes to IU-led data integrity project, by Indiana University; published 16 September 2016; retrieved 1 May 2020
  11. Brian Mattmiller, "High Throughput Computing helps LIGO confirm Einstein's last unproven theory", at Morgridge Institute for Research ; published March 7, 2016; retrieved May 1, 2020
  12. Sanden Totten, "Caltech Wasn't the Only SoCal School Helping Discover Gravitational Waves", at KPCC ; published 11 February 2016; retrieved May 1, 2020
  13. D.A. Brown, P.R. Brady, A. Dietz, J. Cao, B. Johnson, J. McNabb, “A Case Study on the Use of Workflow Technologies for Scientific Analysis: Gravitational Wave Data Analysis”, in: I.J. Taylor, E. Deelman, D.B. Gannon, M. Shields (eds), Workflows for e-Science, Springer, London; 13, pp. 39-59 (2007)
  14. D. Davis, T. Massinger, A. Lundgren, J.C. Driggers, A.L. Urban, and L. Nuttall, “Improving the sensitivity of Advanced LIGO using noise subtraction”, Classical and Quantum Gravity ; 36, no. 5 (2019)