Cyberinfrastructure

United States federal research funders use the term cyberinfrastructure to describe research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services distributed over the Internet beyond the scope of a single institution. In scientific usage, cyberinfrastructure is a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge.

Origin

The term National Information Infrastructure had been popularized by Al Gore in the 1990s. This use of the term "cyberinfrastructure" evolved from the same thinking that produced Presidential Decision Directive NSC-63 [1] on Protecting America's Critical Infrastructures (PDD-63). PDD-63 focuses on the security and vulnerability of the nation's "cyber-based information systems" as well as the critical infrastructures on which America's military strength and economic well-being depend, such as the electric power grid, transportation networks, potable water and wastewater infrastructures.

The term "cyberinfrastructure" was used in a press briefing on PDD-63 on May 22, 1998 [2] with Richard A. Clarke, then national coordinator for security, infrastructure protection, and counter-terrorism, and Jeffrey Hunker, who had just been named director of the critical infrastructure assurance office. Hunker stated:

"One of the key conclusions of the President's commission that laid the intellectual framework for the President's announcement today was that while we certainly have a history of some real attacks, some very serious, to our cyber-infrastructure, the real threat lay in the future. And we can't say whether that's tomorrow or years hence. But we've been very successful as a country and as an economy in wiring together our critical infrastructures. This is a development that's taken place really over the last 10 or 15 years—the Internet, most obviously, but electric power, transportation systems, our banking and financial systems." [2]

The term "cyberinfrastructure" was used by a US National Science Foundation (NSF) blue-ribbon committee in 2003 in response to the question: how can NSF, as the nation's premier agency funding basic research, remove existing barriers to the rapid evolution of high performance computing, making it truly usable by all the nation's scientists, engineers, scholars, and citizens? The NSF use of the term focuses on the integrated assemblage of these information technologies with one another.

A workshop on cyberinfrastructure for the social sciences was held in San Diego, California in May 2005. [3] Another conference was held in January 2007 in Washington, D.C. [4] A "CyberInfrastructure Partnership" existed from February 2005 until 2009. [5] A collaboration led by the University of Wisconsin–Madison and Boston University had a web site called "Engaging People in Cyberinfrastructure" (EPIC), which existed from 2005 through 2007. [6] Two NSF-sponsored workshops on Financial Cyberinfrastructure were organized in 2010 and 2012 by Louiqa Raschid and Albert "Pete" Kyle (University of Maryland), H.V. Jagadish (University of Michigan), and Mark Flood (Office of Financial Research, Department of the Treasury).

Complementing the technical construction of cyberinfrastructure, social scientists in the field of computer supported cooperative work investigate the organizational and social aspects of building these large-scale, distributed resources to support science. Related to this research space is the notion of the collaboratory, originally coined by William Wulf.

Cyberinfrastructure is more often called e-Science or e-Research. [7] In particular, the United Kingdom started an e-Science initiative in 2001; [8] the Systems Geology initiative of the British Geological Survey is an example. Others distinguish e-Science as the work that is done using the cyberinfrastructure. [9]

Many inter-governmental advisory groups address aspects of cyberinfrastructure, such as the E-Infrastructures Reflection Group and the European Strategy Forum on Research Infrastructures. These groups deal with policies on electronic infrastructures for research, i.e. research networks, computing, software and data infrastructures that mainly serve students, researchers and scientists. They advise and recommend actions to the European Commission (DG CONNECT), the governments of EU member states (research or science ministries), and e-Infrastructure providers and users.

Examples

NSF's Office of Cyberinfrastructure, for example, supported the TeraGrid project in which the Grid Infrastructure Group led by University of Chicago provided integration of resources and services that were operated by some of the US's supercomputing centers. This project has now evolved to the Extreme Science and Engineering Discovery Environment (XSEDE) project, led by the National Center for Supercomputing Applications.

The nanoHUB and its HUBzero software, originally funded in 2002, are an important cyberinfrastructure that continues to see use. [10] [11] Cyberinfrastructure is often specialized toward domains in science and engineering. For example, NSF funded a large cyberinfrastructure for earthquake engineering called NEEShub at Purdue University from 2009 to 2015. [12] NSF funded the iPlant Collaborative in 2008 to support plant sciences, including data-intensive plant genomics and phylogenetics. [13] Mississippi State University created an Integrated Computational Materials Engineering (ICME) cyberinfrastructure in 2010 that focuses on multiscale modeling.

The United States Department of Energy supports e-Science through high performance computing and other initiatives involving its laboratories.

The Department of Energy (Office of Science SciDAC-2 program from the High Energy Physics, Nuclear Physics and Advanced Software and Computing Research programs) and NSF (Math and Physical Sciences, Office of Cyberinfrastructure and Office of International Science and Engineering Directorates) support the Open Science Grid, which is a consortium of more than 80 member institutions and alliances.

Related Research Articles

National Center for Supercomputing Applications: Illinois-based applied supercomputing research organization

The National Center for Supercomputing Applications (NCSA) is a state-federal partnership to develop and deploy national-scale computer infrastructure that advances research, science and engineering based in the United States. NCSA operates as a unit of the University of Illinois Urbana-Champaign, and provides high-performance computing resources to researchers across the country. Support for NCSA comes from the National Science Foundation, the state of Illinois, the University of Illinois, business and industry partners, and other federal agencies.

Cornell University Center for Advanced Computing

The Cornell University Center for Advanced Computing (CAC), housed at Frank H. T. Rhodes Hall on the campus of Cornell University, is one of five original centers in the National Science Foundation's Supercomputer Centers Program. It was formerly called the Cornell Theory Center.

San Diego Supercomputer Center: supercomputer at UC San Diego

E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the Access Grid. The term was created by John Taylor, the Director General of the United Kingdom's Office of Science and Technology in 1999 and was used to describe a large funding initiative starting in November 2000. E-science has been more broadly interpreted since then, as "the application of computer technology to the undertaking of modern scientific investigation, including the preparation, experimentation, data collection, results dissemination, and long-term storage and accessibility of all materials generated through the scientific process. These may include data modeling and analysis, electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints, and print and/or electronic publications." In 2014, IEEE eScience Conference Series condensed the definition to "eScience promotes innovation in collaborative, computationally- or data-intensive research across all disciplines, throughout the research lifecycle" in one of the working definitions used by the organizers. E-science encompasses "what is often referred to as big data [which] has revolutionized science... [such as] the Large Hadron Collider (LHC) at CERN... [that] generates around 780 terabytes per year... highly data intensive modern fields of science...that generate large amounts of E-science data include: computational biology, bioinformatics, genomics" and the human digital footprint for the social sciences.

TeraGrid

TeraGrid was an e-Science grid computing infrastructure combining resources at eleven partner sites. The project started in 2001 and operated from 2004 through 2011.

Charlie Catlett: American computer scientist

Charlie Catlett is a senior computer scientist at Argonne National Laboratory and a visiting senior fellow at the Mansueto Institute for Urban Innovation at the University of Chicago. From 2020 to 2022 he was a senior research scientist at the University of Illinois Discovery Partners Institute. He was previously a senior computer scientist at Argonne National Laboratory and a senior fellow in the Computation Institute, a joint institute of Argonne National Laboratory and The University of Chicago, and a senior fellow at the University of Chicago's Harris School of Public Policy.

The Pittsburgh Supercomputing Center (PSC) is a high performance computing and networking center founded in 1986 and one of the original five NSF Supercomputing Centers. PSC is a joint effort of Carnegie Mellon University and the University of Pittsburgh in Pittsburgh, Pennsylvania, United States.

The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that is based on comprehensive advanced computing resources and supports services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high performance computing, scientific visualization, data analysis & storage systems, software, research & development and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.

The George E. Brown, Jr. Network for Earthquake Engineering Simulation (NEES) was created by the National Science Foundation (NSF) to improve infrastructure design and construction practices to prevent or minimize damage during an earthquake or tsunami. Its headquarters were at Purdue University in West Lafayette, Indiana as part of cooperative agreement #CMMI-0927178, and it ran from 2009 until 2014. The mission of NEES was to accelerate improvements in seismic design and performance by serving as a collaboratory for discovery and innovation.

Edward Seidel is an American academic administrator and scientist serving as the president of the University of Wyoming since July 1, 2020. He previously served as the Vice President for Economic Development and Innovation for the University of Illinois System, as well as a Founder Professor in the Department of Physics and a professor in the Department of Astronomy at the University of Illinois at Urbana-Champaign. He was the director of the National Center for Supercomputing Applications at Illinois from 2014 to 2017.

nanoHUB

nanoHUB.org is a science and engineering gateway comprising community-contributed resources and geared toward education, professional networking, and interactive simulation tools for nanotechnology. Funded by the United States National Science Foundation (NSF), it is a product of the Network for Computational Nanotechnology (NCN). NCN supports research efforts in nanoelectronics; nanomaterials; nanoelectromechanical systems (NEMS); nanofluidics; nanomedicine, nanobiology; and nanophotonics.

Integrated computational materials engineering (ICME) involves the integration of experimental results, design models, simulations, and other computational data related to a variety of materials used in multiscale engineering and design. Central to the achievement of ICME goals has been the creation of a cyberinfrastructure, a Web-based, collaborative platform which provides the ability to accumulate, organize and disseminate knowledge pertaining to materials science and engineering to facilitate this information being broadly utilized, enhanced, and expanded.

iPlant Collaborative

The iPlant Collaborative, renamed Cyverse in 2017, is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to create cyberinfrastructure for the plant sciences (botany). The NSF compared cyberinfrastructure to physical infrastructure, "... the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor". In September 2013 it was announced that the National Science Foundation had renewed iPlant's funding for a second 5-year term with an expansion of scope to all non-human life science research.

A data infrastructure is a digital infrastructure promoting data sharing and consumption.

Francine Berman: American computer scientist

Francine Berman is an American computer scientist, and a leader in digital data preservation and cyber-infrastructure. In 2009, she was the inaugural recipient of the IEEE/ACM-CS Ken Kennedy Award "for her influential leadership in the design, development and deployment of national-scale cyberinfrastructure, her inspiring work as a teacher and mentor, and her exemplary service to the high performance community". In 2004, Business Week called her the "reigning teraflop queen".

HUBzero is an open source software platform for building websites that support scientific activities.

Data Infrastructure Building Blocks (DIBBs) is a U.S. National Science Foundation program.

Science gateways provide access to advanced resources for science and engineering researchers, educators, and students. Through streamlined, online, user-friendly interfaces, gateways combine a variety of cyberinfrastructure (CI) components in support of a community-specific set of tools, applications, and data collections. In general, these specialized, shared resources are integrated as a Web portal, mobile app, or a suite of applications. Through science gateways, broad communities of researchers can access diverse resources, which can save both time and money for themselves and their institutions. Functions and resources offered by science gateways include shared equipment and instruments, computational services, advanced software applications, collaboration capabilities, data repositories, and networks.

The Open Knowledgebase of Interatomic Models (OpenKIM) is a cyberinfrastructure funded by the United States National Science Foundation (NSF) focused on improving the reliability and reproducibility of molecular and multi-scale simulations in computational materials science. It includes a repository of interatomic potentials that are exhaustively tested with user-developed integrity tests, tools to help select among existing potentials and develop new ones, extensive metadata on potentials and their developers, and standard integration methods for using interatomic potentials in major simulation codes. OpenKIM is a member of DataCite and provides unique digital object identifiers (DOIs) for all archived content on the site (fitted models, validation tests, etc.) in order to properly document and provide recognition to content contributors. OpenKIM is also an eXtreme Science and Engineering Discovery Environment (XSEDE) Science Gateway, and all content on openkim.org is available under open source licenses in support of the open science initiative.

References

  1. Presidential Decision Directive NSC-63
  2. "Press Briefing by Richard Clarke, National Coordinator for Security, Infrastructure Protection and Counter-terrorism; and Jeffrey Hunker, Director of the Critical Infrastructure Assurance Office". News release. The White House Office of the Press Secretary. May 22, 1998. Retrieved September 18, 2011.
  3. "NSF Workshop on Cyberinfrastructure for the Social Sciences, 2005". San Diego Supercomputer Center. Archived from the original on January 5, 2006. Retrieved September 18, 2011.
  4. "Designing Cyberinfrastructure for Collaboration and Innovation". University of Michigan. January 2007. Archived from the original on July 2, 2010. Retrieved September 19, 2011.
  5. "CyberInfrastructure Partnership". Archived from the original on June 12, 2009. Retrieved September 19, 2011.
  6. "Engaging People in Cyberinfrastructure". December 28, 2007. Archived from the original on January 29, 2009. Retrieved September 19, 2011.
  7. James R. Bottum; James F. Davis; Peter M. Siegel; Brad Wheeler; Diana G. Oblinger (July–August 2008). "Cyberinfrastructure: In Tune for the Future". Educause Review. Vol. 43, no. 4. Archived from the original on 2008-09-07. Retrieved September 19, 2011.
  8. "e-Science". Research Councils UK. Archived from the original on September 25, 2010. Retrieved September 19, 2011.
  9. Harvey B. Newman; Mark H. Ellisman; John A. Orcutt (November 2003). "Data-Intensive e-Science Frontier Research in the Coming Decade". Communications. Association for Computing Machinery. 46 (11): 68. CiteSeerX 10.1.1.72.5841. doi:10.1145/948383.948411. S2CID 7633315.
  10. Michael McLennan (December 9, 2010). "The HUBzero Platform for Scientific Collaboration". Cyber Infrastructure Days 2010 at Purdue University. Retrieved September 19, 2011.
  11. Diana G. Oblinger (August 2007). "nanoHUB" (PDF). ELI Paper 7. Educause Learning Initiative. Archived from the original (PDF) on October 5, 2011. Retrieved September 19, 2011.
  12. Hacker, T. J.; Eigenmann, R.; Bagchi, S.; Irfanoglu, A.; Pujol, S.; Catlin, A.; Rathje, E. (2011-07-01). "The NEEShub Cyberinfrastructure for Earthquake Engineering". Computing in Science & Engineering. 13 (4): 67–78. Bibcode:2011CSE....13d..67H. doi:10.1109/MCSE.2011.70. ISSN 1521-9615. S2CID 22196398.
  13. "PSCIC Full Proposal: The iPlant Collaborative: A Cyberinfrastructure-Centered Community for a New Plant Biology". Award Abstract #0735191. National Science Foundation. August 22, 2011. Retrieved September 21, 2011.