Data Infrastructure Building Blocks (DIBBs) is a U.S. National Science Foundation program.
On April 27, 2012, the U.S. National Science Foundation Office of Cyberinfrastructure announced a request for proposals with the name "Data Infrastructure Building Blocks (DIBBs)". The solicitation (NSF 12-557) "incorporated some but not all of the goals of the former DataNet and InterOp programs." [1]
DIBBs is part of NSF's vision for a Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21). The introduction to the solicitation states:
NSF's Cyberinfrastructure Framework for 21st Century Science and Engineering (CIF21) investment focuses on the interconnected cyberinfrastructure components necessary to realize the research potential of theoretical, experimental, observational and simulation-based research efforts.
The [DIBBs] Program Description states the goals of the program as follows:
. . . to support the development or expansion of new types of digital data storage, preservation, and access that: (1) enable engagement at the frontiers of science and engineering research and education; (2) work cooperatively and in coordination to overcome conventional barriers due to data type and format, discipline or subject area, and time and place to facilitate sharing of data; (3) combine expertise in cyberinfrastructure; library and archival sciences; computer, computational, and information sciences; and various domain sciences; (4) lead to long-term governance models for economic and technological sustainability over multiple decades. [1]
The solicitation divided the DIBBs awards into three tracks: Conceptualization, Implementation, and Interoperability. The tracks were distinguished as follows:
Conceptualization: . . . planning awards aimed at further defining disciplinary and interdisciplinary communities' data storage and management requirements. [1]
Implementation: . . . will support development and implementation of technologies related to the data preservation and access lifecycle, including acquisition; documentation; security and integrity; storage; access, analysis and dissemination; migration; and deaccession. Implementation awards must also address how they will relate to and support other CIF21 components essential to the given community. [1]
Interoperability: . . . support community efforts to provide broad interoperability of datasets, enhancing interaction and information sharing to benefit all areas of NSF-funded science, engineering and education. [1]
The anticipated funding amount for this solicitation was listed as $41,500,000, pending availability of funds. The anticipated average award size was $100,000 for one year for conceptualization awards, approximately $8 million total over 5 years for implementation awards, and up to $1.5 million total over 3 years for interoperability awards. [1]
Awards [2] were given in two rounds. The first round dealt only with the Conceptualization track, for which full proposals were due on July 26, 2012; three DIBBs proposals were awarded.
The second round covered the Implementation and Interoperability tracks, for which full proposals were due on August 30, 2012; four more proposals were awarded.
A total of about $26.8M was distributed among these seven awards.
The National Center for Supercomputing Applications (NCSA) is a state-federal partnership to develop and deploy national-scale cyberinfrastructure that advances research, science and engineering based in the United States. NCSA operates as a unit of the University of Illinois Urbana-Champaign, and provides high-performance computing resources to researchers across the country. Support for NCSA comes from the National Science Foundation, the state of Illinois, the University of Illinois, business and industry partners, and other federal agencies.
The National Science Foundation Network (NSFNET) was a program of coordinated, evolving projects sponsored by the National Science Foundation (NSF) from 1985 to 1995 to promote advanced research and education networking in the United States. The program created several nationwide backbone computer networks in support of these initiatives. It was created to link researchers to the NSF-funded supercomputing centers. Later, with additional public funding and also with private industry partnerships, the network developed into a major part of the Internet backbone.
The U.S. National Science Foundation (NSF) is an independent agency of the United States federal government that supports fundamental research and education in all the non-medical fields of science and engineering. Its medical counterpart is the National Institutes of Health. With an annual budget of about $9.9 billion, the NSF funds approximately 25% of all federally supported basic research conducted by the United States' colleges and universities. In some fields, such as mathematics, computer science, economics, and the social sciences, the NSF is the major source of federal backing.
The Computer Science Network (CSNET) was a computer network that began operation in 1981 in the United States. Its purpose was to extend networking benefits to computer science departments at academic and research institutions that could not be directly connected to ARPANET because of funding or authorization limitations. It played a significant role in spreading awareness of, and access to, national networking and was a major milestone on the path to development of the global Internet. CSNET was funded by the National Science Foundation for an initial three-year period from 1981 to 1984.
The San Diego Supercomputer Center (SDSC) is an organized research unit of the University of California, San Diego (UCSD). SDSC is located at the east end of Eleanor Roosevelt College on the UCSD campus, immediately north of the Hopkins Parking Structure.
E-Science or eScience is computationally intensive science that is carried out in highly distributed network environments, or science that uses immense data sets that require grid computing; the term sometimes includes technologies that enable distributed collaboration, such as the Access Grid. The term was created by John Taylor, the Director General of the United Kingdom's Office of Science and Technology in 1999 and was used to describe a large funding initiative starting in November 2000. E-science has been more broadly interpreted since then, as "the application of computer technology to the undertaking of modern scientific investigation, including the preparation, experimentation, data collection, results dissemination, and long-term storage and accessibility of all materials generated through the scientific process. These may include data modeling and analysis, electronic/digitized laboratory notebooks, raw and fitted data sets, manuscript production and draft versions, pre-prints, and print and/or electronic publications." In 2014, IEEE eScience Conference Series condensed the definition to "eScience promotes innovation in collaborative, computationally- or data-intensive research across all disciplines, throughout the research lifecycle" in one of the working definitions used by the organizers. E-science encompasses "what is often referred to as big data [which] has revolutionized science... [such as] the Large Hadron Collider (LHC) at CERN... [that] generates around 780 terabytes per year... highly data intensive modern fields of science...that generate large amounts of E-science data include: computational biology, bioinformatics, genomics" and the human digital footprint for the social sciences.
United States federal research funders use the term cyberinfrastructure to describe research environments that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services distributed over the Internet beyond the scope of a single institution. In scientific usage, cyberinfrastructure is a technological and sociological solution to the problem of efficiently connecting laboratories, data, computers, and people with the goal of enabling derivation of novel scientific theories and knowledge.

TeraGrid was an e-Science grid computing infrastructure combining resources at eleven partner sites. The project started in 2001 and operated from 2004 through 2011.
The Earth System Modeling Framework (ESMF) is open-source software for building climate, numerical weather prediction, data assimilation, and other Earth science software applications. These applications are computationally demanding and usually run on supercomputers. The ESMF is considered a technical layer, integrated into a sophisticated common modeling infrastructure for interoperability. Other aspects of interoperability and shared infrastructure include: common experimental protocols, common analytic methods, common documentation standards for data and data provenance, shared workflow, and shared model components.
The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that provides comprehensive advanced computing resources and support services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high-performance computing, scientific visualization, data analysis and storage systems, software, research and development, and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.
The George E. Brown, Jr. Network for Earthquake Engineering Simulation (NEES) was created by the National Science Foundation (NSF) to improve infrastructure design and construction practices to prevent or minimize damage during an earthquake or tsunami. Its headquarters were at Purdue University in West Lafayette, Indiana, as part of cooperative agreement #CMMI-0927178, and it ran from 2009 until 2014. The mission of NEES was to accelerate improvements in seismic design and performance by serving as a collaboratory for discovery and innovation.
DataNet, or Sustainable Digital Data Preservation and Access Network Partner, was a research program of the U.S. National Science Foundation Office of Cyberinfrastructure. The office announced a request for proposals with this title on September 28, 2007. The lead paragraph of its synopsis describes the program as:
Science and engineering research and education are increasingly digital and increasingly data-intensive. Digital data are not only the output of research but provide input to new hypotheses, enabling new scientific insights and driving innovation. Therein lies one of the major challenges of this scientific generation: how to develop the new methods, management structures and technologies to manage the diversity, size, and complexity of current and future data sets and data streams. This solicitation addresses that challenge by creating a set of exemplar national and global data research infrastructure organizations that provide unique opportunities to communities of researchers to advance science and/or engineering research and learning.
Integrated computational materials engineering (ICME) involves the integration of experimental results, design models, simulations, and other computational data related to a variety of materials used in multiscale engineering and design. Central to the achievement of ICME goals has been the creation of a cyberinfrastructure, a Web-based, collaborative platform which provides the ability to accumulate, organize and disseminate knowledge pertaining to materials science and engineering to facilitate this information being broadly utilized, enhanced, and expanded.
The iPlant Collaborative, renamed Cyverse in 2017, is a virtual organization created by a cooperative agreement funded by the US National Science Foundation (NSF) to create cyberinfrastructure for the plant sciences (botany). The NSF compared cyberinfrastructure to physical infrastructure, "... the distributed computer, information and communication technologies combined with the personnel and integrating components that provide a long-term platform to empower the modern scientific research endeavor". In September 2013 it was announced that the National Science Foundation had renewed iPlant's funding for a second 5-year term with an expansion of scope to all non-human life science research.
Francine Berman is an American computer scientist, and a leader in digital data preservation and cyber-infrastructure. In 2009, she was the inaugural recipient of the IEEE/ACM-CS Ken Kennedy Award "for her influential leadership in the design, development and deployment of national-scale cyberinfrastructure, her inspiring work as a teacher and mentor, and her exemplary service to the high performance community". In 2004, Business Week called her the "reigning teraflop queen".
Daniel E. Atkins III is the W. K. Kellogg Professor of Community Informatics at University of Michigan.
NCSA Brown Dog is a research project to develop a method for easily accessing historic research data stored in order to maintain the long-term viability of large bodies of scientific research. It is supported by the National Center for Supercomputing Applications (NCSA) that is funded by the National Science Foundation (NSF).
Helen Aristar-Dry is an American linguist who currently serves as the series editor for SpringerBriefs in Linguistics. Most notably, from 1991 to 2013 she co-directed The LINGUIST List with Anthony Aristar. She has served as principal investigator or co-principal investigator on over $5,000,000 worth of research grants from the National Science Foundation and the National Endowment for the Humanities. She retired as Professor of English Language and Literature from Eastern Michigan University in 2013.
Xiaogang Ma, also known as Marshall Ma, is a data science and geoinformatics researcher at the University of Idaho (UI), United States. He is an associate professor in the department of computer science at UI, and is also affiliated with the department of earth and spatial sciences and several research institutes and centers at the university.
Manish Parashar is a Presidential Professor in the School of Computing, Director of the Scientific Computing and Imaging Institute and Chair in Computational Science and Engineering at the University of Utah. He also currently serves as Office Director in the US National Science Foundation’s Office of Advanced Cyberinfrastructure. Parashar is the editor-in-chief of IEEE Transactions on Parallel and Distributed Systems, and Founding Chair of the IEEE Technical Community on High Performance Computing. He is an AAAS Fellow, ACM Fellow, and IEEE Fellow.