Mercury is a distributed metadata management, data discovery and access system. [1] It is a scientific data search system used to capture and manage biogeochemical and ecological data in support of the Earth science programs funded by the United States Department of Energy (DOE) and the United States Geological Survey (USGS), part of the Department of the Interior (DOI). Ongoing development of Mercury is done through an informal consortium at Oak Ridge National Laboratory. Mercury was originally developed for NASA, but the consortium is now supported by USGS and the DOE.
Mercury is a part of the Oak Ridge National Laboratory Environmental Sciences Division (ESD) [2] and a contributor to the Climate Change Science Institute (CCSI). [3]
Mercury supports data archiving, data discovery through various search strategies (text string, fielded, spatial, temporal), data reuse, and longer-term scientific digital data stewardship. It also supports a range of recognized data exchange and interoperability protocols and various metadata standards, including XML, Z39.50, FGDC, Dublin Core, Darwin Core, Ecological Metadata Language, and ISO. Mercury also uses OAI-PMH to index metadata records from the Global Change Master Directory (GCMD) and redistribute them to other data providers. [4]
The Open Archives Initiative (OAI) was an informal organization, centered on Herbert Van de Sompel, Carl Lagoze, Michael L. Nelson and Simeon Warner, that developed and applied technical interoperability standards for archives to share catalogue information (metadata). The group came together in the late 1990s and was active for around twenty years. OAI coordinated three specification activities in particular: OAI-PMH, OAI-ORE and ResourceSync. Throughout, the group worked towards building a "low-barrier interoperability framework" for archives containing digital content, to allow people to harvest metadata. Such sets of metadata have since been harvested to provide "value-added services", often by combining different data sets.
Ecoinformatics, or ecological informatics, is the science of information in ecology and environmental science. It integrates environmental and information sciences to define entities and natural processes with language common to both humans and computers. However, this is a rapidly developing area in ecology and there are alternative perspectives on what constitutes ecoinformatics.
The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol developed for harvesting metadata descriptions of records in an archive so that services can be built using metadata from many archives. An implementation of OAI-PMH must support representing metadata in Dublin Core, but may also support additional representations.
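As a minimal sketch of what harvesting looks like in practice, the Python below builds a ListRecords request that asks for Dublin Core (the oai_dc metadata prefix) and pulls the titles out of a response. The base URL, the sample record and the function names are hypothetical and chosen only for illustration; a real harvester would also handle resumption tokens and protocol errors, which this sketch leaves out.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core elements.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL for a repository's OAI-PMH endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return the Dublin Core titles found in a ListRecords response."""
    root = ET.fromstring(xml_text)
    titles = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        for title in record.iter(f"{{{DC_NS}}}title"):
            titles.append(title.text)
    return titles


# Hypothetical response body, trimmed to a single record for illustration.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example soil carbon dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    # https://example.org/oai is a placeholder, not a real endpoint.
    print(build_list_records_url("https://example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Because every conforming repository must offer oai_dc, a harvester written this way can query many archives with the same request and parse the results with the same code.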
The Earth System Modeling Framework (ESMF) is open-source software for building climate, numerical weather prediction, data assimilation, and other Earth science software applications. These applications are computationally demanding and usually run on supercomputers. The ESMF is considered a technical layer, integrated into a sophisticated common modeling infrastructure for interoperability. Other aspects of interoperability and shared infrastructure include: common experimental protocols, common analytic methods, common documentation standards for data and data provenance, shared workflow, and shared model components.
In metadata, metadata discovery is the process of using automated tools to discover the semantics of a data element in data sets. This process usually ends with a set of mappings between the data source elements and a centralized metadata registry. Metadata discovery is also known as metadata scanning.
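The mapping step can be illustrated with a deliberately simple sketch: source element names are normalized and looked up in a table of known aliases for each registry element. The registry contents, the alias lists and the function names are all hypothetical, and matching on names alone is an assumption made for brevity; actual discovery tools typically also inspect the data values.

```python
import re

# Hypothetical registry: each registry element with aliases seen in data sources.
REGISTRY_ALIASES = {
    "latitude": {"lat", "latitude"},
    "longitude": {"lon", "long", "longitude"},
    "collection_date": {"colldate", "collectiondate", "date"},
}


def normalize(name):
    """Lowercase a source element name and drop non-alphanumeric characters."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def map_elements(source_names):
    """Map each source element name to a registry element, or None if unknown."""
    mapping = {}
    for name in source_names:
        key = normalize(name)
        mapping[name] = next(
            (element for element, aliases in REGISTRY_ALIASES.items() if key in aliases),
            None,
        )
    return mapping


if __name__ == "__main__":
    print(map_elements(["Lat", "LONGITUDE", "Coll. Date", "xyz"]))
```

Elements that map to None are the ones a data steward would need to review by hand or add to the registry.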
The Metadata Encoding and Transmission Standard (METS) is a metadata standard for encoding descriptive, administrative, and structural metadata regarding objects within a digital library, expressed using the XML schema language of the World Wide Web Consortium (W3C). The standard is maintained as part of the MARC standards of the Library of Congress, and is being developed as an initiative of the Digital Library Federation (DLF).
BASE is a multi-disciplinary search engine for scholarly internet resources, created by Bielefeld University Library in Bielefeld, Germany. It is based on free and open-source software such as Apache Solr and VuFind. It harvests OAI metadata from institutional repositories and other academic digital libraries that implement the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), and then normalizes and indexes the data for searching. In addition to OAI metadata, the library indexes selected web sites and local data collections, all of which can be searched via a single search interface.
The Petrological Database of the Ocean Floor (PetDB) is a relational database for global geochemical data on igneous and metamorphic rocks generated at mid-ocean ridges including back-arc basins, young seamounts, and old oceanic crust, as well as ophiolites and terrestrial xenoliths from the mantle and lower crust and diamond geochemistry. These data are obtained by analyses of whole rock powders, volcanic glasses, and minerals by a wide range of techniques including mass spectrometry, atomic emission spectrometry, x-ray fluorescence spectrometry, and wet chemical analyses. Data are compiled from the scientific literature by PetDB data managers, and entered after methodical metadata review. Members of the scientific community can also suggest entry of specific data that has been entered into the EarthChem Library. PetDB is administered by the EarthChem group under the IEDA facility at LDEO headed by K. Lehnert. PetDB is supported by the U.S. National Science Foundation.
Nature Precedings was an open access electronic preprint repository of scholarly work in the fields of biomedical sciences, chemistry, and earth sciences. It ceased accepting new submissions as of April 3, 2012.
Data curation is the organization and integration of data collected from various sources. It involves annotation, publication and presentation of the data so that the value of the data is maintained over time, and the data remains available for reuse and preservation. Data curation includes "all the processes needed for principled and controlled data creation, maintenance, and management, together with the capacity to add value to data". In science, data curation may indicate the process of extraction of important information from scientific texts, such as research articles by experts, to be converted into an electronic format, such as an entry of a biological database.
Invenio is an open source software framework for large-scale digital repositories that provides the tools for management of digital assets in an institutional repository and research data management systems. The software is typically used for open access repositories for scholarly and/or published digital content and as a digital library.
The Wind Energy Data and Information (WENDI) Gateway was established by the Environmental Sciences Division of Oak Ridge National Laboratory in March 2010 to support the United States Department of Energy's Wind and Hydropower Technologies Program. It provided a digital library for wind energy-related data and information from a wide spectrum of sources—including data centers, scientific and technical journals, and geographic information systems (GIS), as well as the websites of government agencies, corporations and trade organizations. The gateway was taken offline after funding ended in August 2012. Much of its content was incorporated into the OpenEI site.
REDCap is a browser-based, metadata-driven electronic data capture (EDC) software and workflow methodology for designing clinical and translational research databases. It is widely used in the academic research community: the REDCap Consortium is a collaborative, international network of more than 5900 institutional partners in 145 countries, with more than 2.1 million total end-users employing the software. Over 19,000 journal articles cite REDCap.
Astroinformatics is an interdisciplinary field of study involving the combination of astronomy, data science, machine learning, informatics, and information/communications technologies. The field is closely related to astrostatistics.
The ORNL DAAC for Biogeochemical Dynamics is a National Aeronautics and Space Administration (NASA) Earth Observing System Data and Information System (EOSDIS) data center managed by the Earth Science Data and Information System (ESDIS) Project. Established in 1993, the ORNL DAAC is operated by Oak Ridge National Laboratory in Oak Ridge, Tennessee, under an interagency agreement between NASA and the Department of Energy (DOE). Within the ORNL, the ORNL DAAC is part of the Remote Sensing and Environmental Informatics Group of the Environmental Sciences Division (ESD) and a contributor to the Climate Change Science Institute (CCSI).
DataONE is a network of interoperable data repositories facilitating data sharing, data discovery, and open science. Originally supported by $21.2 million in funding from the US National Science Foundation as one of the initial DataNet programs in 2009, funding was renewed in 2014 through 2020 with an additional $15 million. DataONE supports the preservation, access, use, and reuse of multi-discipline scientific data through the construction of primary cyberinfrastructure and an education and outreach program. DataONE provides scientific data archiving for ecological and environmental data produced by scientists. DataONE's goal is to preserve and provide access to multi-scale, multi-discipline, and multi-national data. Users include scientists, ecosystem managers, policy makers, students, educators, librarians, and the public.
An open repository or open-access repository is a digital platform that holds research output and provides free, immediate and permanent access to research results for anyone to use, download and distribute. To facilitate open access, such repositories must be interoperable according to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Search engines harvest the content of open access repositories, constructing a database of research that is available worldwide free of charge. Data repositories are the cornerstone for FAIR data practices and are used expeditiously within the scientific community.
Islandora is a free and open-source software digital repository system based on Drupal and integrating with additional applications, including Fedora Commons. Islandora was originally developed at the University of Prince Edward Island by the Robertson Library and is now maintained by the Islandora Foundation, which has a mission to "promote collaboration through transparency and consensus building among Islandora community members, and to steward their shared vision for digital curation features through a body of software and knowledge."
The Space Physics Archive Search and Extract (SPASE) effort is an international consortium formed in 2001. Its mission is to define standards and services to enable the establishment and operation of discipline specific Virtual Observatories. The main focus of the consortium is to define and maintain a standard data model to enable data sharing and interoperability within the Space and Solar Physics community. Another goal of the consortium is to facilitate data search and retrieval across the Space and Solar Physics data environment by providing conventions, tools and services to assist data providers, researchers and general users. The SPASE consortium also encourages collaboration between agencies and groups interested in sharing space and solar physics data.
ResCarta Toolkit is an open source software package used to create open access repositories for local history and published digital content. ResCarta Toolkit focuses on the use of open standard file formats and metadata standards to create archives that are sustainable over time. It includes software for creation of digital objects, indexing of metadata and content, display tools and checksum validation.
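Checksum validation in general works by recording a digest when an object is created and recomputing it later to detect corruption. The sketch below shows that idea with SHA-256 from Python's standard library; the choice of SHA-256, the function names and the sample bytes are assumptions for illustration, since the text above does not say which algorithm or interface ResCarta uses.

```python
import hashlib
import io


def sha256_stream(stream, chunk_size=65536):
    """Compute the SHA-256 hex digest of a binary stream, reading in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify(stream, expected_hex):
    """Return True if the stream's digest matches the recorded checksum."""
    return sha256_stream(stream) == expected_hex.lower()


if __name__ == "__main__":
    # In-memory bytes stand in for a digital object stored on disk.
    original = b"scanned page image bytes"
    recorded = sha256_stream(io.BytesIO(original))

    print(verify(io.BytesIO(original), recorded))            # unchanged object
    print(verify(io.BytesIO(original + b"\x00"), recorded))  # altered object
```

Reading in chunks keeps memory use constant, which matters when the objects being checked are large image or audio files.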