iDigBio

Integrated Digitized Biocollections (iDigBio)
Gainesville, Florida
Established: 2011
Location: 105 NW 16th St., Gainesville, Florida
Coordinates: 29°38′09″N 82°22′13″W (29.63583°N, 82.37028°W)
Type: NSF-funded grant
Founders (PIs): Larry Page, Pam Soltis, Bruce MacFadden, José Fortes and Greg Riccardi

iDigBio (Integrated Digitized Biocollections) is the national resource funded by the National Science Foundation (NSF) for Advancing Digitization of Biodiversity Collections (ADBC).[1][2] Through iDigBio, data and images for millions of biological specimens are being curated, connected and made available in electronic format for the biological research community, government agencies, students, educators, and the general public.[3] The mission of iDigBio is to develop a national infrastructure that supports the vision of ADBC by overseeing implementation of standards and best practices for digitization; building and deploying a customized cloud computing environment for collections; recruiting and training personnel, including underserved groups; engaging the research community, collections community, citizen scientists, and the public through education and outreach activities; and planning for long-term sustainability of the national digitization effort.

In addition to the central iDigBio digitization HUB, partner institutions are organized into “Thematic Collections Networks” (TCNs) and associated “Partners to Existing Networks” (PENs), which are networks of institutions with a strategy for digitizing information that addresses a particular research theme. Through the iDigBio HUB cyberinfrastructure, the compilation and interlinking of data from the TCNs and existing collaborative databases will create opportunities to address research questions and education interests regarding biodiversity, climate change, species invasions, natural disasters, and the spread of pests and diseases. New TCNs will be funded in succeeding years based on solicitations from NSF. The iDigBio HUB is based at the University of Florida (UF), in partnership with Florida State University (FSU).[4]

Digitization of mammal specimens using a lightbox and camera

Project Mission

The mission of iDigBio is to develop a national infrastructure that supports the vision of ADBC by overseeing implementation of standards and best practices for digitization; building and deploying a customized cloud computing environment for collections; recruiting and training personnel, including underserved groups; engaging the research community, collections community, citizen scientists, and the public through education and outreach activities; and planning for long-term sustainability of the national digitization effort.

iDigBio will enable digitization of data from all U.S. biological collections and integrate those data to make them broadly available and useful with shared standards and formats. Ultimately, ADBC will further the discovery and understanding of biological diversity, and iDigBio will engage the research, collections, and education communities in a spirit of collaboration that will open biological research collections to new downstream user communities. The vision for ADBC is a permanent repository of digitized information from all U.S. biological collections that leads to new discoveries through research and a better understanding and appreciation of biodiversity through improved outreach, which then leads to improved environmental and economic policies. Creation of the permanent digitized repository is occurring in four stages:

  1. An initial stage where the effort to digitize U.S. biological collections is catalyzed by funding from NSF and enabled by iDigBio activities that foster collaborations, identify priorities, demonstrate the value of biodiversity and collections, and generate information on best practices related to standards, workflows, and data management.
  2. An intermediate stage where digitization at Thematic Collections Networks (TCNs), Partners to Existing Networks (PENs), and other participating institutions/networks improves methods and strategies and demonstrates the scientific and societal benefits of validated and readily accessible data.
  3. A third stage in which the vision for ADBC is realized through participation by all U.S. institutions with biological collections.
  4. A fourth stage in which digitization is a routine and sustained practice in all institutions with biological collections, and the national database is easily accessible as an up-to-date source of information on biodiversity.

Project Scope

Digitizing herbarium specimens

iDigBio is the national resource for digitized information about vouchered natural history collections, within the context established by the Network Integrated Biocollections Alliance (NIBA) community strategic plan, and is supported through funds from the NSF ADBC program. As such, iDigBio serves as the administrative home for the national digitization effort; fosters partnerships and innovations; facilitates the determination and dissemination of digitization practices and workflows; establishes integration and interconnectivity among the data generated by collection digitization projects; and promotes the use of biological and paleontological collections data by the scientific community and other stakeholders, including government agencies, educational institutions, non-governmental organizations (NGOs), and other national and international entities, to benefit science and society through enhanced research, educational, and outreach activities. iDigBio aims to provide these services to all stakeholders with clarity, simplicity, transparency, and intuitive methodology and design.

References

  1. "Integrated Digitized Biocollections Phase 2". National Science Foundation.
  2. Soltis, Pamela S.; Nelson, Gil; James, Shelley A. (February 2018). "Green digitization: Online botanical collections data answering real-world questions". Applications in Plant Sciences. 6 (2): e1028. doi:10.1002/aps3.1028. PMC 5851568.
  3. Marshall, Charles (17 September 2018). "Digitizing the vast 'dark data' in museum fossil collections". The Conversation. Retrieved 25 July 2019.
  4. Olsen, Erik (19 October 2015). "Museum Specimens Find New Life Online". The New York Times. ISSN 0362-4331. Retrieved 25 July 2019.