SciCrunch

Last updated

SciCrunch is a collaboratively edited knowledge base about scientific resources. It is a community portal for researchers and a content management system for data and databases. It is intended to provide a common source of data to the research community and the data about Research Resource Identifiers (RRIDs), which can be used in scientific publications. After starting as a pilot of two journals in 2014, by 2022 over 1,000 journals have been using them and over half a million RRIDs have been quoted in the scientific literature. [1] In some respect, it is for science and scholarly publishing, similar to what Wikidata is for Wikimedia Foundation projects. Hosted by the University of California, San Diego, SciCrunch was also designed to help communities of researchers create their own portals to provide access to resources, databases and tools of relevance to their research areas [2]

Contents

Research Resource Identifiers

Research Resource Identifiers (RRID) are globally unique and persistent. [3] They were introduced and are promoted by the Resource Identification Initiative. [3] Resources in this context are research resources like reagents, tools or materials. [3] [4] An example for such a resource would be a cell line used in an experiment or software tool used in a computational analysis. The Resource Identification Portal (https://scicrunch.org/resources) was created in support of this initiative and is a central service where these identifiers can be searched and created. [3] [5] These identifiers should be fully searchable by data mining unlike supplementary files, and can be updated to new versions as basic methodology changes over time.

Format for RRID citations

The recommendation for citing research resources is shown below for key biological resources:

The Resource Identification Portal lists existing RRIDs and instructions for creating a new one if an RRID matching the resource does not already exist.

Features of RRIDs

Description: Each RRID contains an ID, a type, a URL, and a name. There are hundreds of other attributes but most are specific to the type, for example antibody type RRIDs include an attribute called clonality, denoting whether the reagent is monoclonal or polyclonal, while cell lines have an attribute of "parental cell line" denoting the origin of the cell line being described.

RRID Citations: RRIDs denote those research resources that have been used in the conduct of a study. They are not intended to be casual citations. RRIDs that have been used in scientific papers have been mined from the literature using both automated tools [6] and semi-automated tools thanks to a partnership with Hypothes.is. The data that defines which paper cites a particular RRID is usually available on the resolver page for that RRID, for example: https://scicrunch.org/resolver/CVCL_0038 shows the list of 44 papers (as of April 11, 2023) that have used this cell line in research. Each reference will show how authors have used the RRID by including a short snippet of the sentence in which the resource is defined by authors.

External Resolver Services for RRIDs: Name to thing resolver from the California Digital Library can resolve any RRID using the following pattern https://n2t.net/[RRID] example https://n2t.net/RRID:NXR_1049

The Identifiers.org resolver can also resolve any RRID using the following pattern https://identifiers.org/RRID/[RRID] example https://identifiers.org/RRID/RRID:NXR_1049

Institutions and publishers recommending use of RRIDs

A number of publishing houses, initiatives, and research institutions encourage using SciCrunch‘s RRIDs: Common Citation Format Article in Nature, [7] Cell Press, eLife, FORCE11, Frontiers Media, [8] GigaScience, [9] MIRIAM Registry, [10] NIH, [11] PLOS Biology and PLOS Genetics . [12]

See also

Related Research Articles

<span class="mw-page-title-main">Semantic Web</span> Extension of the Web to facilitate data exchange

The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.

A Uniform Resource Identifier (URI) is a unique sequence of characters that identifies a logical or physical resource used by web technologies. URIs may be used to identify anything, including real-world objects, such as people and places, concepts, or information resources such as web pages and books. Some URIs provide a means of locating and retrieving information resources on a network ; these are Uniform Resource Locators (URLs). A URL provides the location of the resource. A URI identifies the resource by name at the specified location or URL. Other URIs provide only a unique name, without a means of locating or retrieving the resource or information about it; these are Uniform Resource Names (URNs). The web technologies that use URIs are not limited to web browsers. URIs are used to identify anything described using the Resource Description Framework (RDF), for example, concepts that are part of an ontology defined using the Web Ontology Language (OWL), and people who are described using the Friend of a Friend vocabulary would each have an individual URI.

<span class="mw-page-title-main">Consortium for Functional Glycomics</span> Glycomics research initiative

The Consortium for Functional Glycomics (CFG) is a large research initiative funded in 2001 by a glue grant from the National Institute of General Medical Sciences (NIGMS) to “define paradigms by which protein-carbohydrate interactions mediate cell communication”. To achieve this goal, the CFG studies the functions of:

<span class="mw-page-title-main">Digital object identifier</span> ISO standard unique string identifier for a digital object

A digital object identifier (DOI) is a persistent identifier or handle used to uniquely identify various objects, standardized by the International Organization for Standardization (ISO). DOIs are an implementation of the Handle System; they also fit within the URI system. They are widely used to identify academic, professional, and government information, such as journal articles, research reports, data sets, and official publications. DOIs have also been used to identify other types of information resources, like commercial videos.

<span class="mw-page-title-main">Chemical Abstracts Service</span> Division of the American Chemical Society

Chemical Abstracts Service (CAS) is a division of the American Chemical Society. It is a source of chemical information and is located in Columbus, Ohio, United States.

A persistent uniform resource locator (PURL) is a uniform resource locator (URL) that is used to redirect to the location of the requested web resource. PURLs redirect HTTP clients using HTTP status codes.

The Institute of Microbial Technology (IMTECH), based in Chandigarh, India, is one of the constituent establishments of the Council of Scientific & Industrial Research (CSIR). It was established in 1984.

<span class="mw-page-title-main">Neuroscience Information Framework</span>

The Neuroscience Information Framework is a repository of global neuroscience web resources, including experimental, clinical, and translational neuroscience databases, knowledge bases, atlases, and genetic/genomic resources and provides many authoritative links throughout the neuroscience portal of Wikipedia.

<span class="mw-page-title-main">Human Microbiome Project</span> Former research initiative

The Human Microbiome Project (HMP) was a United States National Institutes of Health (NIH) research initiative to improve understanding of the microbiota involved in human health and disease. Launched in 2007, the first phase (HMP1) focused on identifying and characterizing human microbiota. The second phase, known as the Integrative Human Microbiome Project (iHMP) launched in 2014 with the aim of generating resources to characterize the microbiome and elucidating the roles of microbes in health and disease states. The program received $170 million in funding by the NIH Common Fund from 2007 to 2016.

The Handle System is the Corporation for National Research Initiatives's proprietary registry assigning persistent identifiers, or handles, to information resources, and for resolving "those handles into the information necessary to locate, access, and otherwise make use of the resources".

Xenbase is a Model Organism Database (MOD), providing informatics resources, as well as genomic and biological data on Xenopus frogs. Xenbase has been available since 1999, and covers both X. laevis and X. tropicalis Xenopus varieties. As of 2013 all of its services are running on virtual machines in a private cloud environment, making it one of the first MODs to do so. Other than hosting genomics data and tools, Xenbase supports the Xenopus research community though profiles for researchers and laboratories, and job and events postings.

<span class="mw-page-title-main">ORCID</span> Code to uniquely identify scientific and other academic authors

The ORCID is a nonproprietary alphanumeric code to uniquely identify authors and contributors of scholarly communication as well as ORCID's website and services to look up authors and their bibliographic output.

<span class="mw-page-title-main">MIRIAM Registry</span>

The MIRIAM Registry, a by-product of the MIRIAM Guidelines, is a database of namespaces and associated information that is used in the creation of uniform resource identifiers. It contains the set of community-approved namespaces for databases and resources serving, primarily, the biological sciences domain. These shared namespaces, when combined with 'data collection' identifiers, can be used to create globally unique identifiers for knowledge held in data repositories. For more information on the use of URIs to annotate models, see the specification of SBML Level 2 Version 2.

ELIXIR is an initiative that allows life science laboratories across Europe to share and store their research data as part of an organised network. Its goal is to bring together Europe's research organisations and data centres to help coordinate the collection, quality control and storage of large amounts of biological data produced by life science experiments. ELIXIR aims to ensure that biological data is integrated into a federated system easily accessible by the scientific community.

Enhanced publications or enhanced ebooks are a form of electronic publishing for the dissemination and sharing of research outcomes, whose first formal definition can be tracked back to 2009. As many forms of digital publications, they typically feature a unique identifier and descriptive metadata information. Unlike traditional digital publications, enhanced publications are often tailored to serve specific scientific domains and are generally constituted by a set of interconnected parts corresponding to research assets of several kinds and to textual descriptions of the research. The nature and format of such parts and of the relationships between them, depends on the application domain and may largely vary from case to case.

Identifiers.org is a project providing stable and perennial identifiers for data records used in the Life Sciences. The identifiers are provided in the form of Uniform Resource Identifiers (URIs). Identifiers.org is also a resolving system, that relies on collections listed in the MIRIAM Registry to provide direct access to different instances of the identified records.

Cellosaurus is an online knowledge base on cell lines, which attempts to document all cell lines used in biomedical research. It is provided by the Swiss Institute of Bioinformatics (SIB). It is an ELIXIR Core Data Resource as well as an IRDiRC's Recognized Resource. It is the contributing resource for cell lines on the Resource Identification Portal. As of December 2022, it contains information for more than 144,000 cell lines.

The Office of Portfolio Analysis was established in the Division of Program Coordination, Planning, and Strategic Initiatives in 2011 to assist NIH Institutes and Centers with scientific portfolio analysis. Per the Federal Register, the Office of Portfolio Analysis serves the following goals:

  1. Prepare and analyze data on NIH sponsored biomedical research to inform trans-NIH planning and coordination;
  2. Serve as a resource for portfolio management at the programmatic level;
  3. Employ databases, analytic tools, methodologies and other resources to conduct assessments in support of portfolio analyses and priority setting in scientific areas of interest across NIH;
  4. Research and develop new analytic tools, support systems, and specifications for new resources in coordination with other NIH organizations to enhance the management of the NIH's scientific portfolio; and
  5. Provide, in coordination with other NIH organizations, training on portfolio analysis tools, procedures, and methodology.
<span class="mw-page-title-main">FORCE11</span> Non-profit organisation to enhance research publishing and communication

FORCE11 is an international coalition of researchers, librarians, publishers and research funders working to reform or enhance the research publishing and communication system. Initiated in 2011 as a community of interest on scholarly communication, FORCE11 is a registered 501(c)(3) organization based in the United States but with members and partners around the world. Key activities include an annual conference, the Scholarly Communications Institute and a range of working groups.

Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.

References

  1. Bandrowski, Anita (2022). "A decade of GigaScience: What can be learned from half a million RRIDs in the scientific literature?". GigaScience. 11. doi:10.1093/gigascience/giac058. ISSN   2047-217X. PMC   9197678 . PMID   35701373.
  2. Jeffrey, Grethe; Anita, Bandrowski; Davis, Banks; Christopher, Condit; Amarnath, Gupta; Stephen, Larson; Yueling, Li; Ibrahim, Ozyurt; Andrea, Stagg; Patricia, Whetzel; Luis, Marenco; Perry, Miller; Rixin, Wang; Gordon, Shepherd; Maryann, Martone (2014). "SciCrunch: A cooperative and collaborative data and resource discovery platform for scientific communities". Frontiers in Neuroinformatics. 8. doi: 10.3389/conf.fninf.2014.18.00069 .
  3. 1 2 3 4 Bandrowski, Anita; Brush, Matthew; Grethe, Jeffery S.; Haendel, Melissa A.; Kennedy, David N.; Hill, Sean; Hof, Patrick R.; Martone, Maryann E.; Pols, Maaike; Tan, Serena; Washington, Nicole; Zudilova-Seinstra, Elena; Vasilevsky, Nicole (19 November 2015). "The Resource Identification Initiative: A cultural shift in publishing". F1000Research . 2. 4 (134): 134. doi: 10.12688/f1000research.6555.2 . PMC   4648211 . PMID   26594330.
  4. "What is a Resource?". scicrunch.org. SciCrunch. Retrieved 11 April 2023.
  5. "Resource Identification Portal". scicrunch.org. SciCrunch. Retrieved 11 April 2023.
  6. Ozyurt (2016). "Resource Disambiguator for the Web: Extracting Biomedical Resources and Their Citations from the Scientific Literature". PLoS One . 11 (1): e0146300. Bibcode:2016PLoSO..1146300O. doi: 10.1371/journal.pone.0146300 . PMC   5156472 . PMID   26730820.
  7. Singh Chawla, Dalmeet (29 May 2015). "Researchers argue for standard format to cite lab resources". Nature. doi:10.1038/nature.2015.17652. S2CID   211730637.
  8. "Frontiers Author Guidelines". Frontiers. Retrieved 19 April 2016.
  9. "Technical Note | GigaScience | Oxford Academic". academic.oup.com. Retrieved 2019-07-15.
  10. "identifiers.org". Data collection: RRID. Retrieved 18 April 2016.
  11. "NIDA supports SciCrunch and RRIDs in making research resources visible in science". FORCE11. 2016-03-07. Retrieved 18 April 2016.
  12. "Introducing the Research Resource Identification Initiative at PLOS Biology & PLOS Genetics". PLOS Biologue. 29 January 2015.