OpenCitations

History: 2010-present
Access cost: Free
Website: opencitations.net

OpenCitations, established in 2010, [1] [2] is a project that aims to publish open bibliographic citation information in RDF. [3] In the process it produces the OpenCitations Corpus, a citation database. [4]

Datasets

OpenCitations publishes the following datasets, which encompass bibliographic data, citation metadata, and in-text reference data. The datasets can be accessed via SPARQL, through a REST API, as dumps on Figshare, as individual bibliographic entities, or with OSCAR (the OpenCitations RDF Search Application) or Lucinda (the OpenCitations RDF Resource Browser). [5]
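As a minimal sketch of what REST access could look like, the helper below only builds a request URL for the citations of a given DOI; it does not contact any server. The base URL, the `citations` path segment, and the function name `build_citations_url` are assumptions made for illustration and should be checked against the current OpenCitations API documentation.

```python
from urllib.parse import quote

# Assumed base URL and path layout (not taken from the article);
# verify against the current OpenCitations API documentation.
ASSUMED_API_BASE = "https://opencitations.net/index/coci/api/v1"


def build_citations_url(doi: str, base: str = ASSUMED_API_BASE) -> str:
    """Return the URL that would list citations to `doi` under the assumed layout."""
    cleaned = doi.strip()
    # Every DOI starts with the directory indicator "10."
    if not cleaned.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep the slash between DOI prefix and suffix; percent-encode the rest.
    return f"{base}/citations/{quote(cleaned, safe='/')}"
```

Calling `build_citations_url("10.1007/s11192-020-03690-4")` yields a URL ending in `/citations/10.1007/s11192-020-03690-4`, which could then be fetched with any HTTP client.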

OpenCitations Corpus

The OpenCitations Corpus (OCC) is an open repository of scholarly citation data. The repository is released under the CC0 public domain dedication so that the scholarly citation data is open to all. [2]

As of March 19, 2022, the OCC has ingested the references from 326,743 citing bibliographic resources and contains information about 13,964,148 citation links to 7,565,367 cited resources. [4]

OpenCitations Indexes

The OpenCitations Indexes are collections of citations that treat each citation as a first-class data object, one that carries citation metadata as well as the identifiers of the citing and cited works. [6] For example, COCI is the OpenCitations Index of Crossref open DOI-to-DOI citations. A 2021 comparison with other citation tools found that COCI was the smallest in coverage, [7] and a 2020 study found that 54% of the citation links in Web of Science were also in COCI. [8]
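The idea of a citation as a first-class data object, and the kind of overlap measurement reported above, can be sketched as follows. This is an illustration under stated assumptions, not the OpenCitations data model: the class `Citation`, its field names, and the function `share_covered` are hypothetical, and the DOIs in the usage note are made up.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """A citation as its own record (hypothetical field names)."""

    citing: str         # identifier, e.g. DOI, of the citing work
    cited: str          # identifier of the cited work
    creation: str = ""  # citation metadata: date of the citing work, if known

    def link(self) -> tuple[str, str]:
        # DOIs are case-insensitive, so normalise before comparing links.
        return (self.citing.lower(), self.cited.lower())


def share_covered(reference: list[Citation], index: list[Citation]) -> float:
    """Fraction of distinct citation links in `reference` also present in `index`."""
    ref_links = {c.link() for c in reference}
    if not ref_links:
        return 0.0
    idx_links = {c.link() for c in index}
    return len(ref_links & idx_links) / len(ref_links)
```

With a reference set of four links, such as `Citation("10.1000/a", "10.1000/b")` and three others, and an index holding two of them, `share_covered` returns 0.5. A coverage study of the kind cited above computes the same ratio over far larger sets.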

Open Biomedical Citations in Context Corpus

The Open Biomedical Citations in Context Corpus (CCC) is a database of citations providing in-text references, [9] extending OpenCitations records with in-text reference pointer information.

References

  1. "OpenCitations - About". opencitations.net. Retrieved 2019-05-28.
  2. Peroni, Silvio; Shotton, David; Vitali, Fabio (2017). "One year of the OpenCitations Corpus: Releasing RDF-based scholarly citation data into the Public Domain". Proceedings of the 16th International Semantic Web Conference (ISWC 2017). 184 (192). doi:10.1007/978-3-319-68204-4_19. Retrieved 2019-05-28.
  3. Peroni, Silvio; Shotton, David (February 2020). "OpenCitations, an infrastructure organization for open scholarship". Quantitative Science Studies. 1 (1): 428–444. arXiv:1906.11964. doi:10.1162/QSS_A_00023. ISSN 2641-3337. Wikidata Q86246929.
  4. Shotton, David (January 29, 2018). OpenCitations and the Initiative for Open Citations: A Clarification.
  5. Di Giambattista, Chiara (2022-09-02). "OpenCitations Access Tokens: how they work and why they are important". opencitations.hypothesis.org. Retrieved 2022-11-05.
  6. Shotton, David. Citations as First-Class Data Entities: Introduction. Retrieved 2022-11-05.
  7. Martín-Martín, Alberto; Thelwall, Mike; Orduna-Malea, Enrique; Delgado López-Cózar, Emilio (January 2021). "Google Scholar, Microsoft Academic, Scopus, Dimensions, Web of Science, and OpenCitations' COCI: a multidisciplinary comparison of coverage via citations". Scientometrics. 126 (1): 871–906. doi:10.1007/s11192-020-03690-4. hdl:10251/192850.
  8. Chudlarský, Tomáš; Dvořák, Jan (1 November 2020). "Can Crossref Citations Replace Web of Science for Research Evaluation? The Share of Open Citations". Journal of Data and Information Science. 5 (4): 35–42. doi:10.2478/jdis-2020-0037. hdl:10467/90574.
  9. "Open Biomedical Citations in Context Corpus". wellcome.org. Retrieved 2022-11-05.