General Index (academia)

The General Index is a publicly available index to a collection of 107 million academic journal articles. [1] [2] It consists of a table of n-grams derived from the full text of the articles along with tables of associated keywords and metadata. [3] It is intended to facilitate computerized analysis of the scientific literature, which has been hindered by widespread copyright restrictions limiting access by researchers to the full text.
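To illustrate what a table of n-grams is, the following minimal sketch counts the word n-grams in a short piece of text. It is not the pipeline used to build the General Index: the function name `extract_ngrams`, the `max_n` parameter, and the simple lowercase tokenization are all assumptions made for illustration.

```python
import re
from collections import Counter


def extract_ngrams(text, max_n=3):
    """Count word n-grams of length 1..max_n in a piece of text.

    Illustrative only: the tokenization (lowercase, runs of letters and
    digits) and the default max_n are assumptions, not the General
    Index's actual settings.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of n tokens across the token list.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


sample = "Open access to research. Open access matters."
ngrams = extract_ngrams(sample, max_n=2)
print(ngrams["open access"])  # the bigram "open access" occurs twice
```

Each distinct n-gram and its count would correspond to one row of such a table; a real index would also record which article each n-gram came from.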

The initial version, comprising the raw database tables without any search engine front-end, was released by the Internet Archive on October 7, 2021. [1]

Related Research Articles

CiteSeerX is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.

Internet Archive: American non-profit digital archive

The Internet Archive is an American digital library with the stated mission of "universal access to all knowledge". It provides free public access to collections of digitized materials, including websites, software applications/games, music, movies/videos, moving images, and millions of books. In addition to its archiving function, the Archive is an activist organization, advocating a free and open Internet. As of January 1, 2023, the Internet Archive holds over 36 million books and texts, 11.6 million movies, videos and TV shows and clips, 950 thousand software programs, 15 million audio files, 4.5 million images, 251 thousand concerts, and 780 billion web pages in the Wayback Machine.

JSTOR: Distributor of eBooks and other digital media

JSTOR is a digital library founded in 1995 in New York City. Originally containing digitized back issues of academic journals, it now encompasses books and other primary sources as well as current issues of journals in the humanities and social sciences. It provides full-text searches of almost 2,000 journals.

The SAO/NASA Astrophysics Data System (ADS) is an online database of over 16 million astronomy and physics papers from both peer reviewed and non-peer reviewed sources. Abstracts are available free online for almost all articles, and full scanned articles are available in Graphics Interchange Format (GIF) and Portable Document Format (PDF) for older articles. It was developed by the National Aeronautics and Space Administration (NASA), and is managed by the Smithsonian Astrophysical Observatory.

PubMed is a free search engine accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez system of information retrieval.

Questia Online Library: Former online research library

Questia was an online commercial digital repository of books and articles with an academic orientation and a particular emphasis on books and journal articles in the humanities and social sciences. The full text of all Questia books and articles was available to subscribers; the site also included integrated research tools. It ceased operations on December 21, 2020.

Google Scholar: Academic search service by Google

Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes peer-reviewed online academic journals and books, conference papers, theses and dissertations, preprints, abstracts, technical reports, and other scholarly literature, including court opinions and patents.

Comparison of relational database management systems is a set of tables comparing general and technical information for a number of relational database management systems. Unless otherwise specified, the comparisons are based on the stable versions without any add-ons, extensions or external programs.

Budapest Open Access Initiative

The Budapest Open Access Initiative (BOAI) is a public statement of principles relating to open access to the research literature, which was released to the public on February 14, 2002. It arose from a conference convened in Budapest by the Open Society Institute on December 1–2, 2001 to promote open access, which at that time was also known as Free Online Scholarship. This small gathering of individuals has been recognised as one of the major defining events of the open access movement. As of 2021, the text of the initiative had been translated into 13 languages.

PubMed Central (PMC) is a free digital repository that archives open access full-text scholarly articles that have been published in biomedical and life sciences journals. As one of the major research databases developed by the National Center for Biotechnology Information (NCBI), PubMed Central is more than a document repository. Submissions to PMC are indexed and formatted for enhanced metadata, medical ontology, and unique identifiers which enrich the XML structured data for each article. Content within PMC can be linked to other NCBI databases and accessed via Entrez search and retrieval systems, further enhancing the public's ability to discover, read and build upon its biomedical knowledge.

A bibliographic database is a database of bibliographic records, an organized digital collection of references to published literature, including journal and newspaper articles, conference proceedings, reports, government and legal publications, patents, books, etc. In contrast to library catalogue entries, a large proportion of the bibliographic records in bibliographic databases describe articles, conference papers, etc., rather than complete monographs, and they generally contain very rich subject descriptions in the form of keywords, subject classification terms, or abstracts.

Web of Science: Online subscription index of citations

The Web of Science is a paid-access platform that provides access to multiple databases that provide reference and citation data from academic journals, conference proceedings, and other documents in various academic disciplines. It was originally produced by the Institute for Scientific Information. It is currently owned by Clarivate.

CAB Direct is a source of references for the applied life sciences. It is an access point for multiple bibliographic databases produced by CABI and incorporates two of them: CAB Abstracts and Global Health. The combined database contains 8.8 million bibliographic records, including 85,000 full-text articles, as well as noteworthy literature reviews, news articles and reports.

CORE (research service)

CORE is a service provided by the Knowledge Media Institute based at The Open University, United Kingdom. The goal of the project is to aggregate all open access content distributed across different systems, such as repositories and open access journals, enrich this content using text mining and data mining, and provide free access to it through a set of services. The CORE project also aims to promote open access to scholarly outputs. CORE works closely with digital libraries and institutional repositories.

NewsBank is a news database resource that provides archives of media publications as reference materials to libraries.

Sci-Hub is a shadow library website that provides free access to millions of research papers and books, without regard to copyright, by bypassing publishers' paywalls in various ways. Sci-Hub was founded in Kazakhstan by Alexandra Elbakyan in 2011, in response to the high cost of research papers behind paywalls. The site is extensively used worldwide. In September 2019, the site's owners said that it served approximately 400,000 requests per day. Sci-Hub has been estimated to contain 95% of all scholarly publications with issued DOI numbers. Sci-Hub reported on 15 July 2022 that its collection comprises 88,343,822 files.

Semantic Scholar is an artificial intelligence–powered research tool for scientific literature developed at the Allen Institute for AI and publicly released in November 2015. It uses advances in natural language processing to provide summaries for scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human-computer interaction, and information retrieval.

Clarivate: American analytics company

Clarivate Plc is a British-American publicly traded analytics company that operates a collection of subscription-based services, in the areas of bibliometrics and scientometrics; business / market intelligence, and competitive profiling for pharmacy and biotech, patents, and regulatory compliance; trademark protection, and domain and brand protection. In the academy and the scientific community, Clarivate is known for being the company which calculates the impact factor, using data from its Web of Science product family, that also includes services/applications such as Publons, EndNote, EndNote Click, and ScholarOne. Its other product families are Cortellis, DRG, CPA Global, Derwent, MarkMonitor, CompuMark, and Darts-ip, and also the various ProQuest products and services.

The Internet Archive Scholar is a scholarly search engine created by the Internet Archive in 2020. At that time it contained some 25 million research articles with full-text access. The available materials come from three kinds of sources: content identified by the Wayback Machine, digitized print material, and other sources such as user uploads and collections from partnerships.

References

  1. Else, Holly (2021-10-26). "Giant, free index to world's research papers released online". Nature. doi:10.1038/d41586-021-02895-8. PMID 34703019. S2CID 240000069.
  2. "'The General Index': New tool allows you to search 107 million research papers for free". Big Think. Retrieved 2021-11-12.
  3. "Malamud's General Index: Research Gist, No Slap On The Wrist". Hackaday. 2021-11-03. Retrieved 2021-11-12.