Type of site | Search engine |
---|---|
Created by | Allen Institute for Artificial Intelligence |
URL | semanticscholar |
Launched | November 2, 2015 [1] |
Semantic Scholar is a research tool for scientific literature powered by artificial intelligence. It is developed at the Allen Institute for AI and was publicly released in November 2015. [2] Semantic Scholar uses modern techniques in natural language processing to support the research process, for example by providing automatically generated summaries of scholarly papers. [3] The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval. [4]
Semantic Scholar began as a database for the topics of computer science, geoscience, and neuroscience. [5] In 2017, the system began including biomedical literature in its corpus. [5] As of September 2022 [update] , it includes over 200 million publications from all fields of science. [6]
Semantic Scholar provides a one-sentence summary of scientific literature. One of its aims was to address the challenge of reading numerous titles and lengthy abstracts on mobile devices. [7] It also seeks to ensure that the three million scientific papers published yearly reach readers, since it is estimated that only half of this literature is ever read. [8]
Artificial intelligence is used to capture the essence of a paper, generating it through an "abstractive" technique. [3] The project uses a combination of machine learning, natural language processing, and machine vision to add a layer of semantic analysis to the traditional methods of citation analysis, and to extract relevant figures, tables, entities, and venues from papers. [9] [10]
Another key AI-powered feature is Research Feeds, an adaptive research recommender that uses AI to quickly learn what papers users care about reading and recommends the latest research to help scholars stay up to date. It uses a state-of-the-art paper embedding model trained using contrastive learning to find papers similar to those in each Library folder. [11]
Semantic Scholar also offers Semantic Reader, an augmented reader with the potential to revolutionize scientific reading by making it more accessible and richly contextual. [12] Semantic Reader provides in-line citation cards that allow users to see citations with TLDR summaries as they read and skimming highlights that capture key points of a paper so users can digest faster.
In contrast with Google Scholar and PubMed, Semantic Scholar is designed to highlight the most important and influential elements of a paper. [13] The AI technology is designed to identify hidden connections and links between research topics. [14] Like the previously cited search engines, Semantic Scholar also exploits graph structures, which include the Microsoft Academic Knowledge Graph, Springer Nature's SciGraph, and the Semantic Scholar Corpus. [15]
Each paper hosted by Semantic Scholar is assigned a unique identifier called the Semantic Scholar Corpus ID (abbreviated S2CID). The following entry is an example:
Liu, Ying; Gayle, Albert A; Wilder-Smith, Annelies; Rocklöv, Joacim (March 2020). "The reproductive number of COVID-19 is higher compared to SARS coronavirus". Journal of Travel Medicine. 27 (2). doi:10.1093/jtm/taaa021. PMID 32052846. S2CID 211099356.
Semantic Scholar is free to use and unlike similar search engines (i.e. Google Scholar) does not search for material that is behind a paywall. [5] [ citation needed ]
One study compared the index scope of Semantic Scholar to Google Scholar, and found that for the papers cited by secondary studies in computer science, the two indices had comparable coverage, each only missing a handful of the papers. [16]
As of January 2018, following a 2017 project that added biomedical papers and topic summaries, the Semantic Scholar corpus included more than 40 million papers from computer science and biomedicine. [17] In March 2018, Doug Raymond, who developed machine learning initiatives for the Amazon Alexa platform, was hired to lead the Semantic Scholar project. [18] As of August 2019 [update] , the number of included papers metadata (not the actual PDFs) had grown to more than 173 million [19] after the addition of the Microsoft Academic Graph records. [20] In 2020, a partnership between Semantic Scholar and the University of Chicago Press Journals made all articles published under the University of Chicago Press available in the Semantic Scholar corpus. [21] At the end of 2020, Semantic Scholar had indexed 190 million papers. [22]
In 2020, Semantic Scholar reached seven million users per month. [7] It is problematic in that it uses AI, and cannot distinguish therefore between reviews of books and books themselves, frequently listing only the reviews and refusing to include the books reviewed. Furthermore, because it uses science standards—initials plus last name—it cannot distinguish between authors with common names. The field assigned to papers is also not clearly distinguished, listing "History" for Literature papers and similar issues. No citation is necessary for this assertion as users can verify it for themselves: for example, search for "Divine Comedy" (Dante's fourteenth century masterpiece) and get feedback such as a book review appearing as "Art" (see line three, an actual response below).
Terry Eagleton's Divine Comedy
S. Connor Art Theory Now Journal of Literature Critique and... 29 July 2022
This essay reflects on the links between comedy and religion in Terry Eagleton's writing since 2000. It proposes that religious thought provides the same kind of occasion and imperative to...
Artificial intelligence (AI) is the intelligence of machines or software, as opposed to the intelligence of humans or animals. It is a field of study in computer science that develops and studies intelligent machines. Such machines may be called AIs.
Knowledge representation and reasoning is the field of artificial intelligence (AI) dedicated to representing information about the world in a form that a computer system can use to solve complex tasks such as diagnosing a medical condition or having a dialog in a natural language. Knowledge representation incorporates findings from psychology about how humans solve problems, and represent knowledge in order to design formalisms that will make complex systems easier to design and build. Knowledge representation and reasoning also incorporates findings from logic to automate various kinds of reasoning, such as the application of rules or the relations of sets and subsets.
The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.
CiteSeerX is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.
Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes peer-reviewed online academic journals and books, conference papers, theses and dissertations, preprints, abstracts, technical reports, and other scholarly literature, including court opinions and patents.
Deborah Louise McGuinness is an American computer scientist and researcher at Rensselaer Polytechnic Institute (RPI). She is a professor of Computer, Cognitive and Web Sciences, Industrial and Systems Engineering, and an endowed chair in the Tetherless World Constellation, a multidisciplinary research institution within RPI that focuses on the study of theories, methods and applications of the World Wide Web. Her fields of expertise include interdisciplinary data integration, artificial intelligence, specifically in knowledge representation and reasoning, description logics, the semantic web, explanation, and trust.
Semantic analytics, also termed semantic relatedness, is the use of ontologies to analyze content in web resources. This field of research combines text analytics and Semantic Web technologies like RDF. Semantic analytics measures the relatedness of different ontological concepts.
The Web of Science is a paid-access platform that provides access to multiple databases that provide reference and citation data from academic journals, conference proceedings, and other documents in various academic disciplines. Until 1997, it was originally produced by the Institute for Scientific Information. It is currently owned by Clarivate.
Eric Joel Horvitz is an American computer scientist, and Technical Fellow at Microsoft, where he serves as the company's first Chief Scientific Officer. He was previously the director of Microsoft Research Labs, including research centers in Redmond, WA, Cambridge, MA, New York, NY, Montreal, Canada, Cambridge, UK, and Bangalore, India.
The Allen Institute for AI is a 501(c)3 non-profit research institute founded by late Microsoft co-founder and philanthropist Paul Allen in 2014. The institute seeks to conduct high-impact AI research and engineering in service of the common good. Oren Etzioni was appointed by Paul Allen in September 2013 to direct the research at the institute. After leading the organization for nine years, Oren Etzioni stepped down from his role as CEO on September 30, 2022. He was replaced in an interim capacity by the leading researcher of the company's Aristo project, Peter Clark. On June 20, 2023, AI2 announced Ali Farhadi as its next CEO starting July 31, 2023. The company's board formed a search committee for a new CEO. AI2 also has an active office in Tel Aviv, Israel.
A citation graph, in information science and bibliometrics, is a directed graph that describes the citations within a collection of documents.
Maluuba is a Canadian technology company conducting research in artificial intelligence and language understanding. Founded in 2011, the company was acquired by Microsoft in 2017.
The following outline is provided as an overview of and topical guide to natural-language processing:
Oren Etzioni is an American entrepreneur, Professor Emeritus of computer science, and founding CEO of the Allen Institute for Artificial Intelligence (AI2). On June 15, 2022, he announced that he will step down as CEO of AI2 effective September 30, 2022. After that time, he will continue as a board member and advisor. Etzioni will also take the position of Technical Director of the AI2 Incubator.
Meta ULC was a Canadian unlimited liability corporation performing big data analysis of scientific literature, which was acquired by the Chan Zuckerberg Initiative and shut down in 2021, effective in 2022.
Microsoft Academic was a free internet-based academic search engine for academic publications and literature, developed by Microsoft Research, shut down in 2022. At the same time, OpenAlex was launched and claimed to be a successor to Microsoft Academic.
In knowledge representation and reasoning, a knowledge graph is a knowledge base that uses a graph-structured data model or topology to represent and operate on data. Knowledge graphs are often used to store interlinked descriptions of entities – objects, events, situations or abstract concepts – while also encoding the semantics or relationships underlying these entities.
Ontotext GraphDB is a graph database and knowledge discovery tool compliant with RDF and SPARQL and available as a high-availability cluster. Ontotext GraphDB is used in various European research projects.
...the publicly available corpus compiled by Semantic Scholar – a tool set up in 2015 by the Allen Institute for Artificial Intelligence in Seattle, Washington – amounting to around 200 million articles, including preprints.
{{cite book}}
: |journal=
ignored (help)