Type of site | Bibliographic database |
---|---|
Owner | Google |
URL | scholar |
Registration | Optional |
Launched | November 20, 2004 |
Current status | Active |
Google Scholar is a freely accessible web search engine that indexes the full text or metadata of scholarly literature across an array of publishing formats and disciplines. Released in beta in November 2004, the Google Scholar index includes peer-reviewed online academic journals and books, conference papers, theses and dissertations, preprints, abstracts, technical reports, and other scholarly literature, including court opinions and patents. [1]
Google Scholar uses a web crawler, or web robot, to identify files for inclusion in the search results. [2] For content to be indexed in Google Scholar, it must meet certain specified criteria. [3] A 2014 statistical estimate published in PLOS One, using a mark and recapture method, put Google Scholar's coverage at approximately 79–90% of all articles published in English, or about 100 million documents. [4] The same study also estimated how many scholarly documents were available online. Google Scholar has been criticized for not vetting journals and for including predatory journals in its index. [5]
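The mark and recapture idea mentioned above can be illustrated with the classic Lincoln-Petersen estimator, one common formula of this kind. This is a minimal sketch only: the study's exact procedure is not described here, the function names are hypothetical, and the numbers are invented toy values rather than data from the paper.

```python
def lincoln_petersen(first_sample: int, second_sample: int, overlap: int) -> float:
    """Estimate a total population size from two overlapping samples.

    first_sample  -- items found by the first source (the "marked" set)
    second_sample -- items found by the second source
    overlap       -- items found by both sources (the "recaptured" set)
    """
    if overlap <= 0:
        raise ValueError("the two samples must share at least one item")
    return first_sample * second_sample / overlap


def coverage(indexed: int, estimated_total: float) -> float:
    """Fraction of the estimated population that one source has indexed."""
    return indexed / estimated_total


# Invented toy numbers (NOT figures from the PLOS One study):
# source A indexes 900 documents, source B indexes 600, and 540 are in both.
total = lincoln_petersen(first_sample=900, second_sample=600, overlap=540)
print(f"estimated total documents: {total:.0f}")
print(f"coverage of source A: {coverage(900, total):.0%}")
```

The estimator assumes the two samples are drawn independently from the same population; when that assumption fails, the estimate can be biased.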
The University of Michigan Library and other libraries whose collections Google scanned for Google Books and Google Scholar retained copies of the scans and have used them to create the HathiTrust Digital Library. [6] [7]
Google Scholar arose out of a discussion between Alex Verstak and Anurag Acharya, [8] both of whom were then working on building Google's main web index. [9] [10] Their goal was to "make the world's problem solvers 10% more efficient" [11] by allowing easier and more accurate access to scientific knowledge. This goal is reflected in Google Scholar's advertising slogan "Stand on the shoulders of giants", which was taken from an idea attributed to Bernard of Chartres, quoted by Isaac Newton, and is a nod to the scholars who have contributed to their fields over the centuries, providing the foundation for new intellectual achievements. [12] One of the sources for the texts in Google Scholar is the University of Michigan's print collection. [6]
Scholar has gained a range of features over time. In 2006, a citation importing feature was implemented supporting bibliography managers, such as RefWorks, RefMan, EndNote, and BibTeX. In 2007, Acharya announced that Google Scholar had started a program to digitize and host journal articles in agreement with their publishers, an effort separate from Google Books, whose scans of older journals do not include the metadata required for identifying specific articles in specific issues. [13] In 2011, Google removed Scholar from the toolbars on its search pages, [14] making it both less easily accessible and less discoverable for users not already aware of its existence. Around this period, sites with similar features such as CiteSeer, Scirus, and Microsoft Windows Live Academic search were developed. Some of these are now defunct; in 2016, Microsoft launched a new competitor, Microsoft Academic. [15]
A major enhancement was rolled out in 2012, with the possibility for individual scholars to create personal "Scholar Citations profiles". [16] A feature introduced in November 2013 allows logged-in users to save search results into the "Google Scholar library", a personal collection which the user can search separately and organize by tags. [17] Via the "metrics" button, it reveals the top journals in a field of interest, and the articles generating these journals' impact can also be accessed. A metrics feature now supports viewing the impact of whole fields of science and academic journals. [18] Google also included profiles for some posthumous academics, including Albert Einstein [19] and Richard Feynman. [20] For several years, the profile for Isaac Newton [21] indicated he was a "professor at MIT", with a "verified email at mit.edu". [22]
Google Scholar allows users to search for digital or physical copies of articles, whether online or in libraries. [23] It indexes "full-text journal articles, technical reports, preprints, theses, books, and other documents, including selected Web pages that are deemed to be 'scholarly.'" [24] Because many of Google Scholar's search results link to commercial journal articles, most people will be able to access only an abstract and the citation details of an article, and will have to pay a fee to access the entire article. [24] The most relevant results for the searched keywords are listed first, in order of the author's ranking, the number of references that are linked to the article and their relevance to other scholarly literature, and the ranking of the publication in which the article appears. [25]
Using its "group of" feature, it shows the available links to journal articles. In the 2005 version, this feature provided a link to both subscription-access versions of an article and to free full-text versions of articles; for most of 2006, it provided links to only the publishers' versions. Since December 2006, it has provided links to both published versions and major open access repositories, including all those posted on individual faculty web pages and other unstructured sources identified by similarity. On the other hand, Google Scholar does not allow users to filter explicitly between toll access and open access resources, a feature offered by Unpaywall and the tools which embed its data, such as Web of Science, Scopus and Unpaywall Journals, used by libraries to calculate the real costs and value of their collections. [26]
Through its "cited by" feature, Google Scholar provides access to abstracts of articles that have cited the article being viewed. [27] It is this feature in particular that provides the citation indexing previously only found in CiteSeer, Scopus, and Web of Science. Google Scholar also provides links so that citations can be either copied in various formats or imported into user-chosen reference managers such as Zotero.
"Scholar Citations profiles" are public author profiles that are editable by authors themselves. [16] Individuals, logging on through a Google account with a bona fide address usually linked to an academic institution, can now create their own page giving their fields of interest and citations. Google Scholar automatically calculates and displays the individual's total citation count, h-index, and i10-index. According to Google, "three-quarters of Scholar search results pages ... show links to the authors' public profiles" as of August 2014. [16]
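The three figures shown on a profile can be computed from a list of per-paper citation counts. The sketch below uses the standard definitions (the h-index is the largest h such that h papers have at least h citations each; the i10-index is the number of papers with at least 10 citations). It is an illustration, not Google's implementation, and the function names and sample counts are hypothetical.

```python
from typing import Iterable


def h_index(citations: Iterable[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citations: Iterable[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Invented citation counts for a hypothetical author's seven papers.
papers = [25, 12, 10, 8, 3, 1, 0]
print("total citations:", sum(papers))
print("h-index:", h_index(papers))
print("i10-index:", i10_index(papers))
```

For these sample counts, four papers have at least four citations each but there are not five papers with at least five, so the h-index is 4, while three papers reach the 10-citation threshold.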
Through its "Related articles" feature, Google Scholar presents a list of closely related articles, ranked primarily by how similar these articles are to the original result, but also taking into account the relevance of each paper. [28]
Google Scholar's legal database of US cases is extensive. Users can search and read published opinions of US state appellate and supreme court cases since 1950, US federal district, appellate, tax, and bankruptcy courts since 1923 and US Supreme Court cases since 1791. [27] Google Scholar embeds clickable citation links within the case and the How Cited tab allows lawyers to research prior case law and the subsequent citations to the court decision. [29]
While most academic databases and search engines allow users to select one factor (e.g. relevance, citation counts, or publication date) to rank results, Google Scholar ranks results with a combined ranking algorithm in a "way researchers do, weighing the full text of each article, the author, the publication in which the article appears, and how often the piece has been cited in other scholarly literature". [25] Research has shown that Google Scholar puts high weight especially on citation counts, [30] as well as words included in a document's title. [31] In searches by author or year, the first search results are often highly cited articles, as the number of citations is highly determinant, whereas in keyword searches the number of citations is probably the factor with the most weight, but other factors also participate. [32]
Some searchers found Google Scholar to be of comparable quality and utility to subscription-based databases when looking at citations of articles in some specific journals. [33] [34] The reviews recognize that its "cited by" feature in particular poses serious competition to Scopus and Web of Science. A study looking at the biomedical field found citation information in Google Scholar to be "sometimes inadequate, and less often updated". [35] The coverage of Google Scholar may vary by discipline compared to other general databases. [36] Google Scholar strives to include as many journals as possible, including predatory journals, which may lack academic rigor. Specialists on predatory journals say that these kinds of journals "have polluted the global scientific record with pseudo-science" and "that Google Scholar dutifully and perhaps blindly includes in its central index." [37]
Google Scholar does not publish a list of journals crawled or publishers included, and the frequency of its updates is uncertain. Bibliometric evidence suggests Google Scholar's coverage of the sciences and social sciences is competitive with other academic databases; as of 2017, Scholar's coverage of the arts and humanities has not been investigated empirically and Scholar's utility for disciplines in these fields remains ambiguous. [38] Especially early on, some publishers did not allow Scholar to crawl their journals. Elsevier journals have been included since mid-2007, when Elsevier began to make most of its ScienceDirect content available to Google Scholar and Google's web search. [39] However, a 2014 study [4] estimates that Google Scholar can find almost 90% (approximately 100 million) of all scholarly documents on the Web written in English. Large-scale longitudinal studies have found between 40 and 60 percent of scientific articles are available in full text via Google Scholar links. [40]
Google Scholar puts high weight on citation counts in its ranking algorithm and has therefore been criticized for strengthening the Matthew effect; [30] as highly cited papers appear in top positions they gain more citations, while new papers hardly appear in top positions and therefore get less attention from the users of Google Scholar and hence fewer citations. The Google Scholar effect is a phenomenon in which some researchers pick and cite works appearing in the top results on Google Scholar regardless of their contribution to the citing publication, because they automatically assume these works' credibility and believe that editors, reviewers, and readers expect to see these citations. [41] Google Scholar has problems identifying publications on the arXiv preprint server correctly. Punctuation characters in titles produce wrong search results, and authors are assigned to wrong papers, which leads to erroneous additional search results. Some search results are even given without any comprehensible reason. [42] [43]
Google Scholar is vulnerable to spam. [44] [45] Researchers from the University of California, Berkeley and Otto-von-Guericke University Magdeburg demonstrated that citation counts on Google Scholar can be manipulated and that complete nonsense articles created with SCIgen were indexed within Google Scholar. [46] These researchers concluded that citation counts from Google Scholar should be used with care, especially when used to calculate performance metrics such as the h-index or impact factor, which is in itself a poor predictor of article quality. [47] Google Scholar started computing an h-index in 2012 with the advent of individual Scholar pages. Several downstream packages like Harzing's Publish or Perish also use its data. [48] The practicality of manipulating h-index calculators by spoofing Google Scholar was demonstrated in 2010 by Cyril Labbé from Joseph Fourier University, who managed to rank "Ike Antkare" ahead of Albert Einstein by means of a large set of SCIgen-produced documents citing each other (effectively an academic link farm). [49] As of 2010, Google Scholar was not able to shepardize case law, as Lexis could. [50] Unlike other indexes of academic work such as Scopus and Web of Science, Google Scholar does not maintain an Application Programming Interface that may be used to automate data retrieval. Use of web scrapers to obtain the contents of search results is also severely restricted by the implementation of CAPTCHAs. Google Scholar does not display or export Digital Object Identifiers (DOIs), [51] a de facto standard implemented by all major academic publishers to uniquely identify and refer to individual pieces of academic work. [52]
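Why a set of documents citing each other inflates the h-index can be seen with a small calculation. The sketch below is a simplified toy model, not a reconstruction of the 2010 experiment: it assumes a hypothetical author with n generated documents, each of which cites every other document in the set, and all names are illustrative.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def link_farm_citations(n_documents: int) -> list:
    """Citation counts when every document cites all the others.

    Assumption for illustration: no self-citations and no outside
    citations, so each document receives exactly n_documents - 1.
    """
    return [n_documents - 1] * n_documents


# With 101 mutually citing documents, each one is cited 100 times.
farm = link_farm_citations(101)
print("citations per document:", farm[0])
print("resulting h-index:", h_index(farm))
```

In this model the h-index grows linearly with the number of generated documents, which is why citation counts taken from an unvetted index need to be treated with care.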
Search engine optimization (SEO) for traditional web search engines such as Google has been popular for many years. For several years, SEO has also been applied to academic search engines such as Google Scholar. [53] SEO for academic articles is also called "academic search engine optimization" (ASEO) and defined as "the creation, publication, and modification of scholarly literature in a way that makes it easier for academic search engines to both crawl it and index it". [53] ASEO has been adopted by several organizations, among them Elsevier, [54] OpenScience, [55] Mendeley, [56] and SAGE Publishing, [57] to optimize their articles' rankings in Google Scholar. ASEO has negatives. [46]
CiteSeerX is a public search engine and digital library for scientific and academic papers, primarily in the fields of computer and information science.
An academic journal or scholarly journal is a periodical publication in which scholarship relating to a particular academic discipline is published. They serve as permanent and transparent forums for the presentation, scrutiny, and discussion of research. They nearly universally require peer review for research articles or other scrutiny from contemporaries competent and established in their respective fields.
A citation index is a kind of bibliographic index, an index of citations between publications, allowing the user to easily establish which later documents cite which earlier documents. A form of citation index is first found in 12th-century Hebrew religious literature. Legal citation indexes are found in the 18th century and were made popular by citators such as Shepard's Citations (1873). In 1961, Eugene Garfield's Institute for Scientific Information (ISI) introduced the first citation index for papers published in academic journals, first the Science Citation Index (SCI), and later the Social Sciences Citation Index (SSCI) and the Arts and Humanities Citation Index (AHCI). American Chemical Society converted its printed Chemical Abstract Service into internet-accessible SciFinder in 2008. The first automated citation indexing was done by CiteSeer in 1997 and was patented. Other sources for such data include Google Scholar, Microsoft Academic, Elsevier's Scopus, and the National Institutes of Health's iCite.
Scopus is a scientific abstract and citation database, launched by the academic publisher Elsevier in 2004 as a competitor to the older Web of Science. The ensuing competition between the two databases has been characterized as "intense" and is considered to significantly benefit their users in terms of continuous improvement in coverage and search/analysis capabilities, but not in price. The free database The Lens completes the triad of main universal academic research databases.
Bibliometrics is the application of statistical methods to the study of bibliographic data, especially in scientific and library and information science contexts, and is closely associated with scientometrics to the point that both fields largely overlap.
Scientometrics is a subfield of informetrics that studies quantitative aspects of scholarly literature. Major research issues include the measurement of the impact of research papers and academic journals, the understanding of scientific citations, and the use of such measurements in policy and management contexts. In practice there is a significant overlap between scientometrics and other scientific fields such as information systems, information science, science of science policy, sociology of science, and metascience. Critics have argued that overreliance on scientometrics has created a system of perverse incentives, producing a publish or perish environment that leads to low-quality research.
The Arts and Humanities Citation Index (AHCI), also known as Arts and Humanities Search, is a citation index, with abstracting and indexing for more than 1,700 arts and humanities academic journals, and coverage of disciplines that includes social and natural science journals. Part of this database is derived from Current Contents.
Scirus was a comprehensive science-specific search engine, first launched in 2001. Like CiteSeerX and Google Scholar, it was focused on scientific information. Unlike CiteSeerX, Scirus was not only for computer sciences and IT, and not all of the results included full text. It also sent its scientific search results to Scopus, an abstract and citation database covering scientific research output globally. Scirus was owned and operated by Elsevier. In 2013, an announcement on the Scirus homepage stated that the site would be retired in 2014.
Citation impact or citation rate is a measure of how many times an academic journal article or book or author is cited by other articles, books or authors. Citation counts are interpreted as measures of the impact or influence of academic work and have given rise to the field of bibliometrics or scientometrics, specializing in the study of patterns of academic impact through citation analysis. The importance of journals can be measured by the average citation rate, the ratio of the number of citations to the number of articles published within a given time period and in a given index, such as the journal impact factor or the citescore. It is used by academic institutions in decisions about academic tenure, promotion and hiring, and hence also used by authors in deciding which journal to publish in. Citation-like measures are also used in other fields that do ranking, such as Google's PageRank algorithm, software metrics, college and university rankings, and business performance indicators.
The h-index is an author-level metric that measures both the productivity and citation impact of the publications, initially used for an individual scientist or scholar. The h-index correlates with success indicators such as winning the Nobel Prize, being accepted for research fellowships and holding positions at top universities. The index is based on the set of the scientist's most cited papers and the number of citations that they have received in other publications. The index has more recently been applied to the productivity and impact of a scholarly journal as well as a group of scientists, such as a department or university or country. The index was suggested in 2005 by Jorge E. Hirsch, a physicist at UC San Diego, as a tool for determining theoretical physicists' relative quality and is sometimes called the Hirsch index or Hirsch number.
The Science Citation Index Expanded is a citation index originally produced by the Institute for Scientific Information (ISI) and created by Eugene Garfield.
The Social Sciences Citation Index (SSCI) is a commercial citation index product of Clarivate Analytics. It was originally developed by the Institute for Scientific Information from the Science Citation Index. The Social Sciences Citation Index is a multidisciplinary index which indexes over 3,400 journals across 58 social science disciplines – 1985 to present, and it has 122 million cited references – 1900 to present. It also includes a range of 3,500 selected items from some of the world's finest scientific and technical journals. It has a range of useful search functions such as 'cited reference searching', searching by author, subject, or title. Whilst the Social Sciences Citation Index provides extensive support in bibliographic analytics and research, a number of academic scholars have expressed criticisms relating to ideological bias and its English-dominant publishing nature.
The Web of Science is a paid-access platform that provides access to multiple databases that provide reference and citation data from academic journals, conference proceedings, and other documents in various academic disciplines. Until 1997, it was originally produced by the Institute for Scientific Information. It is currently owned by Clarivate, and currently contains 79 million records in the core collection and 171 million records on the platform.
ResearcherID is an identifying system for scientific authors. The system was introduced in January 2008 by Thomson Reuters Corporation.
In scholarly and scientific publishing, altmetrics are non-traditional bibliometrics proposed as an alternative or complement to more traditional citation impact metrics, such as impact factor and h-index. The term altmetrics was proposed in 2010, as a generalization of article level metrics, and has its roots in the #altmetrics hashtag. Although altmetrics are often thought of as metrics about articles, they can be applied to people, journals, books, data sets, presentations, videos, source code repositories, web pages, etc.
Jurn is a free online search tool for the finding and downloading of free full-text scholarly works. It was established by David Haden in a public online open beta version in February 2009, initially for finding open access electronic journal articles in the arts and humanities. An additional public directory of web links to the indexed journals was placed online in mid 2009. The Jurn search service and directory has since been continually updated and cleaned. In March 2014 Jurn expanded to index topics in science, biomedicine, business and economics, plus selected university repository services for open access deposit papers and full-text theses. Jurn is powered by a Google Custom Search Engine (CSE) and is run without any adverts.
Semantic Scholar is a research tool for scientific literature powered by artificial intelligence. It is developed at the Allen Institute for AI and was publicly released in November 2015. Semantic Scholar uses modern techniques in natural language processing to support the research process, for example by providing automatically generated summaries of scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval.
Microsoft Academic was a free internet-based academic search engine for academic publications and literature, developed by Microsoft Research in 2016 as a successor of Microsoft Academic Search. Microsoft Academic was shut down in 2022. Both OpenAlex and The Lens claim to be successors to Microsoft Academic.
Digital Science is a technology company with its headquarters in London, United Kingdom. The company focuses on strategic investments into startup companies that support the research lifecycle.