Relevance (information retrieval)

In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user. Relevance may include concerns such as timeliness, authority or novelty of the result.

History

The concern with the problem of finding relevant information dates back at least to the first publication of scientific journals in the 17th century.[citation needed]

The formal study of relevance began in the 20th century with the study of what would later be called bibliometrics. In the 1930s and 1940s, S. C. Bradford used the term "relevant" to characterize articles relevant to a subject (cf. Bradford's law). In the 1950s, the first information retrieval systems emerged, and researchers noted the retrieval of irrelevant articles as a significant concern. In 1958, B. C. Vickery made the concept of relevance explicit in an address at the International Conference on Scientific Information. [1]

Since 1958, information scientists have explored and debated definitions of relevance. A particular focus of the debate was the distinction between "relevance to a subject" or "topical relevance" and "user relevance". [1]

Evaluation

The information retrieval community has emphasized the use of test collections and benchmark tasks to measure topical relevance, starting with the Cranfield Experiments of the early 1960s and culminating in the TREC evaluations that continue to this day as the main evaluation framework for information retrieval research. [2]

To evaluate how well an information retrieval system retrieves topically relevant results, the relevance of the retrieved results must be quantified. In Cranfield-style evaluations, this typically involves assigning a relevance level to each retrieved result, a process known as relevance assessment. Relevance levels can be binary (indicating that a result is relevant or not relevant) or graded (indicating the degree to which a result's topic matches the information need). Once relevance levels have been assigned to the retrieved results, information retrieval performance measures can be used to assess the quality of a retrieval system's output.
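
As a minimal illustration of how assigned relevance levels feed into performance measures, the following sketch computes precision at k from binary judgments and normalized discounted cumulative gain (nDCG) from graded judgments. The document identifiers, grades, and ranking are hypothetical, and the code is a standalone sketch rather than part of any particular evaluation toolkit.

```python
import math

# Hypothetical graded relevance judgments for one query
# (0 = not relevant, 1 = partially relevant, 2 = highly relevant).
judgments = {"doc1": 2, "doc2": 0, "doc3": 1, "doc4": 2, "doc5": 0}

# Ranked list returned by a hypothetical retrieval system.
ranking = ["doc3", "doc1", "doc5", "doc2", "doc4"]

def precision_at_k(ranking, judgments, k):
    """Fraction of the top-k results judged relevant (binary: grade > 0)."""
    relevant = sum(1 for doc in ranking[:k] if judgments.get(doc, 0) > 0)
    return relevant / k

def dcg_at_k(ranking, judgments, k):
    """Discounted cumulative gain: graded relevance discounted by log2 of rank."""
    return sum(
        judgments.get(doc, 0) / math.log2(rank + 2)  # rank is 0-based here
        for rank, doc in enumerate(ranking[:k])
    )

def ndcg_at_k(ranking, judgments, k):
    """DCG normalized by the DCG of an ideal (best possible) ranking."""
    ideal = sorted(judgments, key=judgments.get, reverse=True)
    ideal_dcg = dcg_at_k(ideal, judgments, k)
    return dcg_at_k(ranking, judgments, k) / ideal_dcg if ideal_dcg else 0.0

print(precision_at_k(ranking, judgments, k=3))  # 0.666...
print(ndcg_at_k(ranking, judgments, k=3))
```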

In contrast to this focus solely on topical relevance, the information science community has emphasized user studies that consider user relevance. [3] These studies often focus on aspects of human-computer interaction (see also human-computer information retrieval).

Clustering and relevance

The cluster hypothesis, proposed by C. J. van Rijsbergen in 1979, asserts that two documents that are similar to each other have a high likelihood of being relevant to the same information need. With respect to the embedding similarity space, the cluster hypothesis can be interpreted globally or locally. [4] The global interpretation assumes that there exists some fixed set of underlying topics derived from inter-document similarity. These global clusters or their representatives can then be used to relate the relevance of two documents (e.g. two documents in the same cluster should both be relevant to the same request). Methods in this spirit include cluster searching and cluster-based retrieval [5] [6] and retrieval with cluster-based language models. [7]
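
A minimal sketch of the global interpretation, assuming documents and the query are already represented as embedding vectors (the random vectors and the use of scikit-learn's k-means are illustrative assumptions, not the specific methods cited above): documents are clustered once, and a query is answered by ranking only the documents in the cluster whose centroid is most similar to the query.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical document embeddings (rows) and a query embedding,
# e.g. produced by any vector-space or neural representation.
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(100, 64))
query_vector = rng.normal(size=64)

def normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

docs = normalize(doc_vectors)
query = normalize(query_vector)

# Global interpretation: derive a fixed set of clusters from
# inter-document similarity, then retrieve via cluster representatives.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=0).fit(docs)
centroids = normalize(kmeans.cluster_centers_)

# Pick the cluster whose centroid is most similar to the query ...
best_cluster = int(np.argmax(centroids @ query))

# ... and rank only the documents belonging to that cluster.
member_ids = np.where(kmeans.labels_ == best_cluster)[0]
scores = docs[member_ids] @ query
ranked = member_ids[np.argsort(-scores)]
print(ranked[:10])
```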

A second interpretation, most notably advanced by Ellen Voorhees, [8] focuses on the local relationships between documents. The local interpretation avoids having to model the number or size of clusters in the collection and allows relevance to be captured at multiple scales. Methods in this spirit include retrieval over nearest-neighbour clusters, [8] spreading activation, [9] relevance propagation, [10] [12] and document expansion. [11]

Local methods require an accurate and appropriate document similarity measure.
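
A rough sketch in the spirit of local methods such as relevance propagation, under the assumptions that documents are represented as vectors, that cosine similarity serves as the document similarity measure, and that initial retrieval scores are available: each document's score is smoothed with the scores of its nearest neighbours. This is an illustrative simplification, not the algorithm of any of the cited papers.

```python
import numpy as np

def propagate_scores(doc_vectors, initial_scores, k=5, alpha=0.5):
    """Smooth each document's retrieval score with the scores of its
    k most similar documents (cosine similarity), illustrating how
    relevance can be propagated through local inter-document similarity."""
    docs = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    sim = docs @ docs.T                    # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)         # ignore self-similarity
    new_scores = np.empty_like(initial_scores)
    for i in range(len(initial_scores)):
        neighbours = np.argsort(-sim[i])[:k]   # indices of k nearest neighbours
        new_scores[i] = (alpha * initial_scores[i]
                         + (1 - alpha) * initial_scores[neighbours].mean())
    return new_scores

# Hypothetical data: 50 documents, 32-dimensional vectors, random scores.
rng = np.random.default_rng(1)
vectors = rng.normal(size=(50, 32))
scores = rng.uniform(size=50)
print(propagate_scores(vectors, scores)[:5])
```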

Problems and alternatives

The documents which are most relevant are not necessarily those which are most useful to display on the first page of search results. For example, two duplicate documents might individually be considered quite relevant, but it is only useful to display one of them. A measure called "maximal marginal relevance" (MMR) has been proposed to address this shortcoming. It considers the relevance of each document only in terms of how much new information it brings given the previous results. [13]
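
At each step, MMR selects the candidate d that maximizes λ·sim(d, q) − (1 − λ)·max sim(d, d′) over the already selected documents d′, trading off relevance to the query against redundancy with previously chosen results. A minimal sketch follows, assuming vector representations and cosine similarity; these are illustrative assumptions rather than the exact setup of the original paper.

```python
import numpy as np

def mmr_rerank(query, doc_vectors, lam=0.7, top_k=10):
    """Maximal marginal relevance: greedily pick documents that are similar
    to the query but dissimilar to documents already selected."""
    q = query / np.linalg.norm(query)
    docs = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    query_sim = docs @ q                   # relevance of each doc to the query
    doc_sim = docs @ docs.T                # pairwise document similarities

    selected, candidates = [], list(range(len(docs)))
    while candidates and len(selected) < top_k:
        def mmr_score(i):
            redundancy = max(doc_sim[i][j] for j in selected) if selected else 0.0
            return lam * query_sim[i] - (1 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical embeddings for a query and 20 candidate documents.
rng = np.random.default_rng(2)
print(mmr_rerank(rng.normal(size=16), rng.normal(size=(20, 16))))
```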

In some cases, a query may have an ambiguous interpretation, or a variety of potential responses. Providing a diversity of results can be a consideration when evaluating the utility of a result set. [14]

See also

Information retrieval
Personal information management
Browsing
Backlink
Recommender system
Latent semantic analysis
Automatic image annotation
Bibliographic coupling
Exploratory search
Relevance feedback
Query expansion
Web search query
Human–computer information retrieval
Concept search
Cranfield experiments
XML retrieval
Learning to rank
David Karger
Evaluation measures (information retrieval)
ChengXiang Zhai

References

  1. Mizzaro, Stefano (1997). "Relevance: The whole history" (PDF). Journal of the American Society for Information Science. 48 (9): 810–832. doi:10.1002/(SICI)1097-4571(199709)48:9<810::AID-ASI6>3.0.CO;2-U.
  2. Sanderson, M.; Clough, P. (2013-06-15). "Evaluating the performance of information retrieval systems using test collections". informationr.net. Retrieved 2020-05-28.
  3. Yunjie, Xu (2006). "Relevance judgment: What do information users consider beyond topicality?". Journal of the American Society for Information Science and Technology. 57 (7): 961–973. doi:10.1002/asi.20361.
  4. F. Diaz, Autocorrelation and Regularization of Query-Based Retrieval Scores. PhD thesis, University of Massachusetts Amherst, Amherst, MA, February 2008, Chapter 3.
  5. Croft, W.Bruce (1980). "A model of cluster searching based on classification". Information Systems. 5 (3): 189–195. doi:10.1016/0306-4379(80)90010-1.
  6. Griffiths, Alan; Luckhurst, H. Claire; Willett, Peter (1986). "Using interdocument similarity information in document retrieval systems" (PDF). Journal of the American Society for Information Science. 37: 3–11. doi:10.1002/(SICI)1097-4571(198601)37:1<3::AID-ASI1>3.0.CO;2-O.
  7. X. Liu and W. B. Croft, “Cluster-based retrieval using language models,” in SIGIR ’04: Proceedings of the 27th annual international conference on Research and development in information retrieval, (New York, NY, USA), pp. 186–193, ACM Press, 2004.
  8. E. M. Voorhees, “The cluster hypothesis revisited,” in SIGIR ’85: Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 188–196, ACM Press, 1985.
  9. S. Preece, A spreading activation network model for information retrieval. PhD thesis, University of Illinois, Urbana-Champaign, 1981.
  10. T. Qin, T.-Y. Liu, X.-D. Zhang, Z. Chen, and W.-Y. Ma, “A study of relevance propagation for web search,” in SIGIR ’05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 408–415, ACM Press, 2005.
  11. A. Singhal and F. Pereira, “Document expansion for speech retrieval,” in SIGIR ’99: Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval, (New York, NY, USA), pp. 34–41, ACM Press, 1999.
  12. Qin, Tao; Liu, Tie-Yan; Zhang, Xu-Dong; Chen, Zheng; Ma, Wei-Ying (2005). "A study of relevance propagation for web search" (PDF). Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. p. 408. doi:10.1145/1076034.1076105. ISBN 1595930345. S2CID 15310025.
  13. Carbonell, Jaime; Goldstein, Jade (1998). "The use of MMR, diversity-based reranking for reordering documents and producing summaries". Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. pp. 335–336. CiteSeerX 10.1.1.50.2490. doi:10.1145/290941.291025. ISBN 978-1581130157. S2CID 6334682.
  14. "Diversity in Document Retrieval (DDR) 2012".
