Cognitive models of information retrieval

Cognitive models of information retrieval draw on several fields, including cognitive science, human-computer interaction, information retrieval, and library science. They describe the relationship between a person's cognitive model of the information sought and the organization of that information in an information system. These models attempt to understand how a person searches for information so that a database, and the means of searching it, can be designed to best serve the user. Information retrieval may involve multiple tasks and cognitive problems, particularly because different people use different methods to find information and expect it to be in different forms. The aim of a cognitive model may be as apparently prosaic as improving search results, or as complex as creating a database that can be queried in natural language.

Berrypicking

One way of understanding how users search for information has been described by Marcia Bates [1] of the University of California, Los Angeles. Bates argues that "berrypicking" reflects how users actually search for information better than previous models of information retrieval do. This may be because previous models were strictly linear and did not incorporate cognitive questions. One typical model, for instance, is a simple linear match between a query and a document. Bates notes that simple modifications to this process have already been proposed. For instance, Salton has argued that user feedback may help improve the search results. [2]
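
The difference between a one-shot linear match and a match adjusted by user feedback can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Bates's or Salton's actual method: the term-count score, the `expand_query` helper, and the toy documents are all invented for the example.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def score(query_terms, doc_text):
    """Toy linear match: count document tokens that equal a query term."""
    counts = Counter(tokenize(doc_text))
    return sum(counts[term] for term in set(query_terms))

def rank(query_terms, docs):
    """Order document ids by descending score; ties are broken by id."""
    return sorted(docs, key=lambda doc_id: (-score(query_terms, docs[doc_id]), doc_id))

def expand_query(query_terms, relevant_text, k=2):
    """Hypothetical feedback step: add the k most frequent new terms
    from a document the user marked as relevant."""
    counts = Counter(t for t in tokenize(relevant_text) if t not in query_terms)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ordered[:k]]

# Invented toy collection, for illustration only.
DOCS = {
    "d1": "jaguar speed jaguar habitat rainforest",
    "d2": "jaguar car engine speed",
    "d3": "rainforest habitat conservation habitat",
}

first = rank(["jaguar"], DOCS)                   # one-shot linear match
expanded = expand_query(["jaguar"], DOCS["d1"])  # user marks d1 as relevant
second = rank(expanded, DOCS)                    # match after feedback

print(first)     # ['d1', 'd2', 'd3']
print(expanded)  # ['jaguar', 'habitat', 'rainforest']
print(second)    # ['d1', 'd3', 'd2']
```

In this toy run, marking `d1` as relevant pulls in the terms "habitat" and "rainforest", so the document about habitat conservation rises above the one about cars even though it never mentions the original query term.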

Bates argues that searches evolve and proceed bit by bit. That is to say, a person constantly changes their search terms in response to the results returned by the information retrieval system. A simple linear model therefore does not capture the nature of information retrieval, because the very act of searching produces feedback that causes the user to modify their cognitive model of the information being sought. In addition, the information itself is gathered bit by bit. Bates gives a number of examples: a user may look through footnotes and follow up the sources cited there, or may scan recent journal articles on the topic. In each case, the user's question may change, and thus the search evolves.
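
The evolving, bit-by-bit character of such a search can be sketched as a loop in which each round of results both adds to what the user has gathered and reshapes the next query. The function and parameter names (`berrypick`, `search`, `pick`, `revise`) and the toy index below are hypothetical and chosen only for illustration; they are not taken from Bates's paper.

```python
def berrypick(start_query, search, pick, revise, max_steps=5):
    """Run an evolving search: at each step, search, keep some results,
    and let the results reshape the next query."""
    basket = []             # items gathered bit by bit over the session
    trail = [start_query]   # record of how the query evolved
    query = start_query
    for _ in range(max_steps):
        results = search(query)
        for item in pick(results):
            if item not in basket:
                basket.append(item)
        next_query = revise(query, results)
        if next_query is None or next_query == query:
            break           # the user stops revising, so the session ends
        query = next_query
        trail.append(query)
    return basket, trail

# Invented stand-ins for a retrieval system and a user's decisions.
INDEX = {
    "berries": ["doc-a", "doc-b"],
    "wild berries": ["doc-b", "doc-c"],
    "foraging": ["doc-d"],
}
FOLLOW_UP = {"berries": "wild berries", "wild berries": "foraging"}

basket, trail = berrypick(
    "berries",
    search=lambda q: INDEX.get(q, []),
    pick=lambda results: results,            # this toy user keeps everything
    revise=lambda q, results: FOLLOW_UP.get(q),
)

print(basket)  # ['doc-a', 'doc-b', 'doc-c', 'doc-d']
print(trail)   # ['berries', 'wild berries', 'foraging']
```

No single query in the trail would have returned the full basket; the final collection exists only as the sum of selections made along the way, which is the point of the berrypicking metaphor.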

Researchers in human-computer interaction and cognitive science study how people explore information when interacting with the World Wide Web. This kind of search, sometimes called exploratory search, focuses on how people iteratively refine their search activities and update their internal representations of the search problem. [3] Existing search engines were designed on the basis of traditional library science theories concerned with retrieving basic facts and simple information through an interface. Exploratory information retrieval, however, often involves ill-defined search goals and evolving criteria for judging relevance. The interactions between humans and the information system therefore involve more cognitive activity, and systems that support exploratory search need to take into account the cognitive complexities of this dynamic retrieval process.

Natural language searching

Another way in which cognitive models may help in information retrieval is with natural language searching. For instance, How Stuff Works imagines a world in which, rather than searching for local movies, reading the reviews, then searching for local Mexican restaurants and reading their reviews, you simply type "I want to see a funny movie and then eat at a good Mexican restaurant. What are my options?" into your browser and receive a useful and relevant response. [4] Although such a thing is not possible today, it represents a holy grail for researchers into cognitive models of information retrieval. The goal is to design information retrieval systems that can respond to natural language searches. This would require a fuller understanding of how people structure queries.

Notes

  1. Marcia Bates (1989). "The Design of Browsing and Berrypicking Techniques for the Online Search Interface." https://pages.gseis.ucla.edu/faculty/bates/berrypicking.html
  2. Gerard Salton (1968). Automatic Information Organization and Retrieval (Computer Science). Dubuque, Iowa: McGraw-Hill Inc.
  3. Qu, Yan & Furnas, George. "Model-driven formative evaluation of exploratory search: A study under a sensemaking framework"
  4. Strickland, J. (n.d.). "How Web 3.0 Will Work." HowStuffWorks. Retrieved November 4, 2009, from http://computer.howstuffworks.com/web-30.htm
