Faceted search augments lexical search with a faceted navigation system, allowing users to narrow results by applying filters based on a faceted classification of the items. [1] It is a parametric search technique. [2] A faceted classification system classifies each information element along multiple explicit dimensions, called facets, enabling the classifications to be accessed and ordered in multiple ways rather than in a single, predetermined, taxonomic order. [1]
Facets correspond to properties of the information elements. They are often derived by analysis of the text of an item using entity extraction techniques or from pre-existing fields in a database such as author, descriptor, language, and format. Thus, existing web-pages, product descriptions or online collections of articles can be augmented with navigational facets.
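The following minimal Python sketch illustrates the idea under stated assumptions: facets are taken from pre-existing fields of each record, a keyword search produces a result set, and the user narrows it by selecting facet values. The catalog data and the function names (keyword_search, facet_counts, apply_filters) are hypothetical and do not come from any particular search product.

```python
from collections import Counter

# Hypothetical catalog: each record carries pre-existing fields usable as facets.
ITEMS = [
    {"title": "Intro to Search", "author": "Lee", "language": "English", "format": "book"},
    {"title": "Search Patterns", "author": "Kim", "language": "English", "format": "ebook"},
    {"title": "Recherche facile", "author": "Lee", "language": "French", "format": "book"},
    {"title": "Facet Analysis", "author": "Roy", "language": "English", "format": "book"},
]


def keyword_search(items, query):
    """Naive lexical search: case-insensitive substring match on the title."""
    q = query.lower()
    return [item for item in items if q in item["title"].lower()]


def facet_counts(items, facets):
    """Count how many items carry each value of each facet."""
    return {f: Counter(item[f] for item in items if f in item) for f in facets}


def apply_filters(items, selected):
    """Keep only items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


# Search first, show the available facet values, then narrow by one of them.
results = keyword_search(ITEMS, "search")
counts = facet_counts(results, ["author", "language", "format"])
narrowed = apply_filters(results, {"format": "book"})
```

In an interface, the per-value counts would typically be shown next to each filter so that users can see how many results remain before they apply it.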
Faceted search interfaces were first developed in the academic world by Ben Shneiderman, Steven Pollitt, Marti Hearst, and Gary Marchionini in the 1990s and 2000s. [3] [4] [5] [6] The most well-known of these efforts was the Flamenco research project at the University of California, Berkeley, led by Marti Hearst. [7] Concurrently, commercial faceted search systems were being developed, notably Endeca and Spotfire.
Within the academic community, faceted search has attracted interest primarily among library and information science researchers, and to some extent among computer science researchers specializing in information retrieval. [8]
Faceted search has become a popular technique in commercial search applications, particularly for online retailers and libraries. An increasing number of enterprise search vendors provide software for implementing faceted search applications.
Online retail catalogs pioneered the earliest applications of faceted search, reflecting both the faceted nature of product data (most products have a type, brand, price, etc.) and the ready availability of the data in retailers' existing information systems. In the early 2000s retailers started using faceted search, in part due to published studies that evaluated user search experience on popular sites. [9]
As of 2014, 40% of the 50 largest US-based online retailers had implemented faceted search. [10] Examples include the filtering options that appear in the left column on amazon.com or Google Shopping after a keyword search has been performed.
In 1933, the noted librarian Ranganathan proposed a faceted classification system for library materials, known as colon classification. In the pre-computer era, he did not succeed in replacing the pre-coordinated Dewey Decimal Classification system. [11]
Modern online library catalogs, also known as online public access catalogs (OPAC), have increasingly adopted faceted search interfaces. Noted examples include the North Carolina State University library catalog (part of the Triangle Research Libraries Network) and the OCLC Open WorldCat system. The CiteSeerX project [12] at the Pennsylvania State University allows faceted search for academic documents and continues to expand into other facets such as table search.
Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.
A library classification is a system used within a library to organize materials, including books, sound and video recordings, electronic materials, etc., both on shelves and in catalogs and indexes. Each item is typically assigned a call number, which identifies the location of the item within the system. Materials can be arranged by many different factors, typically in either a hierarchical tree structure based on the subject or using a faceted classification system, which allows the assignment of multiple classifications to an object, enabling the classifications to be ordered in many ways.
A library catalog is a register of all bibliographic items found in a library or group of libraries, such as a network of libraries at several locations. A catalog for a group of libraries is also called a union catalog. A bibliographic item can be any information entity that is considered library material, or a group of library materials, or linked from the catalog as far as it is relevant to the catalog and to the users (patrons) of the library.
The online public access catalog (OPAC), now frequently synonymous with library catalog, is an online database of materials held by a library or group of libraries. Online catalogs have largely replaced the analog card catalogs previously used in libraries.
A faceted classification is a classification scheme used in organizing knowledge into a systematic order. A faceted classification uses semantic categories, either general or subject-specific, that are combined to create the full classification entry. Many library classification systems use a combination of a fixed, enumerative taxonomy of concepts with subordinate facets that further refine the topic.
In information systems, a tag is a keyword or term assigned to a piece of information. This kind of metadata helps describe an item and allows it to be found again by browsing or searching. Tags are generally chosen informally and personally by the item's creator or by its viewer, depending on the system, although they may also be chosen from a controlled vocabulary.
Controlled vocabularies provide a way to organize knowledge for subsequent retrieval. They are used in subject indexing schemes, subject headings, thesauri, taxonomies and other knowledge organization systems. Controlled vocabulary schemes mandate the use of predefined, preferred terms that have been preselected by the designers of the schemes, in contrast to natural language vocabularies, which have no such restriction.
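As a rough illustration, the sketch below maps free-text index terms onto predefined preferred terms and sets aside anything outside the scheme. The vocabulary and the names PREFERRED_TERMS, to_preferred and index_terms are invented for this example and are not taken from any real subject-heading list.

```python
# Hypothetical controlled vocabulary: each entry term maps to one preferred term.
PREFERRED_TERMS = {
    "automobiles": "Automobiles",
    "cars": "Automobiles",
    "motorcars": "Automobiles",
    "motion pictures": "Motion pictures",
    "films": "Motion pictures",
    "movies": "Motion pictures",
}


def to_preferred(term):
    """Return the preferred term for an entry term, or None if it is not in the scheme."""
    return PREFERRED_TERMS.get(term.strip().lower())


def index_terms(raw_terms):
    """Split raw terms into de-duplicated preferred terms and rejected terms."""
    accepted, rejected = [], []
    for term in raw_terms:
        preferred = to_preferred(term)
        if preferred is None:
            rejected.append(term)       # not in the vocabulary
        elif preferred not in accepted:
            accepted.append(preferred)  # keep first-seen order, no duplicates
    return accepted, rejected


accepted, rejected = index_terms(["Cars", "movies", "motorcars", "zeppelins"])
```

A natural language vocabulary, by contrast, would accept all four input terms as they stand.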
Library technical services are the ongoing maintenance activities of a library's collection, including the three broad areas of collection development, cataloging, and processing. Technical services are the infrastructure that enable the user's experience of many library services and are typically performed "behind the scenes."
Knowledge organization (KO), organization of knowledge, organization of information, or information organization is an intellectual discipline concerned with activities such as document description, indexing, and classification that serve to provide systems of representation and order for knowledge and information objects. According to The Organization of Information by Joudrey and Taylor, information organization:
examines the activities carried out and tools used by people who work in places that accumulate information resources for the use of humankind, both immediately and for posterity. It discusses the processes that are in place to make resources findable, whether someone is searching for a single known item or is browsing through hundreds of resources just hoping to discover something useful. Information organization supports a myriad of information-seeking scenarios.
The Human–Computer Interaction Lab (HCIL) at the University of Maryland, College Park is an academic research center specializing in the field of human-computer interaction (HCI). Founded in 1983 by Ben Shneiderman, it is one of the oldest HCI labs of its kind. The HCIL conducts research on the design, implementation, and evaluation of computer interface technologies. Additional research focuses on the development of user interfaces and design methods. Primary activities of the HCIL include collaborative research, publication and the sponsorship of open houses, workshops and annual symposiums.
Subject indexing is the act of describing or classifying a document by index terms, keywords, or other symbols in order to indicate what different documents are about, to summarize their contents or to increase findability. In other words, it is about identifying and describing the subject of documents. Indexes are constructed, separately, on three distinct levels: terms in a document such as a book; objects in a collection such as a library; and documents within a field of knowledge.
Audio mining is a technique by which the content of an audio signal can be automatically analyzed and searched. It is most commonly used in the field of automatic speech recognition, where the analysis tries to identify any speech within the audio. The term ‘audio mining’ is sometimes used interchangeably with audio indexing, phonetic searching, phonetic indexing, speech indexing, audio analytics, speech analytics, word spotting, and information retrieval. Audio indexing, however, is mostly used to describe the pre-process of audio mining, in which the audio file is broken down into a searchable index of words.
Human–computer information retrieval (HCIR) is the study and engineering of information retrieval techniques that bring human intelligence into the search process. It combines the fields of human-computer interaction (HCI) and information retrieval (IR) and creates systems that improve search by taking into account the human context, or through a multi-step search process that provides the opportunity for human feedback.
Folksonomy is a classification system in which end users apply public tags to online items, typically to make those items easier for themselves or others to find later. Over time, this can give rise to a classification system based on those tags and how often they are applied or searched for, in contrast to a taxonomic classification designed by the owners of the content and specified when it is published. This practice is also known as collaborative tagging, social classification, social indexing, and social tagging. Folksonomy was originally "the result of personal free tagging of information [...] for one's own retrieval", but online sharing and interaction expanded it into collaborative forms. Social tagging is the application of tags in an open online environment where the tags of other users are available to others. Collaborative tagging is tagging performed by a group of users. This type of folksonomy is commonly used in cooperative and collaborative projects such as research, content repositories, and social bookmarking.
Martha Alice Hearst is a professor in the School of Information at the University of California, Berkeley. She did early work in corpus-based computational linguistics, including some of the first work in automating sentiment analysis, and word sense disambiguation. She invented an algorithm that became known as "Hearst patterns" which applies lexico-syntactic patterns to recognize hyponymy (ISA) relations with high accuracy in large text collections, including an early application of it to WordNet; this algorithm is widely used in commercial text mining applications including ontology learning. Hearst also developed early work in automatic segmentation of text into topical discourse boundaries, inventing a now well-known approach called TextTiling.
In the context of information retrieval, a thesaurus is a form of controlled vocabulary that seeks to dictate semantic manifestations of metadata in the indexing of content objects. A thesaurus serves to minimise semantic ambiguity by ensuring uniformity and consistency in the storage and retrieval of the manifestations of content objects. ANSI/NISO Z39.19-2005 defines a content object as "any item that is to be described for inclusion in an information retrieval system, website, or other source of information". The thesaurus aids the assignment of preferred terms to convey semantic metadata associated with the content object.
Gary Marchionini is an American information scientist and educator at the University of North Carolina at Chapel Hill (1998–present).
Pauline Atherton Cochrane was an American librarian and one of the most highly cited authors in the field of library and information sciences. She is considered to have been a leading researcher in the campaign to redesign catalogues and indexes to provide improved online subject access in library and information services as well as "a leading teacher and theorist in cataloging, indexing, and information access."
In our first study on ease of search experience for users, we concluded that '27% of task failures were a result of not being able to locate a suitable item on the site, even though all of our tasks were designed so there was always at least one item available.'