Text Retrieval Conference

Text REtrieval Conference
...to encourage research in information retrieval from large text collections.
Abbreviation: TREC
Discipline: Information retrieval
Publisher: NIST
History: since 1992
Frequency: Annual
Website: trec.nist.gov

The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a range of information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity (part of the Office of the Director of National Intelligence), and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies, and to speed the transfer of technology from research labs to products.


TREC's evaluation protocols have improved many search technologies. A 2010 study estimated that "without TREC, U.S. Internet users would have spent up to 3.15 billion additional hours using web search engines between 1999 and 2009." [1] Hal Varian, the Chief Economist at Google, wrote that "The TREC data revitalized research on information retrieval. Having a standard, widely available, and carefully constructed set of data laid the groundwork for further innovation in this field." [2]

Each track has a challenge wherein NIST provides participating groups with data sets and test problems. Depending on the track, test problems might be questions, topics, or target extractable features. Uniform scoring is performed so that the systems can be fairly evaluated. After evaluation of the results, a workshop provides a place for participants to bring together thoughts and ideas and to present current and future research work.

The Text REtrieval Conference started in 1992, funded by the US Defense Advanced Research Projects Agency (DARPA) and run by NIST. Its purpose was to support research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies.

Goals

TREC is overseen by a program committee consisting of representatives from government, industry, and academia. For each TREC, NIST provides a set of documents and questions. Participants run their own retrieval systems on the data and return to NIST a list of the top-ranked retrieved documents. NIST pools the individual results, judges the retrieved documents for correctness, and evaluates the results. The TREC cycle ends with a workshop that is a forum for participants to share their experiences.

Relevance judgments in TREC

TREC defines relevance as: "If you were writing a report on the subject of the topic and would use the information contained in the document in the report, then the document is relevant." [3] Most TREC retrieval tasks use binary relevance: a document is either relevant or not relevant. Some TREC tasks use graded relevance, capturing multiple degrees of relevance. Most TREC collections are too large for complete relevance assessment; for these collections it is impossible to calculate the absolute recall for each query. To decide which documents to assess, TREC usually uses a method called pooling. In this method, the top-ranked n documents from each contributing run are aggregated, and the resulting document set is judged completely.
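The pooling method can be illustrated with a minimal Python sketch. It is not NIST's actual tooling: the function names `build_pool` and `recall_against_pool`, the run names, the document identifiers, and the relevance judgments are all hypothetical. The sketch forms the pool as the union of the top n documents from each run, then computes recall relative to the relevant documents found in the pool, which is only an estimate because relevant documents outside the pool are never judged.

```python
from typing import Dict, List, Set


def build_pool(runs: Dict[str, List[str]], depth: int) -> Set[str]:
    """Return the union of the top-`depth` documents of every run (one topic)."""
    pool: Set[str] = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def recall_against_pool(ranking: List[str], relevant: Set[str]) -> float:
    """Fraction of the judged-relevant documents that a ranking retrieved.

    This is recall relative to the pool, not absolute recall: relevant
    documents that no run placed in its top n remain unjudged and uncounted.
    """
    if not relevant:
        return 0.0
    return len(relevant.intersection(ranking)) / len(relevant)


# Hypothetical ranked lists submitted by three systems for a single topic.
runs = {
    "run_a": ["d1", "d2", "d3", "d4"],
    "run_b": ["d2", "d5", "d1", "d6"],
    "run_c": ["d7", "d1", "d8", "d2"],
}

pool = build_pool(runs, depth=2)

# Hypothetical assessor judgments: every pooled document is judged,
# and here d1 and d5 are the ones marked relevant.
judged_relevant = {"d1", "d5"}

print(sorted(pool))
for name, ranking in runs.items():
    print(name, recall_against_pool(ranking, judged_relevant))
```

With a pool depth of 2, only four of the eight distinct documents are sent to assessors. Documents outside the pool are conventionally treated as not relevant, which is why recall computed this way is relative to the pool.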

Various TRECs

In 1992, TREC-1 was held at NIST. The first conference attracted 28 groups of researchers from academia and industry. It demonstrated a wide range of approaches to the retrieval of text from large document collections. TREC-1 also showed that automatic construction of queries from natural-language query statements appeared to work. Techniques based on natural language processing were neither better nor worse than those based on vector or probabilistic approaches.

TREC-2 took place in August 1993, with 31 groups of researchers participating. Two types of retrieval were examined: retrieval using an 'ad hoc' query and retrieval using a 'routing' query.

In TREC-3, a small group of experiments worked with a Spanish-language collection, and others dealt with interactive query formulation in multiple databases.

In TREC-4, the topics were made even shorter to investigate the problems with very short user statements.

TREC-5 included both short and long versions of the topics, with the goal of investigating more deeply which types of techniques work well on topics of various lengths.

In TREC-6, three new tracks were introduced: speech, cross-language, and high-precision information retrieval. The goal of cross-language information retrieval is to facilitate research on systems that are able to retrieve relevant documents regardless of the language of the source document.

TREC-7 contained seven tracks, two of which were new: the query track and the very large corpus track. The goal of the query track was to create a large query collection.

TREC-8 contained seven tracks, two of which were new: the question answering (QA) track and the web track. The objective of the QA track is to explore the possibilities of providing answers to specific natural-language queries.

TREC-9 included seven tracks.

In TREC-10, a video track was introduced, designed to promote research in content-based retrieval from digital video.

In TREC-11, the novelty track was introduced. The goal of the novelty track is to investigate systems' abilities to locate relevant and new information within the ranked set of documents returned by a traditional document retrieval system.

TREC-12, held in 2003, added three new tracks: the genome track, the robust retrieval track, and HARD (Highly Accurate Retrieval from Documents). [4]

Tracks

Current tracks

New tracks are added as new research needs are identified; this list is current for TREC 2018. [5]

Past tracks

In 1997, a Japanese counterpart of TREC was launched (first workshop in 1999), called NTCIR (NII Test Collection for IR Systems), and in 2000, CLEF, a European counterpart focused specifically on the study of cross-language information retrieval, was launched. The Forum for Information Retrieval Evaluation (FIRE) started in 2008 with the aim of building a South Asian counterpart to TREC, CLEF, and NTCIR.

Conference contributions to search effectiveness

NIST claims that within the first six years of the workshops, the effectiveness of retrieval systems approximately doubled. [7] The conference was also the first to hold large-scale evaluations of non-English documents, speech, video and retrieval across languages. Additionally, the challenges have inspired a large body of publications. Technology first developed in TREC is now included in many of the world's commercial search engines. An independent report by RTI International found that "about one-third of the improvement in web search engines from 1999 to 2009 is attributable to TREC. Those enhancements likely saved up to 3 billion hours of time using web search engines. ... Additionally, the report showed that for every $1 that NIST and its partners invested in TREC, at least $3.35 to $5.07 in benefits were accrued to U.S. information retrieval researchers in both the private sector and academia." [8] [9]

While one study suggests that the state of the art for ad hoc search did not advance substantially in the decade preceding 2009, [10] that finding applies only to search for topically relevant documents in small news and web collections of a few gigabytes. There have been advances in other types of ad hoc search. For example, test collections created for known-item web search showed improvements from the use of anchor text, title weighting, and URL length, techniques that were not useful on the older ad hoc test collections. In 2009, a new billion-page web collection was introduced, and spam filtering was found to be a useful technique for ad hoc web search, unlike in past test collections.

The test collections developed at TREC are useful not just for (potentially) helping researchers advance the state of the art, but also for allowing developers of new (commercial) retrieval products to evaluate their effectiveness on standard tests. In the past decade, TREC has created new tests for enterprise e-mail search, genomics search, spam filtering, e-Discovery, and several other retrieval domains.

TREC systems often provide a baseline for further research. Examples include:

Participation

The conference is made up of a varied, international group of researchers and developers. [15] [16] [17] In 2003, there were 93 groups from both academia and industry from 22 countries participating.

References

  1. Brent R. Rowe; Dallas W. Wood; Albert N. Link; Diglio A. Simoni (July 2010). "Economic Impact Assessment of NIST's Text REtrieval Conference (TREC) Program" (PDF). RTI International.
  2. Hal Varian (March 4, 2008). "Why data matters".
  3. "Data - English Relevance Judgements". National Institute of Standards and Technology. Retrieved 18 September 2023.
  4. Chowdhury, G. G. (2003). Introduction to modern information retrieval. London: Facet Publishing. pp. 269–279. ISBN 978-1856044806.
  5. "TREC Tracks". trec.nist.gov. Archived from the original on March 31, 2019. Retrieved 2024-07-19.
  6. "Knowledge Base Acceleration Track". NIST.gov. 2014-06-30. Retrieved 2020-11-04.
  7. From TREC homepage: "... effectiveness approximately doubled in the first six years of TREC"
  8. "NIST Investment Significantly Improved Search Engines". Rti.org. Archived from the original on 2011-11-18. Retrieved 2012-01-19.
  9. "Planning Report 10-1: Economic Impact Assessment of NIST's Text REtrieval Conference (TREC) Program" (PDF). National Institute of Standards and Technology. December 2010.
  10. Timothy G. Armstrong, Alistair Moffat, William Webber, Justin Zobel. Improvements that don't add up: ad hoc retrieval results since 1998. CIKM 2009. ACM.
  11. Varian, Hal (March 4, 2009). "Why data matters". Google via Blogspot.
  12. The 451 Group: Standards in e-Discovery -- walking the walk
  13. IBM and Jeopardy! Relive History with Encore Presentation of Jeopardy!: The IBM Challenge
  14. Ferrucci, David; Brown, Eric; Chu-Carroll, Jennifer; Fan, James; Gondek, David; Kalyanpur, Aditya A.; Lally, Adam; Murdock, J. William; Nyberg, Eric. "Building Watson: An Overview of the DeepQA Project" (PDF). Association for the Advancement of Artificial Intelligence. Archived from the original (PDF) on December 15, 2011.
  15. "Participants - IRF Wiki". Wiki.ir-facility.org. 2009-12-01. Archived from the original on 2012-02-23. Retrieved 2012-01-19.
  16. Oard, Douglas W.; Hedin, Bruce; Tomlinson, Stephen; Baron, Jason R. "Overview of the TREC 2008 Legal Track" (PDF). National Institute of Standards and Technology.
  17. "Text REtrieval Conference (TREC) TREC 2008 Million Query Track Results". Trec.nist.gov. Retrieved 2012-01-19.