Ellen Voorhees

Ellen Marie Voorhees (born March 13, 1958) [1] is an American computer scientist known for her work in document retrieval, information retrieval, and natural language processing. She works in the retrieval group at the National Institute of Standards and Technology (NIST). [2]

Education and career

Voorhees was born in Bensalem Township, Pennsylvania, and was the 1976 valedictorian at Bensalem High School. [1] She did her undergraduate studies at Pennsylvania State University, graduating in 1979 with a bachelor's degree in computer science. [1] [3] She then attended Cornell University, where she earned a master's degree and completed her Ph.D. in 1985. [3] Her dissertation, The Effectiveness and Efficiency of Agglomerative Hierarchic Clustering in Document Retrieval, was supervised by Gerard Salton. [1]

Before joining NIST, she was a Senior Member of the Technical Staff at Siemens Corporate Research in Princeton, NJ, where her work on intelligent agents applied to information access resulted in numerous patents. [3] She is the author of hundreds of technical papers.

Recognition

Voorhees was elected an ACM Fellow in 2018 for "contributions in evaluation of information retrieval, question answering, and other language technologies". She is a member of the Association for the Advancement of Artificial Intelligence and the Association for Computational Linguistics (ACL), and has been elected a fellow of the Washington Academy of Sciences. She has published numerous articles on information retrieval techniques and evaluation methodologies and serves on the review boards of several journals and conferences. [4]

In 2023 Voorhees was awarded an Honorary Doctor of Science Degree from the University of Glasgow in recognition of her body of work in the evaluation of information retrieval, question answering, and other language technologies. [5]

In 2024 Voorhees received the Gerard Salton Award, a lifetime achievement award given by ACM's Special Interest Group on Information Retrieval (SIGIR).

Related Research Articles

Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

In information science and information retrieval, relevance denotes how well a retrieved document or set of documents meets the information need of the user. Relevance may include concerns such as timeliness, authority or novelty of the result.

Gerard A. "Gerry" Salton was a professor of Computer Science at Cornell University. Salton was perhaps the leading computer scientist working in the field of information retrieval during his time, and "the father of Information Retrieval". His group at Cornell developed the SMART Information Retrieval System, which he initiated when he was at Harvard. It was the very first system to use the now popular vector space model for information retrieval.

The SMART Information Retrieval System is an information retrieval system developed at Cornell University in the 1960s. Many important concepts in information retrieval were developed as part of research on the SMART system, including the vector space model, relevance feedback, and Rocchio classification.

C. J. van Rijsbergen: Dutch computer scientist

C. J. "Keith" van Rijsbergen FREng is a professor of computer science at the University of Glasgow, where he founded the Glasgow Information Retrieval Group. He is one of the founders of modern Information Retrieval and the author of the seminal monograph Information Retrieval and of the textbook The Geometry of Information Retrieval.

Text Retrieval Conference: Meetings for information retrieval research

The Text REtrieval Conference (TREC) is an ongoing series of workshops focusing on a list of different information retrieval (IR) research areas, or tracks. It is co-sponsored by the National Institute of Standards and Technology (NIST) and the Intelligence Advanced Research Projects Activity, and began in 1992 as part of the TIPSTER Text program. Its purpose is to support and encourage research within the information retrieval community by providing the infrastructure necessary for large-scale evaluation of text retrieval methodologies and to increase the speed of lab-to-product transfer of technology.

The Gerard Salton Award is presented by the Association for Computing Machinery (ACM) Special Interest Group on Information Retrieval (SIGIR) every three years to an individual who has made "significant, sustained and continuing contributions to research in information retrieval". SIGIR also co-sponsors the Vannevar Bush Award, for the best paper at the Joint Conference on Digital Libraries.

Susan Dumais: American computer scientist

Susan Dumais is an American computer scientist who is a leader in the field of information retrieval, and has been a significant contributor to Microsoft's search technologies. According to Mary Jane Irwin, who heads the Athena Lecture awards committee, "Her sustained contributions have shaped the thinking and direction of human-computer interaction and information retrieval."

In computer science, an inverted index is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of documents. The purpose of an inverted index is to allow fast full-text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure used in document retrieval systems, used on a large scale for example in search engines. Additionally, several significant general-purpose mainframe-based database management systems have used inverted list architectures, including ADABAS, DATACOM/DB, and Model 204.
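The mapping described above can be illustrated with a minimal Python sketch. It assumes whitespace tokenization, lowercasing, and integer document ids; the function names `build_inverted_index` and `search` and the sample documents are hypothetical and are not taken from any particular retrieval system.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: terms are separated by whitespace; real systems
        # usually apply a more careful tokenizer and normalization.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every query term (boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical three-document collection used only for illustration.
docs = {
    1: "information retrieval evaluation",
    2: "question answering evaluation",
    3: "document retrieval with clustering",
}
index = build_inverted_index(docs)
```

Here the indexing work is done once, when documents are added, so that a query such as `search(index, "retrieval evaluation")` only intersects two short posting lists instead of scanning every document.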

Paul B. Kantor is an American information scientist. He is Distinguished Professor Emeritus of Information Science at Rutgers University in New Jersey, and an Honorary Research Associate in Industrial and Systems Engineering at the University of Wisconsin, Madison.

Nicholas J. Belkin is a professor at the School of Communication and Information at Rutgers University. Among the main themes of his research are digital libraries; information-seeking behaviors; and interaction between humans and information retrieval systems. Belkin is best known for his work on human-centered Information Retrieval and the hypothesis of Anomalous State of Knowledge (ASK). Belkin realized that in many cases, users of search systems are unable to precisely formulate what they need. They miss some vital knowledge to formulate their queries. In such cases it is more suitable to attempt to describe a user's anomalous state of knowledge than to ask the user to specify her/his need as a request to the system.

Amit Singhal: American computer engineer

Amitabh Kumar "Amit" Singhal is a former senior vice president at Google Inc., having been a Google Fellow and the head of Google's Search team for 15 years.

Karen Spärck Jones: British computer scientist (1935–2007)

Karen Ida Boalth Spärck Jones was a self-taught programmer and a pioneering British computer scientist responsible for the concept of inverse document frequency (IDF), a technology that underlies most modern search engines. She was an advocate for women in computer science, her slogan being, "Computing is too important to be left to men." In 2019, The New York Times published her belated obituary in its series Overlooked, calling her "a pioneer of computer science for work combining statistics and linguistics, and an advocate for women in the field." From 2008, to recognize her achievements in the fields of information retrieval (IR) and natural language processing (NLP), the Karen Spärck Jones Award is awarded to a new recipient with outstanding research in one or both of her fields.

Ruzena Bajcsy: American computer scientist

Ruzena Bajcsy is an American engineer and computer scientist who specializes in robotics. She is professor of electrical engineering and computer sciences at the University of California, Berkeley, where she is also director emerita of CITRIS.

W. Bruce Croft is a distinguished professor of computer science at the University of Massachusetts Amherst whose work focuses on information retrieval. He is the founder of the Center for Intelligent Information Retrieval and served as the editor-in-chief of ACM Transactions on Information Systems from 1995 to 2002. He was also a member of the National Research Council Computer Science and Telecommunications Board from 2000 to 2003. Since 2015, he has been the Dean of the College of Information and Computer Sciences at the University of Massachusetts Amherst. He was Chair of the UMass Amherst Computer Science Department from 2001 to 2007.

The Binary Independence Model (BIM) in computing and information science is a probabilistic information retrieval technique. The model makes some simple assumptions to make the estimation of the document/query similarity probability feasible.

Norbert Fuhr is a professor of computer science and the leader of the Duisburg Information Engineering Group based at the University of Duisburg-Essen, Germany.

ChengXiang Zhai is a computer scientist. He is a Donald Biggar Willett Professor in Engineering in the Department of Computer Science at the University of Illinois at Urbana-Champaign.

Diffeo, Inc.: American knowledge discovery software company

Diffeo, Inc., is a software company that developed a collaborative intelligence text mining product for defense, intelligence and financial services customers.

Tefko Saracevic is professor emeritus at the School of Communication and Information at Rutgers University. He is a contributor to the theoretical concept of relevance and a prominent information scientist. He has been President of the Association for Information Science and Technology. Born November 24, 1930, in Zagreb, Croatia, he came to the United States in 1959 and was naturalized in 1964.

References

  1. Voorhees, Ellen M. (1985), The Effectiveness and Efficiency of Agglomerative Hierarchic Clustering in Document Retrieval, Cornell University via eCommons: Cornell's digital repository
  2. Ellen M. Voorhees, National Institute of Standards and Technology, retrieved 2018-12-06
  3. "Dr. Ellen M. Voorhees", ACM Distinguished Speakers, Association for Computing Machinery, retrieved 2018-12-06
  4. 2018 ACM Fellows Honored for Pivotal Achievements that Underpin the Digital Age, Association for Computing Machinery, December 5, 2018
  5. "Ellen Voorhees Receives Honorary Doctor of Science Degree from the University of Glasgow". NIST. 2023-06-14.