Madeleine Ashcraft Bates (born c. 1948) is a researcher in natural language processing who worked at BBN Technologies in Cambridge, Massachusetts from the early 1970s to the late 1990s. [1] She was president of the Association for Computational Linguistics in 1985, [2] and co-editor of the book Challenges in Natural Language Processing (1993). [3]
Bates was a student at Allegheny College before transferring to Carnegie Mellon University, [4] where she majored in mathematics, graduating in 1968. She completed her Ph.D. in applied mathematics at Harvard University in 1975, [5] working there with Bill Woods on augmented transition networks. [6]
While a student at Harvard, she began working part-time at BBN in 1971. After completing her Ph.D., she was an assistant professor at Boston University for three years before becoming a full-time researcher at BBN. [5]
Bates married chemist Alan Hunt Bates in summer 1968; [4] he later became a professor at the University of Massachusetts Dartmouth. Her mother, Madeleine DeMuth Ashcraft (died 1990), was a long-term sufferer of Huntington's disease, [7] and Bates has been an activist for the treatment of Huntington's disease, serving as president of the Massachusetts Chapter of the Committee to Combat Huntington's Disease. [8]
Computational linguistics is an interdisciplinary field concerned with the computational modelling of natural language, as well as the study of appropriate computational approaches to linguistic questions. In general, computational linguistics draws upon linguistics, computer science, artificial intelligence, mathematics, logic, philosophy, cognitive science, cognitive psychology, psycholinguistics, anthropology and neuroscience, among others.
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
Word-sense disambiguation (WSD) is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious and automatic, but it can often come to conscious attention when ambiguity impairs clarity of communication, given the pervasive polysemy in natural language. In computational linguistics, it is an open problem that affects other language-processing tasks, such as discourse analysis, improving the relevance of search engines, anaphora resolution, coherence, and inference.
Sentence Plan Language (SPL) is an abstract notation representing the semantics of a sentence in natural language. In a classical Natural Language Generation (NLG) workflow, an initial text plan is transformed by a sentence planner (generator) component into a sequence of sentence plans modelled in SPL. A surface generator can then be used to transform the SPL notation into natural language sentences.
In linguistics, a treebank is a parsed text corpus that annotates syntactic or semantic sentence structure. The construction of parsed corpora in the early 1990s revolutionized computational linguistics, which benefitted from large-scale empirical data.
Frederick Jelinek was a Czech-American researcher in information theory, automatic speech recognition, and natural language processing. He is well known for his oft-quoted statement, "Every time I fire a linguist, the performance of the speech recognizer goes up".
Eric Brill is a computer scientist specializing in natural language processing. He created the Brill tagger, a supervised part-of-speech tagger. Another of Brill's research papers introduced a machine learning technique now known as transformation-based learning.
In linguistics, statistical semantics applies the methods of statistics to the problem of determining the meaning of words or phrases, ideally through unsupervised learning, to a degree of precision at least sufficient for the purpose of information retrieval.
A phrasal template is a phrase-long collocation that contains one or several empty slots which may be filled by words to produce individual phrases.
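The slot-filling idea can be illustrated with a minimal Python sketch. The function name `fill_template`, the slot names, and the example template are hypothetical choices made for illustration; they are not taken from any particular system.

```python
def fill_template(template: str, fillers: dict[str, str]) -> str:
    """Fill each named {slot} in a phrasal template with a word.

    Illustrative helper only: slots are written in Python's
    str.format syntax, e.g. "{x} is the new {y}".
    """
    return template.format(**fillers)


# One template with two empty slots yields many individual phrases.
template = "{x} is the new {y}"
print(fill_template(template, {"x": "orange", "y": "black"}))
print(fill_template(template, {"x": "quiet", "y": "loud"}))
```

Each call produces one individual phrase from the same template, which is the sense in which a single phrasal template stands for a family of phrases.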
In natural language processing, semantic role labeling is the process of assigning labels to words or phrases in a sentence that indicate their semantic role in the sentence, such as that of an agent, goal, or result.
Jun'ichi Tsujii is a Japanese computer scientist specializing in natural language processing and text mining, particularly in the field of biology and bioinformatics.
William Aaron Woods, generally known as Bill Woods, is a researcher in natural language processing, continuous speech understanding, knowledge representation, and knowledge-based search technology. He is currently a Software Engineer at Google.
CICLing is an annual conference on computational linguistics (CL) and natural language processing (NLP). The first CICLing conference was held in 2000 in Mexico City. The conference is attended by one to two hundred NLP and CL researchers and students every year. As of 2017, it is ranked within the top 20 sources on computational linguistics by Google Scholar. Past CICLing conferences have been held in Mexico, Korea, Israel, Romania, Japan, India, Greece, Nepal, Egypt, Turkey, Hungary, and Vietnam; the 2019 event was held in France.
The term "string grammar" in computational linguistics refers to the structure of a specific language, such that it can be formatted as a single continuous string of text, without the need to have line-breaks to alter the meaning. The appearance of any text in "column 1" of a line does not change the meaning of that text in a string grammar. A string grammar can be used to describe the structure of some natural languages, such as English or French, as well as for some computer languages.
LEPOR is an automatic, language-independent machine translation evaluation metric with tunable parameters and reinforced factors.
Abstract Meaning Representation (AMR) is a semantic representation language. AMR graphs are rooted, labeled, directed, acyclic graphs (DAGs), comprising whole sentences. They are intended to abstract away from syntactic representations, in the sense that sentences which are similar in meaning should be assigned the same AMR, even if they are not identically worded. By nature, the AMR language is biased towards English – it is not meant to function as an international auxiliary language.
Paraphrase or paraphrasing in computational linguistics is the natural language processing task of detecting and generating paraphrases. Applications of paraphrasing are varied, including information retrieval, question answering, text summarization, and plagiarism detection. Paraphrasing is also useful in the evaluation of machine translation, as well as in semantic parsing and the generation of new samples to expand existing corpora.
Martha (Stone) Palmer is an American computer scientist. She is best known for her work on verb semantics, and for the creation of ontological resources such as PropBank and VerbNet.
Yejin Choi is the Brett Helsel Professor of Computer Science at the University of Washington. Her research considers natural language processing and computer vision. Choi was awarded a MacArthur Fellowship in 2022.
Ellen Riloff is an American computer scientist currently serving as a professor at the School of Computing at the University of Utah. Her research focuses on Natural Language Processing and Computational Linguistics, specifically information extraction, sentiment analysis, semantic class induction, and bootstrapping methods that learn from unannotated texts.
"with the help of Mrs. Madeleine Bates, a graduate student who did much of the grammar development for the parser"