Founded | 1962 |
---|---|
Type | Professional organization |
Focus | Computational linguistics and natural language processing |
Origins | Association for Machine Translation and Computational Linguistics |
Area served | Worldwide |
Method | Conferences, publications |
Website | www |
The Association for Computational Linguistics (ACL) is a scientific and professional organization for people working on natural language processing. [1] Its namesake conference is one of the primary high-impact conferences for natural language processing research, along with EMNLP. [2] [3] The conference is held each summer in locations where significant computational linguistics research is carried out.
It was founded in 1962, originally named the Association for Machine Translation and Computational Linguistics (AMTCL). It became the ACL in 1968. [4] The ACL has a European (EACL), [5] a North American (NAACL), [6] and an Asian (AACL) [7] chapter.
The ACL was founded in 1962 as the Association for Machine Translation and Computational Linguistics (AMTCL). The initial membership was about 100. In 1965, the AMTCL took over the journal Mechanical Translation and Computational Linguistics. This journal was succeeded by the American Journal of Computational Linguistics (1974–1978, 1980–1983) and then by Computational Linguistics (1984–present). [8] Since 1988, the journal has been published for the ACL by MIT Press. [9] [10]
The annual meeting was first held in 1963 in conjunction with the Association for Computing Machinery National Conference. [11] The annual meeting was, for a long time, relatively informal and did not publish anything longer than abstracts. By 1968, the society took on its current name, the Association for Computational Linguistics (ACL). The publication of the annual meeting's Proceedings of the ACL began in 1979 and gradually matured into its modern form. [8] Many of the meetings were held in conjunction with the Linguistic Society of America, and a few with the American Society for Information Science and the Cognitive Science Society. [11]
From 1989 to 1994, the United States government sponsored a large amount of research. Within the ACL, this period was characterized by rising author retention rates and increased work on some key topics, such as speech recognition. By the 21st century, the ACL was retaining authors at a high rate, and they had coalesced into more stable groupings around individual research topics. [12]
In 2020, the annual meeting of the ACL received more submissions from China than from the United States for the first time. [13]
From ACL website [14]
The ACL organizes several of the top conferences and workshops in the field of computational linguistics and natural language processing. These include:
Besides conferences, the ACL also sponsors the journals Computational Linguistics and Transactions of the Association for Computational Linguistics (TACL). Papers and other presentations at ACL and ACL-affiliated venues are archived online in the open-access ACL Anthology. [15]
ACL has a large number of Special Interest Groups (SIGs), focusing on specific areas of natural language processing. Some current SIGs within ACL are: [16]
SIG | Description |
---|---|
SIGANN | Linguistic Annotation |
SIGBIOMED | Biomedical Language Processing |
SIGDAT | Linguistic data and corpus-based approaches |
SIGDIAL | Dialogue Processing |
SIGFSM | Finite State Methods |
SIGGEN | Natural Language Generation |
SIGHAN | Chinese Language Processing |
SIGHUM | Language Technologies for the Socio-Economic Sciences and the Humanities |
SIGLEX | Lexicon: the umbrella organization for the SemEval semantic evaluations and SENSEVAL word-sense evaluations |
SIGMT | Machine Translation |
SIGMOL | Mathematics of Language |
SIGMORPHON | Computational Morphology and Phonology |
SIGNLL | Natural Language Learning |
SIGPARSE | Natural Language Parsing |
SIGSEM | Computational Semantics |
SIGSEMITIC | Computational Approaches to Semitic Languages |
SIGSLAV | NLP for Slavic Languages |
SIGSLPAT | Speech & Language Processing for Assistive Technologies |
SIGTYP | Typology |
SIGWAC | Web as Corpus |
Each year, the ACL elects a distinguished computational linguist who becomes vice-president of the organization in the next calendar year and president one year later. Recent ACL presidents are: [17]
Year | Name |
---|---|
2022 | Tim Baldwin |
2021 | Rada Mihalcea |
2020 | Hinrich Schütze |
2019 | Zhou Ming |
2018 | Marti Hearst |
2017 | Joakim Nivre |
2016 | Pushpak Bhattacharyya |
2015 | Christopher D. Manning |
2014 | Gertjan van Noord |
2013 | Haifeng Wang |
2012 | Ken Church |
2011 | Kevin Knight |
2010 | Ido Dagan |
2009 | Steven Bird |
2008 | Bonnie Dorr |
2007 | Mark Steedman |
2006 | Jun'ichi Tsujii |
2005 | Martha Palmer |
2004 | Johanna Moore |
2003 | Mark Johnson |
2002 | John Nerbonne |
2001 | Eduard Hovy |
2000 | Wolfgang Wahlster |
Natural language processing (NLP) is an interdisciplinary subfield of computer science and linguistics. It is primarily concerned with giving computers the ability to support and manipulate human language. It involves processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
Word-sense disambiguation (WSD) is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious and automatic, but it can come to conscious attention when ambiguity impairs clarity of communication, given the pervasive polysemy in natural language. In computational linguistics, it is an open problem that affects other language-processing tasks, such as discourse analysis, improving the relevance of search engines, anaphora resolution, coherence, and inference.
AFNLP (Asian Federation of Natural Language Processing) is the organization that coordinates natural language processing activities and events in the Asia-Pacific region.
The North American Chapter of the Association for Computational Linguistics (NAACL) provides a regional focus for members of the Association for Computational Linguistics (ACL) in North America as well as in Central and South America, organizes annual conferences, promotes cooperation and information exchange among related scientific and professional societies, encourages and facilitates ACL membership by people and institutions in the Americas, and provides a source of information on regional activities for the ACL Executive Committee.
Language resource management - Lexical markup framework (LMF) is the International Organization for Standardization ISO/TC37 standard for natural language processing (NLP) and machine-readable dictionary (MRD) lexicons. Its scope is the standardization of principles and methods relating to language resources in the contexts of multilingual communication.
Dragomir R. Radev was an American computer scientist who was a professor at Yale University, working on natural language processing and information retrieval. He also served as a University of Michigan computer science professor and Columbia University computer science adjunct professor, as well as a Member of the Advisory Board of Lawyaw.
BabelNet is a multilingual lexicalized semantic network and ontology developed at the NLP group of the Sapienza University of Rome. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words in different languages into sets of synonyms, called Babel synsets. For each Babel synset, BabelNet provides short definitions in many languages harvested from both WordNet and Wikipedia.
LEPOR is an automatic, language-independent machine translation evaluation metric with tunable parameters and reinforced factors.
In natural language processing (NLP), a word embedding is a representation of a word. The embedding is used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, where words or phrases from the vocabulary are mapped to vectors of real numbers.
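The idea that words closer together in the vector space are expected to be similar in meaning can be illustrated with a minimal sketch. The three-dimensional vectors below are invented for illustration only (real embeddings are learned from data and usually have far more dimensions), and cosine similarity is assumed as the closeness measure; `EMBEDDINGS`, `cosine_similarity`, and `most_similar` are hypothetical names.

```python
import math

# Toy vectors, made up for illustration; not taken from any trained model.
EMBEDDINGS = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Return the cosine of the angle between two real-valued vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings=EMBEDDINGS):
    """Return the other vocabulary word whose vector is closest to `word`."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(most_similar("king"))
```

With these toy values, the vector for "king" lies much closer to "queen" than to "apple", which is the behaviour a learned embedding is expected to show for related words.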
Empirical Methods in Natural Language Processing (EMNLP) is a leading conference in the area of natural language processing and artificial intelligence. Along with the Association for Computational Linguistics (ACL) and the North American Chapter of the Association for Computational Linguistics (NAACL), it is one of the three primary high impact conferences for natural language processing research. EMNLP is organized by the ACL special interest group on linguistic data (SIGDAT) and was started in 1996, based on an earlier conference series called Workshop on Very Large Corpora (WVLC).
Native-language identification (NLI) is the task of determining an author's native language (L1) based only on their writings in a second language (L2). NLI works through identifying language-usage patterns that are common to specific L1 groups and then applying this knowledge to predict the native language of previously unseen texts. This is motivated in part by applications in second-language acquisition, language teaching and forensic linguistics, amongst others.
UBY is a large-scale lexical-semantic resource for natural language processing (NLP) developed at the Ubiquitous Knowledge Processing Lab (UKP) in the department of Computer Science of the Technische Universität Darmstadt. UBY is based on the ISO standard Lexical Markup Framework (LMF) and combines information from several expert-constructed and collaboratively constructed resources for English and German.
A confusion network is a natural language processing method that combines outputs from multiple automatic speech recognition or machine translation systems. Confusion networks are simple linear directed acyclic graphs with the property that each path from the start node to the end node goes through all the other nodes. The set of words represented by edges between two nodes is called a confusion set. In machine translation, the defining characteristic of confusion networks is that they allow multiple ambiguous inputs, deferring committal translation decisions until later stages of processing. This approach is used in the open source machine translation software Moses and the proprietary translation API in IBM Bluemix Watson.
Paraphrase or paraphrasing in computational linguistics is the natural language processing task of detecting and generating paraphrases. Applications of paraphrasing are varied, including information retrieval, question answering, text summarization, and plagiarism detection. Paraphrasing is also useful in the evaluation of machine translation, as well as semantic parsing and generation of new samples to expand existing corpora.
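The linear structure described above can be sketched as a list of confusion sets, one per pair of adjacent nodes. This is only an illustrative data layout, not the representation used by Moses or any other system; the words and scores are invented, and picking the highest-scoring word in each set is just one simple way to make the deferred decision.

```python
# Each dict is one confusion set: candidate words on the edges between two
# adjacent nodes, mapped to hypothetical scores (made up for illustration).
CONFUSION_NETWORK = [
    {"I": 0.9, "eye": 0.1},
    {"see": 0.6, "sea": 0.4},
    {"the": 0.7, "a": 0.3},
    {"cat": 0.8, "cap": 0.2},
]


def count_paths(network):
    """Number of start-to-end paths: one choice of edge per confusion set."""
    total = 1
    for confusion_set in network:
        total *= len(confusion_set)
    return total


def best_path(network):
    """Pick the top-scoring word in every confusion set, in order.

    Because every path passes through every node, the choice in one
    confusion set does not constrain the choice in another.
    """
    return [max(confusion_set, key=confusion_set.get) for confusion_set in network]


if __name__ == "__main__":
    print(count_paths(CONFUSION_NETWORK))
    print(" ".join(best_path(CONFUSION_NETWORK)))
```

The four small confusion sets here encode sixteen alternative word sequences compactly, which is what lets a later processing stage choose among them instead of committing early.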
Ruslan Mitkov is a professor at Lancaster University and a researcher in natural language processing and computational linguistics. He completed his PhD at Technical University of Dresden under the supervision of Nikolaus Joachim Lehmann. He has published more than 240 refereed papers and is best known for, among other things, his contributions to anaphora resolution and his seminal work in computer-aided generation of multiple-choice tests.
Martha (Stone) Palmer is an American computer scientist. She is best known for her work on verb semantics, and for the creation of ontological resources such as PropBank and VerbNet.
Bidirectional Encoder Representations from Transformers (BERT) is a language model based on the transformer architecture, notable for its dramatic improvement over previous state-of-the-art models. It was introduced in October 2018 by researchers at Google. A 2020 literature survey concluded that "in a little over a year, BERT has become a ubiquitous baseline in Natural Language Processing (NLP) experiments counting over 150 research publications analyzing and improving the model."
Mona Talat Diab is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning.
Ellen Riloff is an American computer scientist currently serving as a professor at the School of Computing at the University of Utah. Her research focuses on Natural Language Processing and Computational Linguistics, specifically information extraction, sentiment analysis, semantic class induction, and bootstrapping methods that learn from unannotated texts.
Ani Nenkova is Principal Scientist at Adobe Research, currently on leave from her position as an Associate Professor of Computer and Information Science at the University of Pennsylvania. Her research focuses on computational linguistics and artificial intelligence, with an emphasis on developing computational methods for analysis of text quality and style, discourse, affect recognition, and summarization.