Kimmo Matti Koskenniemi (born 7 September 1945) [1] [2] is the inventor of finite-state two-level models for computational phonology and morphology. He was a professor of Computational Linguistics at the University of Helsinki, Finland. In the early 1980s Koskenniemi's work became accessible to early adopters such as Lauri Karttunen, Ronald M. Kaplan and Martin Kay, first at the University of Texas at Austin, [3] and later at the Xerox Palo Alto Research Center. [4]
This application of finite-state transducers to phonology and morphology was initially implemented for Finnish, but it soon proved to be useful for other languages with complex morphology such as Basque [5] and Swahili. [6]
Natural language processing (NLP) is an interdisciplinary subfield of computer science and linguistics. It is primarily concerned with giving computers the ability to support and manipulate human language. It involves processing natural language datasets, such as text corpora or speech corpora, using either rule-based or probabilistic machine learning approaches. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
In linguistics, agglutination is a morphological process in which words are formed by stringing together morphemes, each of which corresponds to a single syntactic feature. Languages that use agglutination widely are called agglutinative languages. For example, in the agglutinative language of Turkish, the word evlerinizden consists of the morphemes ev-ler-iniz-den, literally translated morpheme-by-morpheme as house-plural-your(plural)-from. Agglutinative languages are often contrasted with isolating languages, in which words are monomorphemic, and fusional languages, in which words can be complex, but morphemes may correspond to multiple features.
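The one-morpheme-one-feature pattern in the Turkish example can be illustrated with a minimal Python sketch. The gloss table and the `gloss` function are hypothetical names introduced here for illustration only; the sketch assumes the word has already been segmented with hyphens and does not attempt real morphological analysis.

```python
# Hypothetical gloss table covering only the morphemes of the example word.
GLOSSES = {
    "ev": "house",
    "ler": "plural",
    "iniz": "your(plural)",
    "den": "from",
}


def gloss(segmented_word: str) -> str:
    """Translate a hyphen-segmented word morpheme by morpheme."""
    return "-".join(GLOSSES[morpheme] for morpheme in segmented_word.split("-"))


print(gloss("ev-ler-iniz-den"))
```

Because each morpheme carries exactly one feature, a plain lookup per morpheme is enough here; a fusional language would need entries that map one morpheme to several features at once.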
Henry Kučera, born Jindřich Kučera, was a Czech-American linguist who pioneered corpus linguistics and linguistic software, was a major contributor to the American Heritage Dictionary, and was a pioneer in the development of spell-checking computer software. He is remembered in particular as one of the initiators of the Brown Corpus.
In software, a spell checker is a software feature that checks for misspellings in a text. Spell-checking features are often embedded in software or services, such as a word processor, email client, electronic dictionary, or search engine.
A finite-state transducer (FST) is a finite-state machine with two memory tapes, following the terminology for Turing machines: an input tape and an output tape. This contrasts with an ordinary finite-state automaton, which has a single tape. An FST is a type of finite-state automaton (FSA) that maps between two sets of symbols. An FST is more general than an FSA. An FSA defines a formal language by defining a set of accepted strings, while an FST defines relations between sets of strings.
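The contrast between accepting a string and mapping it to an output can be sketched in Python. This is a minimal, deterministic toy under stated assumptions: the transition table `TOY`, its states, and the function `transduce` are invented for illustration and do not come from any particular toolkit. Real transducers may also be nondeterministic and may use empty (epsilon) symbols, which this sketch omits.

```python
from typing import Dict, Optional, Set, Tuple

# (state, input symbol) -> (output symbol, next state)
Transitions = Dict[Tuple[int, str], Tuple[str, int]]

# Hypothetical toy machine: copies "b", rewrites "a" as "x" or "y"
# alternately, and ends in final state 0 only after an even number of "a"s.
TOY: Transitions = {
    (0, "a"): ("x", 1),
    (1, "a"): ("y", 0),
    (0, "b"): ("b", 0),
    (1, "b"): ("b", 1),
}


def transduce(
    transitions: Transitions, start: int, finals: Set[int], tape: str
) -> Optional[str]:
    """Read the input tape and write the output tape.

    Returns the output string, or None if the input is rejected.
    """
    state = start
    output = []
    for symbol in tape:
        key = (state, symbol)
        if key not in transitions:
            return None  # no transition defined: reject
        out_symbol, state = transitions[key]
        output.append(out_symbol)
    return "".join(output) if state in finals else None


print(transduce(TOY, 0, {0}, "aba"))
print(transduce(TOY, 0, {0}, "ab"))
```

Ignoring the output symbols leaves an ordinary acceptor: the set of inputs for which `transduce` returns a string is the formal language of the underlying automaton, while the input/output pairs form the relation the transducer defines.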
Fred Göran Karlsson is a professor emeritus of general linguistics at the University of Helsinki.
Helsingin Suomalainen Yhteiskoulu, commonly abbreviated SYK, is a free elementary, middle and high school in the Etelä-Haaga district of Helsinki, Finland.
René Paul Victor Kiparsky is a Finnish linguist and professor of linguistics at Stanford University. He is the son of the St. Petersburg (Russia)-born linguist and Baltist/Slavicist Valentin Kiparsky.
Martin Kay was a computer scientist, known especially for his work in computational linguistics.
András Kornai, son of economist János Kornai, is a mathematical linguist. He has earned two PhDs. He earned his first in Mathematics in 1983 from Eötvös Loránd University in Budapest, where his advisor was Miklós Ajtai, and his second in Linguistics in 1991 from Stanford University, where his advisor was Paul Kiparsky.
Wolfgang U. Dressler is an Austrian professor of linguistics at the University of Vienna. Dressler is a polyglot and scholar who has contributed to various fields of linguistics, especially phonology, morphology, text linguistics, clinical linguistics and child language development. He is an important representative of the 'naturalness theory'.
Kimmo is a Finnish given name for males. Notable people with the name include:
Ronald M. Kaplan has served as a Vice President at Amazon.com and Chief Scientist for Amazon Search (A9.com). He was previously Vice President and Distinguished Scientist at Nuance Communications and director of Nuance's Natural Language and Artificial Intelligence Laboratory. Prior to that he served as Chief Scientist and a Principal Researcher at the Powerset division of Microsoft Bing. He is also an Adjunct Professor in the Linguistics Department at Stanford University and a Principal of Stanford's Center for the Study of Language and Information (CSLI). He was previously a Research Fellow at the Palo Alto Research Center, where he was the manager of research in Natural Language Theory and Technology.
Linguistics is the scientific study of language. The modern-day scientific study of linguistics takes all aspects of language into account — i.e., the cognitive, the social, the cultural, the psychological, the environmental, the biological, the literary, the grammatical, the paleographical, and the structural.
Jennifer Sandra Cole is a professor of linguistics and Director of the Prosody and Speech Dynamics Lab at Northwestern University. Her research uses experimental and computational methods to study the sound structure of language. She was the founding General Editor of Laboratory Phonology (2009–2015) and a founding member of the Association for Laboratory Phonology.
Foma is a free and open source finite-state toolkit created and maintained by Mans Hulden. It includes a compiler, programming language, and C library for constructing finite-state automata and transducers (FSTs) for various uses, most typically natural language processing tasks such as morphological analysis.
Lauri Juhani Karttunen was an adjunct professor in linguistics at Stanford and an ACL Fellow. He died in 2022.
The following outline is provided as an overview of and topical guide to natural-language processing:
Mans Hulden is a researcher in computational linguistics currently holding the title of Assistant Professor at the Department of Linguistics of the University of Colorado Boulder. He teaches courses in computational linguistics, phonology, and phonetics, and he is the creator and maintainer of the free and open source finite-state toolkit Foma.
Arvi Johannes Hurskainen is a Finnish scholar of language technology and linguistics. Since 1985 he has developed rule-based language technology mainly for Swahili, but also for other languages, including machine translation from English to Finnish. He has created a development environment called SALAMA, which is suitable for any language. The major applications developed so far include a spell checker for Swahili, an annotator of corpus texts, an advanced dictionary between Swahili and English, and translators from Swahili to English, from English to Swahili, and from English to Finnish. He has also developed an advanced learning system for Swahili and a system for producing targeted vocabularies for language learners. Hurskainen has compiled two annotated corpora, Helsinki Corpus of Swahili 1.0 and Helsinki Corpus of Swahili 2.0.