Shlomo Argamon | |
---|---|
Born | 1967 (age 56–57) |
Education | B.S. applied mathematics, MPhil, Ph.D. computer science |
Alma mater | Carnegie Mellon University, Yale University |
Occupation(s) | Artificial Intelligence, Computational linguistics |
Employer | Touro University System |
Known for | Computational stylistics |
Title | Associate Provost for Artificial Intelligence, Professor of Computer Science |
Website | lingcog |
Shlomo Argamon (born 1967) is an American-Israeli computer scientist and forensic linguist. He is the associate provost for artificial intelligence and professor of computer science at Touro University. [1]
Argamon received his B.S. in applied mathematics from Carnegie Mellon University and his MPhil and Ph.D. in computer science from Yale University, where he was supervised by Drew McDermott. [2] He spent two years doing postdoctoral research under a Fulbright Foundation fellowship with Sarit Kraus at Bar-Ilan University in Ramat Gan, Israel. [3]
Since the late 1990s, Argamon has worked primarily on computational linguistics and machine learning, focusing on the analysis of non-denotational meaning, including computational analysis of language stylistics, sentiment analysis, [4] [5] [6] and metaphor analysis. [7] He has also published well-cited research on active learning, [8] metalearning, [9] and robotic mapping.
Argamon is best known for his work on computational stylistics, particularly author profiling. Together with Moshe Koppel and others, he has shown how statistical analysis of word usage can determine an author's age, sex, native language, and personality type with high accuracy in English-language texts. [10] [11] [12] His work has also shown that the textual features that distinguish male from female authorship are consistent between languages and across time. [13] [14] [15]
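As a rough illustration of what "statistical analysis of word usage" can mean in practice, the following Python sketch represents each text by the relative frequencies of a few function words and assigns a new text to the group whose average profile is closest. It is a toy example and not the method used by Argamon, Koppel, or their co-authors: the word list `FUNCTION_WORDS`, the group labels, and the nearest-centroid rule are all assumptions chosen for brevity, and published studies rely on far larger feature sets and corpora.

```python
import math
import re
from collections import Counter

# Hypothetical, very small function-word list (an assumption for this sketch).
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in", "i", "you",
                  "he", "she", "it", "with", "for", "not"]


def profile(text):
    """Return the relative frequency of each function word in `text`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in FUNCTION_WORDS]


def centroid(vectors):
    """Average a list of equal-length feature vectors component-wise."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def train(labelled_texts):
    """Map each label to the centroid of its texts' profiles."""
    return {label: centroid([profile(t) for t in texts])
            for label, texts in labelled_texts.items()}


def predict(model, text):
    """Return the label whose centroid is nearest to the text's profile."""
    vector = profile(text)
    return min(model, key=lambda label: distance(model[label], vector))


# Invented toy data; the labels carry no real-world meaning.
model = train({
    "group_a": ["the cat of the house and the dog of the yard",
                "the king of the hill and the queen of the valley"],
    "group_b": ["i think you should come with me",
                "i hope you will stay with me for dinner"],
})
guess = predict(model, "i know you went with her")
```

Function words are used here because they occur in almost any text regardless of topic, which is one common motivation for stylistic features; whether such a small feature set separates real author groups is an empirical question this sketch does not address.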
Argamon also developed computational stylistic methods that provide insights into the meaning of stylistic differences. One of his key innovations for this purpose is the development of computational stylistic analysis using systemic functional linguistics. [16] [17] For example, together with Jeff Dodick and Paul Chase, he examined whether there are clear and consistent differences between the scientific methods of the experimental sciences and those of the historical sciences. Their work showed how using systemic functional features in computational stylistic analysis provides evidence for multiple scientific methodologies of the sorts posited previously by philosophers of science. [18]
Argamon has pushed for the increased use of linguistic analysis for attribution of cybersecurity attacks. He has pointed out how linguistic attribution techniques can be used to good effect on natural language texts that arise in different attack scenarios, and has provided analyses for high-profile cases such as the Sony Pictures hack, [19] [20] the Democratic National Committee cyber attacks, [21] and the Shadow Brokers NSA leak. [22] [23]
In 2013, Argamon founded the Illinois Institute of Technology Master of Data Science program, [24] which he directed until 2019. The program seeks to teach students "to think about the real problems that need to be solved, not to simply find technical solutions." Argamon views data scientists as "sensemakers", whose job is not merely to produce analytic results, but to help their clients make sense of a complex, uncertain, and fast-changing world through rigorous analysis and explanation of the data. [25] [26]
Idiolect is an individual's unique use of language, including speech. This unique usage encompasses vocabulary, grammar, and pronunciation. This differs from a dialect, a common set of linguistic characteristics shared among a group of people.
Word-sense disambiguation is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious.
Linguistics is the scientific study of human language. Someone who engages in this study is called a linguist. See also the Outline of linguistics, the List of phonetics topics, the List of linguists, and the List of cognitive science topics.
Stylistics, a branch of applied linguistics, is the study and interpretation of texts of all types, but particularly literary texts, and spoken language with regard to their linguistic and tonal style, where style is the particular variety of language used by different individuals in different situations and settings. For example, the vernacular, or everyday language, may be used among casual friends, whereas more formal language, with respect to grammar, pronunciation or accent, and lexicon or choice of words, is often used in a cover letter and résumé and while speaking during a job interview.
Forensic linguistics, legal linguistics, or language and the law is the application of linguistic knowledge, methods, and insights to the forensic context of law, language, crime investigation, trial, and judicial procedure. It is a branch of applied linguistics.
Stylometry is the application of the study of linguistic style, usually to written language. It has also been applied successfully to music, paintings, and chess.
Sentiment analysis is the use of natural language processing, text analysis, computational linguistics, and biometrics to systematically identify, extract, quantify, and study affective states and subjective information. Sentiment analysis is widely applied to voice of the customer materials such as reviews and survey responses, online and social media, and healthcare materials for applications that range from marketing to customer service to clinical medicine. With the rise of deep language models, such as RoBERTa, more difficult data domains can also be analyzed, e.g., news texts, where authors typically express their opinions and sentiment less explicitly.
Internet linguistics is a domain of linguistics advocated by the English linguist David Crystal. It studies new language styles and forms that have arisen under the influence of the Internet and of other new media, such as Short Message Service (SMS) text messaging. Since the beginning of human–computer interaction (HCI), which led to computer-mediated communication (CMC) and Internet-mediated communication (IMC), experts such as Gretchen McCulloch have acknowledged that linguistics has a contributing role in it, in terms of web interface and usability. Studying the emerging language on the Internet can help improve conceptual organization, translation, and web usability. Such study aims to benefit both linguists and web users.
Linguistics is the scientific study of language. The areas of linguistic analysis are syntax, semantics (meaning), morphology, phonetics, phonology, and pragmatics. Subdisciplines such as biolinguistics and psycholinguistics bridge many of these divisions.
Carole Elisabeth Chaski is a forensic linguist who is considered one of the leading experts in the field. Her research has led to improvements in the methodology and reliability of stylometric analysis and inspired further research on the use of this approach for authorship identification. Her contributions have served as expert testimony in several federal and state court cases in the United States and Canada. She is president of ALIAS Technology and executive director of the Institute for Linguistic Evidence, a non-profit research organization devoted to linguistic evidence.
Moshe Koppel is an American-Israeli computer scientist, Talmud scholar and political activist; he is best known for his research on authorship attribution. Together with Shlomo Argamon and Jonathan Schler, he has shown that statistical analysis of word usage in a document can be used to determine an author's gender, age, native language and personality type.
Native-language identification (NLI) is the task of determining an author's native language (L1) based only on their writings in a second language (L2). NLI works through identifying language-usage patterns that are common to specific L1 groups and then applying this knowledge to predict the native language of previously unseen texts. This is motivated in part by applications in second-language acquisition, language teaching and forensic linguistics, amongst others.
Jussi Karlgren is a Swedish computational linguist, research scientist at Spotify, and co-founder of text analytics company Gavagai AB. He holds a PhD in computational linguistics from Stockholm University, and the title of docent of language technology at Helsinki University.
General Internet Corpus of Russian (GICR) is a corpus of Russian internet texts that has been accessible on request through an online query interface since 2013. The corpus includes rich text materials from the blogosphere, social networks, major news sources and literary magazines.
Author profiling is the analysis of a given set of texts in an attempt to uncover various characteristics of the author based on stylistic and content-based features, or to identify the author. Characteristics commonly analysed include age and gender, though more recent studies have looked at other characteristics such as personality traits and occupation.
Dicta, The Center for Text Analysis, is an Israeli non-profit organization focused on research and education in the field of computational linguistics and its application to the Hebrew language, including the religious literature across generations.