Shlomo Argamon

Born 1967 (age 56–57)
Education B.S. applied mathematics, MPhil, Ph.D. computer science
Alma mater Carnegie-Mellon University, Yale University
Occupation(s) Artificial Intelligence, Computational linguistics
Employer Touro University System
Known for Computational stylistics
Title Associate Provost for Artificial Intelligence, Professor of Computer Science
Website lingcog.blogspot.com

Shlomo Argamon (born 1967) is an American-Israeli computer scientist and forensic linguist. He is the associate provost for artificial intelligence and professor of computer science at Touro University. [1]

Education

Shlomo Argamon received his B.S. in applied mathematics from Carnegie-Mellon University and his MPhil and Ph.D. in computer science from Yale University, supervised by Drew McDermott. [2] He spent two years doing postdoctoral research under a Fulbright Foundation fellowship with Sarit Kraus at Bar-Ilan University in Ramat Gan, Israel. [3]

Research

Since the late 1990s, Argamon has worked primarily on computational linguistics and machine learning, focusing on the analysis of non-denotational meaning, including computational analysis of language stylistics, sentiment analysis, [4] [5] [6] and metaphor analysis. [7] He has also published well-cited research on active learning, [8] metalearning, [9] and robotic mapping.

Computational stylistics

Argamon is best known for his work on computational stylistics, particularly author profiling. Together with Moshe Koppel and others, he has shown how statistical analysis of word usage can determine an author's age, sex, native language, and personality type with high accuracy in English-language texts. [10] [11] [12] His work has also shown that textual features indicating differences between male and female authorship are consistent across languages and over time. [13] [14] [15]

Argamon has also developed computational stylistic methods that provide insights into the meaning of stylistic differences. One of his key innovations for this purpose is computational stylistic analysis based on systemic functional linguistics. [16] [17] For example, together with Jeff Dodick and Paul Chase, he examined whether there are clear and consistent differences between the scientific methods of experimental sciences and historical sciences. Their work showed how using systemic functional features in computational stylistic analysis provides evidence for multiple scientific methodologies of the sorts posited previously by philosophers of science. [18]

Linguistics for cybersecurity

Argamon has pushed for the increased use of linguistic analysis for attribution of cybersecurity attacks. He has pointed out how linguistic attribution techniques can be used to good effect on natural language texts that arise in different attack scenarios, and has provided analyses for high-profile cases such as the Sony Pictures hack, [19] [20] the Democratic National Committee cyber attacks, [21] and the Shadow Brokers NSA leak. [22] [23]

Data science

In 2013, Argamon founded the Illinois Institute of Technology Master of Data Science program, [24] which he directed until 2019. The program seeks to teach students "to think about the real problems that need to be solved, not to simply find technical solutions." Argamon views data scientists as "sensemakers", whose job is not merely to produce analytic results, but to help their clients make sense of a complex, uncertain, and fast-changing world through rigorous analysis and explanation of the data. [25] [26]

Honors

References

  1. "Touro University Announces New Associate Provost for AI". www.touro.edu. Retrieved 2024-01-06.
  2. "Tech Reports". cs.yale.edu.
  3. "Shlomo Engelson Argamon | The Linguistic Summer Institute 2015". lsa2015.uchicago.edu. Retrieved 2023-02-06.
  4. Kenneth Bloom, Navendu Garg, and Shlomo Argamon. Extracting appraisal expressions. In Proc. Human Language Technologies: Conference of the North American Association for Computational Linguistics (NAACL-HLT), Rochester, New York, April, 2007.
  5. Casey Whitelaw, Navendu Garg, and Shlomo Argamon. Using appraisal groups for sentiment analysis. In Proc. Conference on Information and Knowledge Management, Bremen, Germany, November 2005.
  6. Shlomo Argamon, Ken Bloom, Andrea Esuli, and Fabrizio Sebastiani. Automatically Determining Attitude Type and Force for Sentiment Analysis. 3rd Language and Technology Conference, Poznan, Poland, October 2007.
  7. Lisa Gandy, Nadji Allan, Mark Atallah, Ophir Frieder, Newton Howard, Sergey Kanareykin, Moshe Koppel, Mark Last, Yair Neuman, Shlomo Argamon. Automatic identification of conceptual metaphors with limited knowledge. In Proc. Twenty-Seventh AAAI Conference on Artificial Intelligence (AAAI-13), Bellevue, WA, July 2013.
  8. Shlomo Argamon-Engelson and Ido Dagan. Committee-based sample selection for probabilistic classifiers. Journal of Artificial Intelligence Research, 11:335-360, 1999.
  9. Julio Ortega, Moshe Koppel, and Shlomo Argamon-Engelson. Arbitrating among competing classifiers using learned referees. Knowledge and Information Systems, 3(4):470–490, 2001.
  10. Argamon, Shlomo, Moshe Koppel, Jonathan Fine, and Anat Rachel Shimoni. "Gender, genre, and writing style in formal written texts." Text 23, no. 3 (2003): 321-346.
  11. Argamon, Shlomo, Moshe Koppel, James W. Pennebaker, and Jonathan Schler. "Automatically profiling the author of an anonymous text." Communications of the ACM 52, no. 2 (2009): 119-123.
  12. Argamon, Shlomo; Koppel, Moshe; Pennebaker, James W.; Schler, Jonathan (2007-09-03). "Mining the Blogosphere: Age, gender and the varieties of self-expression". First Monday. doi:10.5210/fm.v12i9.2003. ISSN 1396-0466.
  13. Argamon, Shlomo, Jean-Baptiste Goulain, Russell Horton, and Mark Olsen. "Vive la Différence! Text mining gender difference in French literature." Digital Humanities Quarterly 3, no. 2 (2009).
  14. Argamon, Shlomo, Russell Horton, Mark Olsen, and Sterling Stuart Stein. "Gender, Race, and Nationality in BlackDrama, 1850-2000: Mining Differences in Language Use in Authors and their Characters." Proceedings of Digital Humanities (2007).
  15. Hota, Sobhan R., Shlomo Argamon, and Rebecca Chung. "Gender in Shakespeare: Automatic stylistics gender character classification using syntactic, lexical and lemma features." Proc. Chicago Colloquium on Digital Humanities and Computer Science (DHCS) (2006).
  16. Argamon, Shlomo, Casey Whitelaw, Paul Chase, Sobhan Raj Hota, Navendu Garg, and Shlomo Levitan. "Stylistic text classification using functional lexical features." Journal of the American Society for Information Science and Technology 58, no. 6 (2007): 802-822.
  17. Argamon, Shlomo, and Moshe Koppel. "The rest of the story: Finding meaning in stylistic variation." In The Structure of Style, pp. 79-112. Springer, Berlin, Heidelberg, 2010.
  18. Argamon, Shlomo, Jeff Dodick, and Paul Chase. "Language use reflects scientific methodology: A corpus-based study of peer-reviewed journal articles." Scientometrics 75, no. 2 (2008): 203-238.
  19. "Doubts Persist on U.S. Claims of North Korean Role in Sony Hack". NPR.org.
  20. "New Study May Add to Skepticism Among Security Experts That North Korea Was Behind Sony Hack". 2014-12-24.
  21. Savage, Charlie; Perlroth, Nicole (2016-07-27). "Is D.N.C. Email Hacker a Person or a Russian Front? Experts Aren't Sure". The New York Times.
  22. "The NSA Data Leakers Might be Faking Their Awful English to Deceive Us". 2016-08-18.
  23. "Second Snowden could be behind sale of NSA hacking tools".
  24. "Data Science Master's—Master of Applied Science (M.A.S.)". 2023-01-08.
  25. "The Well-Rounded Data Scientist". 16 April 2014.
  26. "Becoming a Data Scientist Podcast Episode 03: Shlomo Argamon | Becoming a Data Scientist".
  27. "BCS Register of Members". wam.bcs.org. Retrieved 2018-10-05.
  28. "Sixth Annual Forensic Linguistics Distinguished Visitor Lecture" (PDF).