Ani Nenkova

Alma mater	Columbia University (PhD in Computer Science); Sofia University (MS)
Scientific career
Fields	Computer Science; Computational Linguistics; Artificial Intelligence
Institutions	Adobe Research; University of Pennsylvania; Columbia University; Sofia University
Thesis	Understanding the process of multi-document summarization: content selection, rewrite and evaluation (PhD thesis, 2006); Tableau Methods for Concept Languages (MS thesis, in Bulgarian, 2000)
Doctoral Advisor	Kathleen McKeown (Columbia University)
Postdoc Advisor	Dan Jurafsky (Stanford NLP)

Last updated December 23, 2024

Ani Nenkova is a principal scientist at Adobe Research, currently on leave^[1] from her position as an associate professor of computer and information science at the University of Pennsylvania. Her research focuses on computational linguistics and artificial intelligence, with an emphasis on developing computational methods for the analysis of text quality and style, discourse, affect recognition, and summarization.

Education

Nenkova earned her master's degree from the Department of Mathematical Logic and Applications (Faculty of Mathematics and Informatics) at Sofia University in Bulgaria.^[2] She then carried out doctoral work at Columbia University, where she was advised by Kathleen McKeown, earning a Ph.D. in computer science in 2006.^[3]

Career

In addition to her position as an associate professor at the University of Pennsylvania, Nenkova serves as a co-editor-in-chief of the Transactions of the Association for Computational Linguistics (TACL) and as an area chair/senior program committee member for ACL, NAACL and AAAI. She has previously served as a member of the editorial board of Computational Linguistics (2009–2011), an associate editor for the IEEE/ACM Transactions on Audio, Speech and Language Processing (2015–2018), and a program co-chair for SIGDial 2014 and NAACL-HLT 2016.^[4] In February 2021, Nenkova joined Adobe Research as the head of the lab while on leave from Penn.

Research

Nenkova’s research interests include natural language processing, summarization, emotion recognition, and discourse.^[5] In the area of emotion recognition, Nenkova and her collaborators developed an approach that relies on regions of interest related to properties of phoneme or word classes, a significant improvement over other approaches for representing speech in emotion recognition. In her research on hidden meanings (what makes “great” writing) and on literature search automation,^[6] she trains programs on human-curated word representation datasets, which tell the computer what words and phrases mean in a specific context. The long-term goal of this research is to develop new algorithms that can analyze and understand new text without a human translator.^[7] Nenkova and her collaborators have also developed a number of tools and resources, including Speciteller, a tool for predicting sentence specificity; CATS, the corpus of science journalism articles used for their TACL'13 paper; and SIMetrix (Summary Input Similarity Metrics), a tool for the automatic summary evaluation performed in their EMNLP'09 and CL'14 papers.^[4]

Publications

Nenkova has over 150 publications.^[5]

Selected publications

Automatic Summarization, Now Publishers, 2011. ISBN 1601984707.
Word Embeddings (Also) Encode Human Personality Stereotypes, Agarwal et al, *SEM@NAACL-HLT 2019.
How to Compare Summarizers Without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature, Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova, To appear at the Workshop on Methods for Optimizing and Evaluating Neural Language Generation at NAACL 2019.
Predicting Annotation Difficulty to Improve Task Routing and Model Performance for Biomedical Information Extraction, Yang et al, NAACL-HLT 2019.
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature, Nye et al, ACL 2018.
Combining Lexical and Syntactic Features for Detecting Content-Dense Texts in News, Yang and Nenkova, JAIR. 60: 179-219 (2017).
Fast and Accurate Prediction of Sentence Specificity, Li and Nenkova, AAAI 2015.
Prosodic cues for emotion: analysis with discrete characterization of intonation, Cao et al, Speech Prosody 2014.

References

[1] "Ani Nenkova". scholar.google.com. Retrieved 2021-06-10.
[2] "Ani Nenkova - Links". www.cis.upenn.edu. Retrieved 2021-06-10.
[3] Ani Nenkova at the Mathematics Genealogy Project.
[4] Nenkova, Ani. "Ani Nenkova". www.cis.upenn.edu. Retrieved 2021-06-10.
[5] "Ani NENKOVA | Associate professor | PhD, Columbia University | University of Pennsylvania, PA | UP | Department of Computer and Information Science". ResearchGate. Retrieved 2021-06-10.
[6] "Hidden Meanings" (PDF). Penn Engineering Magazine. Spring 2017: 10–13. 2017.
[7] "The brain in the machine". Penn Today. Retrieved 2021-06-10.

External links

Ani Nenkova publications indexed by Google Scholar

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
