Rada Mihalcea

Mihalcea at Bob and Betty Beyster Building, University of Michigan
Education Technical University of Cluj-Napoca (1992), Southern Methodist University (1999, 2001), Oxford University (2010)
Occupation Professor at the University of Michigan

Rada Mihalcea is the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering at the University of Michigan. She has made contributions to natural language processing, multimodal processing, and computational social science. With Paul Tarau, she is the co-inventor of the TextRank algorithm, [1] which is widely used for text summarization.

Career

Mihalcea has a Ph.D. in Computer Science and Engineering from Southern Methodist University (2001) and a Ph.D. in Linguistics from Oxford University (2010). [2] In 2017, she was named Director of the Artificial Intelligence Laboratory in Computer Science and Engineering at the University of Michigan. In 2018, she was elected vice president of the Association for Computational Linguistics (ACL), and in 2021 she was elected its president. She is a professor of Computer Science and Engineering at the University of Michigan, where she also leads the Language and Information Technologies (LIT) Lab. [3]

A prolific researcher, Mihalcea has authored or coauthored over 400 articles since 1998 on topics ranging from semantic analysis of text to lie detection. [4] Her work has been cited over 40,000 times on Google Scholar, making her one of the most cited scholars in multimodal interaction and computational social science. [5]

In 2008, Mihalcea received the Presidential Early Career Award for Scientists and Engineers (PECASE). [6] She is an ACM Fellow (since 2019) and an AAAI Fellow (since 2021).

Mihalcea is an outspoken promoter of diversity in computer science. She also supports an expansion of the traditional analysis of educational success, which tends to focus on academic behaviour, to include student life, personality and background outside of the classroom. [7] Mihalcea leads Girls Encoded, a program designed to develop the pipeline of women in computer science as well as to retain the women who have entered into the program. [8] [9] [10]

Awards

Research

Mihalcea is known for her research in natural language processing, multimodal processing, and computational social science. In a collaboration she leads at the University of Michigan, Mihalcea has created software that can detect human lying. [15] In a study of video clips of high-profile court cases, a computer was more accurate at detecting deception than human judges. [16] [17] [18]

Mihalcea's lie-detection software uses machine learning techniques to analyze video clips of actual trials. [19] In her 2015 study, the team used clips from The Innocence Project, a national organization that works to reexamine cases where individuals were tried without the benefit of DNA testing, with the aim of exonerating wrongfully convicted individuals. [20] After identifying common human gestures, they transcribed the audio from the video clips of trials and analyzed how often subjects labeled deceptive used various words and phrases. The system was 75% accurate in identifying which subjects were deceptive among 120 videos. [20] [21] That puts Mihalcea's algorithm on par with the most commonly accepted form of lie detection, polygraph tests, which are roughly 85% accurate when testing guilty people and 56% accurate when testing the innocent. [22] She notes there are still improvements to be made, in particular to account for cultural and demographic differences. [20] A possibly unique advantage of Mihalcea's study was the real-world, high-stakes nature of the footage analyzed. In laboratory experiments, it is difficult to create a setting that motivates people to truly lie. [23]
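The comparison with polygraph tests can be made concrete with a short calculation. The sketch below uses only the figures quoted above; the equal split between guilty and innocent subjects is an assumption made for illustration, not a detail reported in the cited sources, and the variable and function names are hypothetical.

```python
# Figures quoted in the paragraph above.
system_accuracy = 0.75      # share of clips the software classified correctly
num_videos = 120            # number of trial clips in the 2015 study
polygraph_guilty = 0.85     # polygraph accuracy when testing guilty people
polygraph_innocent = 0.56   # polygraph accuracy when testing innocent people


def balanced_accuracy(acc_a: float, acc_b: float) -> float:
    """Average two per-group accuracies.

    ASSUMPTION (not from the source): both groups are equally common,
    so the overall accuracy is the plain mean of the two rates.
    """
    return (acc_a + acc_b) / 2


# Number of clips classified correctly at 75% accuracy.
correct_videos = round(system_accuracy * num_videos)

# Overall polygraph accuracy under the equal-split assumption.
polygraph_overall = balanced_accuracy(polygraph_guilty, polygraph_innocent)

print(f"Clips classified correctly: {correct_videos} of {num_videos}")
print(f"Polygraph accuracy (equal split assumed): {polygraph_overall:.1%}")
```

Under that assumption the polygraph figure averages to roughly 70%, which is the sense in which a 75% accurate system is comparable. A different mix of guilty and innocent subjects would shift the polygraph figure toward 85% or 56%.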

In 2018, Mihalcea and her collaborators worked on an algorithm-based system that identifies linguistic cues in fake news stories. It successfully found fakes up to 76% of the time, compared to a human success rate of 70%. [24]

Publications

Books

Journals and conferences

Personal life

Mihalcea was born in Cluj-Napoca, Romania, where she attended the Technical University of Cluj-Napoca.

She can speak Romanian, English, Italian, and French.

Mihalcea has two children, Zara (b. 2009) and Caius (b. 2013), both born in Dallas, Texas.

She is married to Mihai Burzo, an associate professor of engineering at the University of Michigan–Flint. They met in 2001 while both were completing Ph.D.s at Southern Methodist University [25] and have often collaborated on research, [26] such as the 2015 study on lie detection. [22]

References

  1. "TextRank: Bringing Order into Texts" (PDF). ACL. Retrieved 2024-03-17.
  2. "The Language of Humor, PhD Dissertation". Oxford University. Retrieved 2021-02-13.
  3. "Language Information and Technologies". lit.eecs.umich.edu. Retrieved 2019-03-07.
  4. "Rada Mihalcea". dblp. Retrieved 2024-03-16.
  5. "Rada Mihalcea". Google Scholar. Retrieved 2024-03-17.
  6. "President Honors Outstanding Early-Career Scientists". National Science Foundation. Retrieved 2017-08-30.
  7. "U Michigan MIDAS Program Backs Student Success Research". Campus Technology. Retrieved 2016-06-23.
  8. "Girls Encoded". girlsencoded.eecs.umich.edu. Retrieved 2019-03-07.
  9. "Making a difference for women in academia". University of Michigan EECS. Retrieved 2019-03-07.
  10. "A champion for women in computer science". University of Michigan EECS. Retrieved 2019-03-07.
  11. 2019 ACM Fellows Recognized for Far-Reaching Accomplishments that Define the Digital Age, Association for Computing Machinery, retrieved 2019-12-11
  12. "Sarah Goddard Power Award". The University Record. Retrieved 2019-03-07.
  13. "Carol Hollenshead Award | Center for the Education of Women | University of Michigan". www.cew.umich.edu. Retrieved 2019-03-07.
  14. "President Honors Outstanding Early-Career Scientists | NSF - National Science Foundation". www.nsf.gov. Retrieved 2019-03-07.
  15. "Researchers Develop New Lie-Detecting Software". Topnews.in. Retrieved 2015-12-16.
  16. "Can you spot a liar? Fail safe ways to determine if someone is telling the truth". New Zealand Herald. Retrieved 2017-01-30.
  17. "New Developed Software can detect lie with %75 success – Baltimore News". Albany Daily Star. Retrieved 2016-08-17.
  18. "To spot a liar, look at their hands". Quartz. 12 December 2015. Retrieved 2015-12-12.
  19. "Courtroom fibs used to develop lie-detecting software". Gizmag. 2015-12-12. Retrieved 2015-12-12.
  20. "University professors create new software to detect lies". Michigan Daily. 10 December 2015. Retrieved 2015-12-11.
  21. "Liar, Liar Pants On Fire: 6 Signs Computers Use To Spot Liars With 75% Accuracy". Medical Daily. 2015-12-15. Retrieved 2015-12-16.
  22. "5 Ways to Tell If Someone is Lying to You". Yahoo! Health. 15 December 2015. Retrieved 2015-12-15.
  23. "New software analysis words, gestures to detect lies". Jagran Post. Retrieved 2015-12-11.
  24. "Fake news detector algorithm works better than a human". University of Michigan News. 2018-08-21. Retrieved 2019-03-26.
  25. "Episode 31: From Romania – Immigrant Computer Scientists Podcast". Retrieved 2023-03-19.
  26. "Mihai Burzo's research works | University of Michigan".