Emotion recognition

Emotion recognition is the process of identifying human emotion. People vary widely in their accuracy at recognizing the emotions of others. Use of technology to help people with emotion recognition is a relatively nascent research area. Generally, the technology works best if it uses multiple modalities in context. To date, the most work has been conducted on automating the recognition of facial expressions from video, spoken expressions from audio, written expressions from text, and physiology as measured by wearables.

Human

Humans show a great deal of variability in their abilities to recognize emotion. A key point to keep in mind when learning about automated emotion recognition is that there are several sources of "ground truth", or truth about what the real emotion is. Suppose we are trying to recognize the emotions of Alex. One source is "what would most people say that Alex is feeling?" In this case, the 'truth' may not correspond to what Alex feels, but may correspond to what most people would say it looks like Alex feels. For example, Alex may actually feel sad, but he puts on a big smile and then most people say he looks happy. If an automated method achieves the same results as a group of observers it may be considered accurate, even if it does not actually measure what Alex truly feels. Another source of 'truth' is to ask Alex what he truly feels. This works if Alex has a good sense of his internal state, and wants to tell you what it is, and is capable of putting it accurately into words or a number. However, some people are alexithymic and do not have a good sense of their internal feelings, or they are not able to communicate them accurately with words and numbers. In general, getting to the truth of what emotion is actually present can take some work, can vary depending on the criteria that are selected, and will usually involve maintaining some level of uncertainty.

Automatic

Decades of scientific research have been conducted developing and evaluating methods for automated emotion recognition. There is now an extensive literature proposing and evaluating hundreds of different kinds of methods, leveraging techniques from multiple areas such as signal processing, machine learning, computer vision, and speech processing. Different methodologies and techniques may be employed to interpret emotion, such as Bayesian networks, [1] Gaussian mixture models, [2] hidden Markov models, [3] and deep neural networks. [4]

Approaches

The accuracy of emotion recognition is usually improved when it combines the analysis of human expressions from multimodal forms such as texts, physiology, audio, or video. [5] Different emotion types are detected through the integration of information from facial expressions, body movement and gestures, and speech. [6] The technology is said to contribute to the emergence of the so-called emotional or emotive Internet. [7]

The existing approaches in emotion recognition to classify certain emotion types can be generally classified into three main categories: knowledge-based techniques, statistical methods, and hybrid approaches. [8]

Knowledge-based techniques

Knowledge-based techniques (sometimes referred to as lexicon-based techniques) utilize domain knowledge and the semantic and syntactic characteristics of text and potentially spoken language in order to detect certain emotion types. [9] In this approach, it is common to use knowledge-based resources during the emotion classification process, such as WordNet, SenticNet, [10] ConceptNet, and EmotiNet, [11] to name a few. [12] One of the advantages of this approach is the accessibility and economy brought about by the wide availability of such knowledge-based resources. [8] A limitation of this technique, on the other hand, is its inability to handle concept nuances and complex linguistic rules. [8]

Knowledge-based techniques can be mainly classified into two categories: dictionary-based and corpus-based approaches.[ citation needed ] Dictionary-based approaches find opinion or emotion seed words in a dictionary and search for their synonyms and antonyms to expand the initial list of opinions or emotions. [13] Corpus-based approaches, on the other hand, start with a seed list of opinion or emotion words and expand the database by finding other words with context-specific characteristics in a large corpus. [13] While corpus-based approaches take context into account, their performance still varies across domains, since a word in one domain can have a different orientation in another. [14]
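The dictionary-based expansion described above can be sketched in a few lines of Python. The tiny synonym and antonym tables here are illustrative stand-ins; real systems consult resources such as WordNet.

```python
# Sketch of a dictionary-based approach: expand a seed list of emotion
# words via a hand-made synonym/antonym table (hypothetical entries).
SYNONYMS = {
    "happy": ["joyful", "glad"],
    "sad": ["unhappy", "sorrowful"],
}
ANTONYMS = {
    "happy": ["sad"],
    "sad": ["happy"],
}

def expand_seeds(seeds):
    """Grow a {word: polarity} seed list; synonyms keep the orientation,
    antonyms flip it to the opposite pole."""
    expanded = {}
    for word, polarity in seeds.items():
        expanded[word] = polarity
        for syn in SYNONYMS.get(word, []):
            expanded[syn] = polarity             # synonyms keep orientation
        for ant in ANTONYMS.get(word, []):
            expanded.setdefault(ant, -polarity)  # antonyms reverse it
    return expanded

lexicon = expand_seeds({"happy": +1, "sad": -1})
print(lexicon)
```

A real implementation would iterate this expansion over several rounds until the lexicon stops growing.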

Statistical methods

Statistical methods commonly involve the use of different supervised machine learning algorithms in which a large set of annotated data is fed into the algorithms for the system to learn and predict the appropriate emotion types. [8] Machine learning algorithms generally provide more reasonable classification accuracy compared to other approaches, but one of the challenges in achieving good results in the classification process, is the need to have a sufficiently large training set. [8]

Some of the most commonly used machine learning algorithms include Support Vector Machines (SVM), Naive Bayes, and Maximum Entropy. [15] Deep learning, which spans both supervised and unsupervised families of machine learning, is also widely employed in emotion recognition. [16] [17] [18] Well-known deep learning algorithms include different architectures of Artificial Neural Network (ANN) such as Convolutional Neural Network (CNN), Long Short-term Memory (LSTM), and Extreme Learning Machine (ELM). [15] The popularity of deep learning approaches in the domain of emotion recognition may be mainly attributed to their success in related applications such as computer vision, speech recognition, and Natural Language Processing (NLP). [15]
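As a minimal illustration of the supervised statistical approach, the following sketch trains a bag-of-words Naive Bayes classifier on a toy annotated corpus. The four-sentence corpus and the two emotion labels are illustrative assumptions; real systems use far larger labeled datasets and established libraries.

```python
# Minimal Naive Bayes emotion classifier over bag-of-words features,
# with add-one (Laplace) smoothing. Toy corpus for illustration only.
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (text, label) pairs. Returns per-label word
    counts and per-label document counts."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def predict(text, word_counts, label_counts):
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # log prior plus log likelihood with Laplace smoothing
        score = math.log(n_docs / total_docs)
        n_words = sum(word_counts[label].values())
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best, best_score = label, score
    return best

corpus = [("i am so happy today", "joy"), ("what a wonderful day", "joy"),
          ("i feel very sad", "sadness"), ("this is terrible and sad", "sadness")]
model = train(corpus)
print(predict("a wonderful happy day", *model))   # → joy
```

The same interface extends to the other algorithms named above by swapping the scoring function while keeping the annotated training set.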

Hybrid approaches

Hybrid approaches in emotion recognition are essentially a combination of knowledge-based techniques and statistical methods, which exploit complementary characteristics from both techniques. [8] Some of the works that have applied an ensemble of knowledge-driven linguistic elements and statistical methods include sentic computing and iFeel, both of which have adopted the concept-level knowledge-based resource SenticNet. [19] [20] The role of such knowledge-based resources in the implementation of hybrid approaches is highly important in the emotion classification process. [12] Since hybrid techniques gain from the benefits offered by both knowledge-based and statistical approaches, they tend to have better classification performance than either family of methods employed independently.[ citation needed ] A downside of hybrid techniques, however, is the added computational complexity during the classification process. [12]
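One simple way to combine the two families is a weighted blend of a lexicon score and a statistical model's confidence. Both the tiny lexicon and the stand-in "classifier" below are hypothetical placeholders used only to show the blending step.

```python
# Hypothetical hybrid sketch: blend a knowledge-based lexicon score with
# a statistical classifier's signed confidence via a weighted average.
LEXICON = {"great": 0.9, "happy": 0.8, "awful": -0.9, "sad": -0.7}

def lexicon_score(text):
    """Knowledge-based component: mean polarity of lexicon hits in [-1, 1]."""
    hits = [LEXICON[w] for w in text.lower().split() if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def statistical_score(text):
    """Stand-in for a trained model's signed confidence in [-1, 1]."""
    return 0.6 if "!" in text else 0.1

def hybrid_score(text, alpha=0.5):
    # alpha weights the knowledge-based component against the statistical one
    return alpha * lexicon_score(text) + (1 - alpha) * statistical_score(text)

print(hybrid_score("what a great happy day!"))
```

The extra pass over the lexicon on top of the statistical model is a small instance of the added computational cost noted above.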

Datasets

Data is an integral part of the existing approaches in emotion recognition, and in most cases it is a challenge to obtain the annotated data necessary to train machine learning algorithms. [13] For the task of classifying different emotion types from multimodal sources in the form of texts, audio, videos or physiological signals, the following datasets are available:

  1. HUMAINE: provides natural clips with emotion words and context labels in multiple modalities [21]
  2. Belfast database: provides clips with a wide range of emotions from TV programs and interview recordings [22]
  3. SEMAINE: provides audiovisual recordings between a person and a virtual agent and contains emotion annotations such as angry, happy, fear, disgust, sadness, contempt, and amusement [23]
  4. IEMOCAP: provides recordings of dyadic sessions between actors and contains emotion annotations such as happiness, anger, sadness, frustration, and neutral state [24]
  5. eNTERFACE: provides audiovisual recordings of subjects from seven nationalities and contains emotion annotations such as happiness, anger, sadness, surprise, disgust, and fear [25]
  6. DEAP: provides electroencephalography (EEG), electrocardiography (ECG), and face video recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips [26]
  7. DREAMER: provides electroencephalography (EEG) and electrocardiography (ECG) recordings, as well as emotion annotations in terms of valence, arousal, and dominance of people watching film clips [27]
  8. MELD: a multiparty conversational dataset in which each utterance is labeled with emotion and sentiment. MELD [28] provides conversations in video format and is hence suitable for multimodal emotion recognition and sentiment analysis, as well as for dialogue systems and emotion recognition in conversations. [29]
  9. MuSe: provides audiovisual recordings of natural interactions between a person and an object. [30] It has discrete and continuous emotion annotations in terms of valence, arousal and trustworthiness as well as speech topics useful for multimodal sentiment analysis and emotion recognition.
  10. UIT-VSMEC: is a standard Vietnamese Social Media Emotion Corpus (UIT-VSMEC) with about 6,927 human-annotated sentences with six emotion labels, contributing to emotion recognition research in Vietnamese which is a low-resource language in Natural Language Processing (NLP). [31]
  11. BED: provides valence and arousal of people watching images. It also includes electroencephalography (EEG) recordings of people exposed to various stimuli (SSVEP, resting with eyes closed, resting with eyes open, cognitive tasks) for the task of EEG-based biometrics. [32]
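Several of the datasets above (DEAP, DREAMER, MuSe) annotate emotion dimensionally rather than with discrete labels. A sketch of what one such annotated sample might look like, with illustrative field names and the common 1-9 rating scale assumed:

```python
# Sketch of a dimensionally annotated sample in the style of DEAP or
# DREAMER. Field names and the 1-9 scale are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class AnnotatedSample:
    clip_id: str
    valence: float    # unpleasant (1) .. pleasant (9)
    arousal: float    # calm (1) .. excited (9)
    dominance: float  # controlled (1) .. in control (9)

    def quadrant(self):
        """Coarse valence-arousal quadrant, a common derived label."""
        v = "positive" if self.valence >= 5 else "negative"
        a = "high" if self.arousal >= 5 else "low"
        return f"{v}-{a}"

s = AnnotatedSample("clip_01", valence=7.2, arousal=6.5, dominance=5.0)
print(s.quadrant())   # → positive-high
```

Mapping continuous ratings to coarse quadrants like this is one common way to make dimensional annotations comparable with discrete emotion labels.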

Applications

Emotion recognition is used in society for a variety of reasons. Affectiva, which spun out of MIT, provides artificial intelligence software that makes it more efficient to do tasks previously done manually by people, mainly to gather facial expression and vocal expression information related to specific contexts where viewers have consented to share this information. For example, instead of filling out a lengthy survey about how you feel at each point watching an educational video or advertisement, you can consent to have a camera watch your face and listen to what you say, and note during which parts of the experience you show expressions such as boredom, interest, confusion, or smiling. (Note that this does not imply it is reading your innermost feelings—it only reads what you express outwardly.) Other uses by Affectiva include helping children with autism, helping people who are blind to read facial expressions, helping robots interact more intelligently with people, and monitoring signs of attention while driving in an effort to enhance driver safety. [33]

Academic research increasingly uses emotion recognition as a method to study social science questions around elections, protests, and democracy. Several studies focus on the facial expressions of political candidates on social media and find that politicians tend to express happiness. [34] [35] [36] However, this research finds that computer vision tools such as Amazon Rekognition are only accurate for happiness and are mostly reliable as 'happy detectors'. [37] Researchers examining protests, where negative affect such as anger is expected, have therefore developed their own models to more accurately study expressions of negativity and violence in democratic processes. [38]

A patent filed by Snapchat in 2015 describes a method of extracting data about crowds at public events by performing algorithmic emotion recognition on users' geotagged selfies. [39]

Emotient was a startup company which applied emotion recognition to reading frowns, smiles, and other expressions on faces, using artificial intelligence to predict "attitudes and actions based on facial expressions". [40] Apple bought Emotient in 2016 and uses emotion recognition technology to enhance the emotional intelligence of its products. [40]

nViso provides real-time emotion recognition for web and mobile applications through a real-time API. [41] Visage Technologies AB offers emotion estimation as a part of their Visage SDK for marketing and scientific research and similar purposes. [42]

Eyeris is an emotion recognition company that works with embedded system manufacturers including car makers and social robotic companies on integrating its face analytics and emotion recognition software; as well as with video content creators to help them measure the perceived effectiveness of their short and long form video creative. [43] [44]

Many products also exist to aggregate information about emotions communicated online, including via "like" button presses and via counts of positive and negative phrases in text. Affect recognition is also increasingly used in some kinds of games and virtual reality, both for educational purposes and to give players more natural control over their social avatars.[ citation needed ]

Subfields

Emotion recognition is likely to yield the best results when it applies multiple modalities, combining different objects such as text (conversation), audio, video, and physiology to detect emotions.

Emotion recognition in text

Text data is a favorable research object for emotion recognition because it is free and available everywhere in human life. Compared to other types of data, text is lighter to store and can be compressed to high ratios thanks to the frequent repetition of words and characters in languages. Emotions can be extracted from two essential text forms: written texts and conversations (dialogues). [45] For written texts, many scholars focus on working at the sentence level to extract "words/phrases" representing emotions. [46] [47]
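Sentence-level extraction of emotion-bearing words can be sketched as simple keyword spotting. The three-entry emotion lexicon here is an illustrative assumption; research systems use much larger curated resources.

```python
# Sentence-level sketch: split a written text into sentences and flag
# the emotion-bearing words in each, as in keyword-spotting approaches.
import re

EMOTION_WORDS = {"delighted": "joy", "furious": "anger", "terrified": "fear"}

def sentence_emotions(text):
    """Return (sentence, sorted list of detected emotion labels) pairs."""
    results = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        found = {EMOTION_WORDS[w]
                 for w in re.findall(r"[a-z]+", sentence.lower())
                 if w in EMOTION_WORDS}
        results.append((sentence, sorted(found)))
    return results

text = "She was delighted with the gift. He was furious about the delay."
for sentence, emotions in sentence_emotions(text):
    print(sentence, "->", emotions)
```

Keyword spotting of this kind misses negation and implicit emotion, which is one reason statistical and hybrid methods are layered on top of it.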

Emotion recognition in audio

Unlike emotion recognition in text, emotion recognition in audio relies on vocal signals to extract emotions. [48]
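Two classic low-level vocal features used in speech emotion recognition are short-time energy and zero-crossing rate, computed per frame of raw samples. The frame length and the synthetic test tone below are illustrative choices.

```python
# Sketch of per-frame short-time energy and zero-crossing rate, two
# classic acoustic features for speech emotion recognition.
import math

def frame_features(samples, frame_len=160):
    """Split samples into non-overlapping frames and compute, per frame,
    mean energy and the fraction of sign changes (zero-crossing rate)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / frame_len
        feats.append((energy, zcr))
    return feats

# Synthetic 100 Hz tone sampled at 8 kHz: steady energy, low crossing rate.
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(1600)]
feats = frame_features(tone)
print(len(feats), feats[0])
```

Real systems add pitch, spectral, and cepstral features on top of these and feed the per-frame vectors to a classifier.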

Emotion recognition in video

Video data is a combination of audio data, image data and sometimes texts (in the case of subtitles [49] ).
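Because video bundles several modalities, a common strategy is late fusion: each modality is classified separately and the probability distributions are averaged. The per-modality outputs below are illustrative stand-ins for real classifiers.

```python
# Late-fusion sketch for video: combine per-modality emotion probability
# distributions (face, voice, subtitle text) with a weighted average.
def fuse(predictions, weights=None):
    """predictions: list of {emotion: prob} dicts, one per modality."""
    weights = weights or [1.0] * len(predictions)
    fused = {}
    for dist, w in zip(predictions, weights):
        for emotion, p in dist.items():
            fused[emotion] = fused.get(emotion, 0.0) + w * p
    total = sum(fused.values())
    return {e: p / total for e, p in fused.items()}  # renormalize

face  = {"happy": 0.7, "sad": 0.3}   # hypothetical classifier outputs
voice = {"happy": 0.6, "sad": 0.4}
text  = {"happy": 0.3, "sad": 0.7}
fused = fuse([face, voice, text])
print(max(fused, key=fused.get))   # → happy
```

The weights allow a more reliable modality (for instance, the face channel) to dominate the fused decision.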

Emotion recognition in conversation

Emotion recognition in conversation (ERC) extracts opinions between participants from massive conversational data in social platforms, such as Facebook, Twitter, YouTube, and others. [29] ERC can take input data like text, audio, video or a combination form to detect several emotions such as fear, lust, pain, and pleasure.
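A distinguishing feature of ERC is that each utterance is interpreted in conversational context, including the speaker's own recent emotional state. The following sketch smooths per-utterance emotion scores with an exponential moving average over each speaker's history, a simple stand-in for the speaker-state modeling used in real ERC systems; the dialogue and scores are illustrative.

```python
# ERC sketch: per-utterance emotion scores smoothed with each speaker's
# running state, so context tempers one-off fluctuations.
def smooth_conversation(utterances, decay=0.6):
    """utterances: list of (speaker, {emotion: score}) pairs.
    Returns one smoothed emotion label per utterance."""
    state = {}    # per-speaker running emotion scores
    labels = []
    for speaker, scores in utterances:
        prev = state.get(speaker, scores)
        blended = {e: decay * prev.get(e, 0.0) + (1 - decay) * s
                   for e, s in scores.items()}
        state[speaker] = blended
        labels.append(max(blended, key=blended.get))
    return labels

dialogue = [
    ("A", {"joy": 0.8, "anger": 0.2}),
    ("B", {"joy": 0.3, "anger": 0.7}),
    ("A", {"joy": 0.4, "anger": 0.6}),  # context keeps A leaning toward joy
]
print(smooth_conversation(dialogue))   # → ['joy', 'anger', 'joy']
```

Note how A's third utterance, taken alone, would be labeled anger; the conversational context flips it, which is exactly the effect (and the risk, under genuine emotion shifts) of context modeling.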

References

  1. Miyakoshi, Yoshihiro, and Shohei Kato. "Facial Emotion Detection Considering Partial Occlusion Of Face Using Bayesian Network". Computers and Informatics (2011): 96–101.
  2. Hari Krishna Vydana, P. Phani Kumar, K. Sri Rama Krishna and Anil Kumar Vuppala. "Improved emotion recognition using GMM-UBMs". 2015 International Conference on Signal Processing and Communication Engineering Systems
  3. B. Schuller, G. Rigoll M. Lang. "Hidden Markov model-based speech emotion recognition". ICME '03. Proceedings. 2003 International Conference on Multimedia and Expo, 2003.
  4. Singh, Premjeet; Saha, Goutam; Sahidullah, Md (2021). "Non-linear frequency warping using constant-Q transformation for speech emotion recognition". 2021 International Conference on Computer Communication and Informatics (ICCCI). pp. 1–4. arXiv: 2102.04029 . doi:10.1109/ICCCI50826.2021.9402569. ISBN   978-1-7281-5875-4. S2CID   231846518.
  5. Poria, Soujanya; Cambria, Erik; Bajpai, Rajiv; Hussain, Amir (September 2017). "A review of affective computing: From unimodal analysis to multimodal fusion". Information Fusion. 37: 98–125. doi:10.1016/j.inffus.2017.02.003. hdl: 1893/25490 . S2CID   205433041.
  6. Caridakis, George; Castellano, Ginevra; Kessous, Loic; Raouzaiou, Amaryllis; Malatesta, Lori; Asteriadis, Stelios; Karpouzis, Kostas (19 September 2007). "Multimodal emotion recognition from expressive faces, body gestures and speech". Artificial Intelligence and Innovations 2007: From Theory to Applications. IFIP the International Federation for Information Processing. Vol. 247. pp. 375–388. doi:10.1007/978-0-387-74161-1_41. ISBN   978-0-387-74160-4.
  7. Price (23 August 2015). "Tapping Into The Emotional Internet". TechCrunch. Retrieved 12 December 2018.
  8. Cambria, Erik (March 2016). "Affective Computing and Sentiment Analysis". IEEE Intelligent Systems. 31 (2): 102–107. doi:10.1109/MIS.2016.31. S2CID   18580557.
  9. Taboada, Maite; Brooke, Julian; Tofiloski, Milan; Voll, Kimberly; Stede, Manfred (June 2011). "Lexicon-Based Methods for Sentiment Analysis". Computational Linguistics. 37 (2): 267–307. doi: 10.1162/coli_a_00049 . ISSN   0891-2017.
  10. Cambria, Erik; Liu, Qian; Decherchi, Sergio; Xing, Frank; Kwok, Kenneth (2022). "SenticNet 7: A Commonsense-based Neurosymbolic AI Framework for Explainable Sentiment Analysis" (PDF). Proceedings of LREC. pp. 3829–3839.
  11. Balahur, Alexandra; Hermida, JesúS M; Montoyo, AndréS (1 November 2012). "Detecting implicit expressions of emotion in text: A comparative analysis". Decision Support Systems. 53 (4): 742–753. doi:10.1016/j.dss.2012.05.024. ISSN   0167-9236.
  12. Medhat, Walaa; Hassan, Ahmed; Korashy, Hoda (December 2014). "Sentiment analysis algorithms and applications: A survey". Ain Shams Engineering Journal. 5 (4): 1093–1113. doi: 10.1016/j.asej.2014.04.011 .
  13. Madhoushi, Zohreh; Hamdan, Abdul Razak; Zainudin, Suhaila (2015). "Sentiment analysis techniques in recent works". 2015 Science and Information Conference (SAI). pp. 288–291. doi:10.1109/SAI.2015.7237157. ISBN   978-1-4799-8547-0. S2CID   14821209.
  14. Hemmatian, Fatemeh; Sohrabi, Mohammad Karim (18 December 2017). "A survey on classification techniques for opinion mining and sentiment analysis". Artificial Intelligence Review. 52 (3): 1495–1545. doi:10.1007/s10462-017-9599-6. S2CID   11741285.
  15. Sun, Shiliang; Luo, Chen; Chen, Junyu (July 2017). "A review of natural language processing techniques for opinion mining systems". Information Fusion. 36: 10–25. doi:10.1016/j.inffus.2016.10.004.
  16. Majumder, Navonil; Poria, Soujanya; Gelbukh, Alexander; Cambria, Erik (March 2017). "Deep Learning-Based Document Modeling for Personality Detection from Text". IEEE Intelligent Systems. 32 (2): 74–79. doi:10.1109/MIS.2017.23. S2CID   206468984.
  17. Mahendhiran, P. D.; Kannimuthu, S. (May 2018). "Deep Learning Techniques for Polarity Classification in Multimodal Sentiment Analysis". International Journal of Information Technology & Decision Making. 17 (3): 883–910. doi:10.1142/S0219622018500128.
  18. Yu, Hongliang; Gui, Liangke; Madaio, Michael; Ogan, Amy; Cassell, Justine; Morency, Louis-Philippe (23 October 2017). "Temporally Selective Attention Model for Social and Affective State Recognition in Multimedia Content". Proceedings of the 25th ACM international conference on Multimedia. MM '17. ACM. pp. 1743–1751. doi:10.1145/3123266.3123413. ISBN   9781450349062. S2CID   3148578.
  19. Cambria, Erik; Hussain, Amir (2015). Sentic Computing: A Common-Sense-Based Framework for Concept-Level Sentiment Analysis. Springer Publishing Company, Incorporated. ISBN   978-3319236537.
  20. Araújo, Matheus; Gonçalves, Pollyanna; Cha, Meeyoung; Benevenuto, Fabrício (7 April 2014). "IFeel: A system that compares and combines sentiment analysis methods". Proceedings of the 23rd International Conference on World Wide Web. WWW '14 Companion. ACM. pp. 75–78. doi:10.1145/2567948.2577013. ISBN   9781450327459. S2CID   11018367.
  21. Paolo Petta; Catherine Pelachaud; Roddy Cowie, eds. (2011). Emotion-oriented systems the humaine handbook. Berlin: Springer. ISBN   978-3-642-15184-2.
  22. Douglas-Cowie, Ellen; Campbell, Nick; Cowie, Roddy; Roach, Peter (1 April 2003). "Emotional speech: towards a new generation of databases". Speech Communication. 40 (1–2): 33–60. CiteSeerX   10.1.1.128.3991 . doi:10.1016/S0167-6393(02)00070-5. ISSN   0167-6393. S2CID   6421586.
  23. McKeown, G.; Valstar, M.; Cowie, R.; Pantic, M.; Schroder, M. (January 2012). "The SEMAINE Database: Annotated Multimodal Records of Emotionally Colored Conversations between a Person and a Limited Agent". IEEE Transactions on Affective Computing. 3 (1): 5–17. doi:10.1109/T-AFFC.2011.20. S2CID   2995377.
  24. Busso, Carlos; Bulut, Murtaza; Lee, Chi-Chun; Kazemzadeh, Abe; Mower, Emily; Kim, Samuel; Chang, Jeannette N.; Lee, Sungbok; Narayanan, Shrikanth S. (5 November 2008). "IEMOCAP: interactive emotional dyadic motion capture database". Language Resources and Evaluation. 42 (4): 335–359. doi:10.1007/s10579-008-9076-6. ISSN   1574-020X. S2CID   11820063.
  25. Martin, O.; Kotsia, I.; Macq, B.; Pitas, I. (3 April 2006). "The eNTERFACE'05 Audio-Visual Emotion Database". 22nd International Conference on Data Engineering Workshops (ICDEW'06). Icdew '06. IEEE Computer Society. pp. 8–. doi:10.1109/ICDEW.2006.145. ISBN   9780769525716. S2CID   16185196.
  26. Koelstra, Sander; Muhl, Christian; Soleymani, Mohammad; Lee, Jong-Seok; Yazdani, Ashkan; Ebrahimi, Touradj; Pun, Thierry; Nijholt, Anton; Patras, Ioannis (January 2012). "DEAP: A Database for Emotion Analysis Using Physiological Signals". IEEE Transactions on Affective Computing. 3 (1): 18–31. CiteSeerX   10.1.1.593.8470 . doi:10.1109/T-AFFC.2011.15. ISSN   1949-3045. S2CID   206597685.
  27. Katsigiannis, Stamos; Ramzan, Naeem (January 2018). "DREAMER: A Database for Emotion Recognition Through EEG and ECG Signals From Wireless Low-cost Off-the-Shelf Devices" (PDF). IEEE Journal of Biomedical and Health Informatics. 22 (1): 98–107. doi:10.1109/JBHI.2017.2688239. ISSN   2168-2194. PMID   28368836. S2CID   23477696. Archived from the original (PDF) on 1 November 2022. Retrieved 1 October 2019.
  28. Poria, Soujanya; Hazarika, Devamanyu; Majumder, Navonil; Naik, Gautam; Cambria, Erik; Mihalcea, Rada (2019). "MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversations". Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Stroudsburg, PA, USA: Association for Computational Linguistics: 527–536. arXiv: 1810.02508 . doi:10.18653/v1/p19-1050. S2CID   52932143.
  29. Poria, Soujanya; Majumder, Navonil; Mihalcea, Rada; Hovy, Eduard (2019). "Emotion Recognition in Conversation: Research Challenges, Datasets, and Recent Advances". IEEE Access. 7: 100943–100953.
  30. Stappen, Lukas; Schuller, Björn; Lefter, Iulia; Cambria, Erik; Kompatsiaris, Ioannis (2020). "Summary of MuSe 2020: Multimodal Sentiment Analysis, Emotion-target Engagement and Trustworthiness Detection in Real-life Media". Proceedings of the 28th ACM International Conference on Multimedia. Seattle, WA, USA: Association for Computing Machinery. pp. 4769–4770. arXiv:2004.14858. doi:10.1145/3394171.3421901. ISBN 9781450379885. S2CID 222278714.
  31. Ho, Vong (2020). "Emotion Recognition for Vietnamese Social Media Text". Computational Linguistics. Communications in Computer and Information Science. Vol. 1215. pp. 319–333. arXiv:1911.09339. doi:10.1007/978-981-15-6168-9_27. ISBN 978-981-15-6167-2. S2CID 208202333.
  32. Arnau-González, Pablo; Katsigiannis, Stamos; Arevalillo-Herráez, Miguel; Ramzan, Naeem (February 2021). "BED: A new dataset for EEG-based biometrics". IEEE Internet of Things Journal. (Early Access) (15): 1. doi:10.1109/JIOT.2021.3061727. ISSN 2327-4662. S2CID 233916681.
  33. "Affectiva".
  34. Bossetta, Michael; Schmøkel, Rasmus (2 January 2023). "Cross-Platform Emotions and Audience Engagement in Social Media Political Campaigning: Comparing Candidates' Facebook and Instagram Images in the 2020 US Election". Political Communication. 40 (1): 48–68. doi:10.1080/10584609.2022.2128949. ISSN 1058-4609.
  35. Peng, Yilang (January 2021). "What Makes Politicians' Instagram Posts Popular? Analyzing Social Media Strategies of Candidates and Office Holders with Computer Vision". The International Journal of Press/Politics. 26 (1): 143–166. doi:10.1177/1940161220964769. ISSN 1940-1612.
  36. Haim, Mario; Jungblut, Marc (15 March 2021). "Politicians' Self-depiction and Their News Portrayal: Evidence from 28 Countries Using Visual Computational Analysis". Political Communication. 38 (1–2): 55–74. doi:10.1080/10584609.2020.1753869. ISSN 1058-4609.
  37. Bossetta, Michael; Schmøkel, Rasmus (2 January 2023). "Cross-Platform Emotions and Audience Engagement in Social Media Political Campaigning: Comparing Candidates' Facebook and Instagram Images in the 2020 US Election". Political Communication. 40 (1): 48–68. doi:10.1080/10584609.2022.2128949. ISSN 1058-4609.
  38. Won, Donghyeon; Steinert-Threlkeld, Zachary C.; Joo, Jungseock (19 October 2017). "Protest Activity Detection and Perceived Violence Estimation from Social Media Images". Proceedings of the 25th ACM international conference on Multimedia. MM '17. New York, NY, USA: Association for Computing Machinery. pp. 786–794. arXiv:1709.06204. doi:10.1145/3123266.3123282. ISBN 978-1-4503-4906-2.
  39. Bushwick, Sophie. "This Video Watches You Back". Scientific American. Retrieved 27 January 2020.
  40. DeMuth Jr., Chris (8 January 2016). "Apple Reads Your Mind". M&A Daily. Seeking Alpha. Retrieved 9 January 2016.
  41. "nViso". nViso.ch.
  42. "Visage Technologies".
  43. "Feeling sad, angry? Your future car will know".
  44. Varagur, Krithika (22 March 2016). "Cars May Soon Warn Drivers Before They Nod Off". Huffington Post.
  45. Shivhare, S. N.; Khethawat, S. (2012). "Emotion Detection from Text". arXiv:1205.4944.
  46. Ezhilarasi, R.; Minu, R. I. (2012). "Automatic Emotion Recognition and Classification". Procedia Engineering. 38: 21–26.
  47. Krcadinac, U.; Pasquier, P.; Jovanovic, J.; Devedzic, V. (2013). "Synesketch: An Open Source Library for Sentence-Based Emotion Recognition". IEEE Transactions on Affective Computing. 4 (3): 312–325.
  48. Schmitt, M.; Ringeval, F.; Schuller, B. W. (September 2016). "At the Border of Acoustics and Linguistics: Bag-of-Audio-Words for the Recognition of Emotions in Speech". Interspeech 2016. pp. 495–499.
  49. Dhall, A.; Goecke, R.; Lucey, S.; Gedeon, T. (2012). "Collecting Large, Richly Annotated Facial-Expression Databases from Movies". IEEE MultiMedia. (3): 34–41.