List of facial expression databases

A facial expression database is a collection of images or video clips with facial expressions of a range of emotions. Well-annotated (emotion-tagged) media content of facial behavior is essential for training, testing, and validating algorithms for expression recognition systems. Emotion annotation can be done with discrete emotion labels or on a continuous scale. Most databases are based on the basic emotions theory (by Paul Ekman), which assumes the existence of six discrete basic emotions (anger, fear, disgust, surprise, joy, sadness). However, some databases tag emotions on a continuous arousal-valence scale.
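
To make the two annotation schemes concrete, the following is a minimal Python sketch of what a discretely labelled sample and a continuously labelled (arousal-valence) sample might look like. The class names, field names, and file names are illustrative assumptions, not the format of any particular database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BasicEmotion(Enum):
    """The six basic emotions of Ekman's theory, plus a neutral class."""
    ANGER = "anger"
    FEAR = "fear"
    DISGUST = "disgust"
    SURPRISE = "surprise"
    JOY = "joy"
    SADNESS = "sadness"
    NEUTRAL = "neutral"


@dataclass
class AnnotatedSample:
    """One annotated image or frame: a discrete label, a continuous
    valence-arousal pair (commonly scaled to [-1, 1]), or both."""
    media_path: str
    emotion: Optional[BasicEmotion] = None
    valence: Optional[float] = None
    arousal: Optional[float] = None


# Discrete annotation, as in most posed databases.
posed_sample = AnnotatedSample("subject01_anger.png", emotion=BasicEmotion.ANGER)

# Continuous annotation, as in arousal-valence-tagged databases.
wild_sample = AnnotatedSample("frame_000123.png", valence=-0.4, arousal=0.7)
```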

In posed expression databases, the participants are asked to display different basic emotional expressions, while in spontaneous expression databases, the expressions are natural. Spontaneous expressions differ markedly from posed ones in intensity, configuration, and duration. Apart from this, synthesis of some action units (AUs) is barely achievable without undergoing the associated emotional state. Therefore, in most cases, posed expressions are exaggerated, while spontaneous ones are subtle and differ in appearance.

Many publicly available databases are categorized here. [1] [2] The table below gives details of these facial expression databases.

| Database | Facial expressions | Number of subjects | Number of images/videos | Gray/Color | Resolution, frame rate | Ground truth | Type |
|---|---|---|---|---|---|---|---|
| FERG-3D-DB (Facial Expression Research Group 3D Database) for stylized characters [3] | angry, disgust, fear, joy, neutral, sad, surprise | 4 | 39574 annotated examples | Color | | Emotion labels | Frontal pose |
| Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) [4] | Speech: calm, happy, sad, angry, fearful, surprise, disgust, and neutral. Song: calm, happy, sad, angry, fearful, and neutral. Each expression at two levels of emotional intensity. | 24 | 7356 video and audio files | Color | 1280x720 (720p) | Facial expression labels; ratings provided by 319 human raters | Posed |
| Extended Cohn-Kanade Dataset (CK+) [5] | neutral, sadness, surprise, happiness, fear, anger, contempt and disgust | 123 | 593 image sequences (327 sequences having discrete emotion labels) | Mostly gray | 640x490 | Facial expression labels and FACS (AU label for final frame in each image sequence) | Posed; spontaneous smiles |
| Japanese Female Facial Expressions (JAFFE) [6] | neutral, sadness, surprise, happiness, fear, anger, and disgust | 10 | 213 static images | Gray | 256x256 | Facial expression labels | Posed |
| MMI Database [7] | | 43 | 1280 videos and over 250 images | Color | 720x576 | AU label for the image frame with apex facial expression in each image sequence | Posed and spontaneous |
| Belfast Database [8], Set 1 | disgust, fear, amusement, frustration, surprise | 114 | 570 video clips | Color | 720x576 | | Natural emotion |
| Belfast Database [8], Set 2 | disgust, fear, amusement, frustration, surprise, anger, sadness | 82 | 650 video clips | Color | | | Natural emotion |
| Belfast Database [8], Set 3 | disgust, fear, amusement | 60 | 180 video clips | Color | 1920x1080 | | Natural emotion |
| Indian Semi-Acted Facial Expression Database (iSAFE) [9] | happy, sad, fear, surprise, angry, neutral, disgust | 44 | 395 clips | Color | 1920x1080, 60 fps | Emotion labels | Spontaneous |
| DISFA [10] | - | 27 | 4,845 video frames | Color | 1024x768; 20 fps | AU intensity for each video frame (12 AUs) | Spontaneous |
| Multimedia Understanding Group (MUG) [11] | neutral, sadness, surprise, happiness, fear, anger, and disgust | 86 | 1462 sequences | Color | 896x896, 19 fps | Emotion labels | Posed |
| Indian Spontaneous Expression Database (ISED) [12] | sadness, surprise, happiness, and disgust | 50 | 428 videos | Color | 1920x1080, 50 fps | Emotion labels | Spontaneous |
| Radboud Faces Database (RaFD) [13] | neutral, sadness, contempt, surprise, happiness, fear, anger, and disgust | 67 | Three gaze directions and five camera angles (8 x 67 x 3 x 5 = 8040 images) | Color | 681x1024 | Emotion labels | Posed |
| Oulu-CASIA NIR-VIS database | surprise, happiness, sadness, anger, fear and disgust | 80 | Three illumination conditions: normal, weak and dark (2880 video sequences in total) | Color | 320x240 | | Posed |
| FERG-DB (Facial Expression Research Group Database) for stylized characters [14] | angry, disgust, fear, joy, neutral, sad, surprise | 6 | 55767 | Color | 768x768 | Emotion labels | Frontal pose |
| AffectNet [15] | neutral, happy, sad, surprise, fear, disgust, anger, contempt | | ~450,000 manually annotated; ~500,000 automatically annotated | Color | Various | Emotion labels, valence, arousal | Wild setting |
| IMPA-FACE3D [16] | neutral frontal, joy, sadness, surprise, anger, disgust, fear, opened, closed, kiss, left side, right side, neutral sagittal left, neutral sagittal right, nape and forehead (acquired sometimes) | 38 | 534 static images | Color | 640x480 | Emotion labels | Posed |
| FEI Face Database | neutral, smile | 200 | 2800 static images | Color | 640x480 | Emotion labels | Posed |
| Aff-Wild [17] [18] | valence and arousal | 200 | ~1,250,000 manually annotated | Color | Various (average = 640x360) | Valence, arousal | In-the-wild setting |
| Aff-Wild2 [19] [20] | neutral, happiness, sadness, surprise, fear, disgust, anger + valence-arousal + action units 1, 2, 4, 6, 12, 15, 20, 25 | 458 | ~2,800,000 manually annotated | Color | Various (average = 1030x630) | Valence, arousal, 7 basic expressions, action units for each video frame | In-the-wild setting |
| Real-world Affective Faces Database (RAF-DB) [21] [22] | 6 classes of basic emotions (Surprised, Fear, Disgust, Happy, Sad, Angry) plus Neutral and 12 classes of compound emotions (Fearfully Surprised, Fearfully Disgusted, Sadly Angry, Sadly Fearful, Angrily Disgusted, Angrily Surprised, Sadly Disgusted, Disgustedly Surprised, Happily Surprised, Sadly Surprised, Fearfully Angry, Happily Disgusted) | | 29672 annotated examples | Color | Various for original dataset; 100x100 for aligned dataset | Emotion labels | Posed and spontaneous |
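
As a rough illustration of how a posed, emotion-labelled image database like those above might be consumed in code, the sketch below walks a hypothetical directory layout in which images are grouped into one folder per emotion. The folder names, file extension, and function name are assumptions for the example, not the distribution format of any database listed here.

```python
from pathlib import Path
from typing import Iterator, Tuple

# Hypothetical emotion folder names; adjust to the label set of the database in use.
EMOTIONS = ["anger", "disgust", "fear", "happiness", "neutral", "sadness", "surprise"]


def iter_image_label_pairs(root: str) -> Iterator[Tuple[Path, str]]:
    """Yield (image_path, emotion_label) pairs from a dataset arranged as
    <root>/<emotion>/<image>.png. The layout is an assumption for this sketch."""
    root_path = Path(root)
    for emotion in EMOTIONS:
        emotion_dir = root_path / emotion
        if not emotion_dir.is_dir():
            continue
        for image_path in sorted(emotion_dir.glob("*.png")):
            yield image_path, emotion


if __name__ == "__main__":
    # Example usage with a hypothetical local copy of a posed database.
    for path, label in iter_image_label_pairs("data/expressions"):
        print(path, label)
```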

References

  1. "collection of emotional databases". Archived from the original on 2018-03-25.
  2. "facial expression databases".
  3. Aneja, Deepali, et al. "Learning to generate 3D stylized character expressions from humans." 2018 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2018.
  4. Livingstone & Russo (2018). The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. doi:10.1371/journal.pone.0196391
  5. P. Lucey, J. F. Cohn, T. Kanade, J. Saragih, Z. Ambadar and I. Matthews, "The Extended Cohn-Kanade Dataset (CK+): A complete facial expression dataset for action unit and emotion-specified expression," in 3rd IEEE Workshop on CVPR for Human Communicative Behavior Analysis, 2010
  6. Lyons, Michael; Kamachi, Miyuki; Gyoba, Jiro (1998). The Japanese Female Facial Expression (JAFFE) Database. doi:10.5281/zenodo.3451524.
  7. M. Valstar and M. Pantic, "Induced disgust, happiness and surprise: an addition to the MMI facial expression database," in Proc. Int. Conf. Language Resources and Evaluation, 2010
  8. I. Sneddon, M. McRorie, G. McKeown and J. Hanratty, "The Belfast induced natural emotion database," IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 32-41, 2012
  9. Singh, Shivendra; Benedict, Shajulin (2020). "Indian Semi-Acted Facial Expression (ISAFE) Dataset for Human Emotions Recognition". In Thampi, Sabu M.; Hegde, Rajesh M.; Krishnan, Sri; Mukhopadhyay, Jayanta; Chaudhary, Vipin; Marques, Oge; Piramuthu, Selwyn; Corchado, Juan M. (eds.). Advances in Signal Processing and Intelligent Recognition Systems. Communications in Computer and Information Science. Vol. 1209. Singapore: Springer. pp. 150–162. doi:10.1007/978-981-15-4828-4_13. ISBN 978-981-15-4828-4.
  10. S. M. Mavadati, M. H. Mahoor, K. Bartlett, P. Trinh and J. Cohn., "DISFA: A Spontaneous Facial Action Intensity Database," IEEE Trans. Affective Computing, vol. 4, no. 2, pp. 151–160, 2013
  11. N. Aifanti, C. Papachristou and A. Delopoulos, The MUG Facial Expression Database, in Proc. 11th Int. Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Desenzano, Italy, April 12–14, 2010.
  12. S L Happy, P. Patnaik, A. Routray, and R. Guha, "The Indian Spontaneous Expression Database for Emotion Recognition," in IEEE Transactions on Affective Computing, 2016, doi:10.1109/TAFFC.2015.2498174.
  13. Langner, O., Dotsch, R., Bijlstra, G., Wigboldus, D.H.J., Hawk, S.T., & van Knippenberg, A. (2010). Presentation and validation of the Radboud Faces Database. Cognition & Emotion, 24(8), 1377–1388. doi:10.1080/02699930903485076
  14. "Facial Expression Research Group Database (FERG-DB)". grail.cs.washington.edu. Retrieved 2016-12-06.
  15. Mollahosseini, A.; Hasani, B.; Mahoor, M. H. (2017). "AffectNet: A Database for Facial Expression, Valence, and Arousal Computing in the Wild". IEEE Transactions on Affective Computing. PP (99): 18–31. arXiv:1708.03985. doi:10.1109/TAFFC.2017.2740923. ISSN 1949-3045. S2CID 37515850.
  16. "IMPA-FACE3D Technical Reports". visgraf.impa.br. Retrieved 2018-03-08.
  17. Zafeiriou, S.; Kollias, D.; Nicolaou, M.A.; Papaioannou, A.; Zhao, G.; Kotsia, I. (2017). "Aff-Wild: Valence and Arousal 'In-the-Wild' Challenge" (PDF). 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). pp. 1980–1987. doi:10.1109/CVPRW.2017.248. ISBN 978-1-5386-0733-6. S2CID 3107614.
  18. Kollias, D.; Tzirakis, P.; Nicolaou, M.A.; Papaioannou, A.; Zhao, G.; Schuller, B.; Kotsia, I.; Zafeiriou, S. (2019). "Deep Affect Prediction in-the-wild: Aff-Wild Database and Challenge, Deep Architectures, and Beyond". International Journal of Computer Vision. 127 (6–7): 907–929. arXiv:1804.10938. doi:10.1007/s11263-019-01158-4. S2CID 13679040.
  19. Kollias, D.; Zafeiriou, S. (2019). "Expression, affect, action unit recognition: Aff-wild2, multi-task learning and arcface" (PDF). British Machine Vision Conference (BMVC), 2019. arXiv:1910.04855.
  20. Kollias, D.; Schulc, A.; Hajiyev, E.; Zafeiriou, S. (2020). "Analysing Affective Behavior in the First ABAW 2020 Competition". 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020). pp. 637–643. arXiv:2001.11409. doi:10.1109/FG47880.2020.00126. ISBN 978-1-7281-3079-8. S2CID 210966051.
  21. Li, S. "RAF-DB". Real-world Affective Faces Database.
  22. Li, S.; Deng, W.; Du, J. (2017). "Reliable Crowdsourcing and Deep Locality-Preserving Learning for Expression Recognition in the Wild". 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 2584–2593. doi:10.1109/CVPR.2017.277. ISBN 978-1-5386-0457-1. S2CID 11413183.