Irfan Essa

Alma mater	MIT
Known for	facial recognition, video stabilization, computational photography, computational journalism
	Scientific career
Fields	Computer vision, computational journalism, machine learning, computer graphics, robotics
Institutions	Georgia Tech ; GVU Center
Thesis	Analysis, interpretation and synthesis of facial expressions (1995)
Doctoral advisor	Alex Pentland
Website	prof.irfanessa.com

Last updated January 06, 2025

Irfan Aziz Essa is a professor in the School of Interactive Computing of the College of Computing, and adjunct professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology (Georgia Tech). He is an associate dean in Georgia Tech's College of Computing^[1] and the director of the new Interdisciplinary Research Center for Machine Learning at Georgia Tech (ML@GT).^[2]

Education

Essa received his undergraduate degree in engineering from the Illinois Institute of Technology in 1988.^[3] He then attended the Massachusetts Institute of Technology, where he received his Master of Science in 1990 and his Ph.D. in 1995 at the MIT Media Lab. His doctoral research focused on a system that detects emotions from changes in facial expression; the work was later featured in The New York Times.^[4] He was a research scientist at MIT from 1994 to 1996 before accepting a position at Georgia Tech.

Professional career

Essa's work focuses mainly on computer vision, computational photography, computer graphics and animation, robotics, computational perception, human-computer interaction, machine learning, computational journalism and artificial intelligence.

After leaving MIT, Essa joined the College of Computing at Georgia Tech as an assistant professor. He is now a professor there and continues his research alongside his teaching.

Essa has taught various courses over the years on digital video special effects, computer vision, computational journalism and computational photography.^[5] In the spring of 2013, Essa taught a free online course on computational photography, on the MOOC platform Coursera.^[6] He is affiliated with the GVU Center and RIM@GT, and is one of the faculty members of the Computational Perception Laboratory at Georgia Tech.

Essa also organized the Computational Journalism Symposium in both 2008 and 2013.^[7] He is credited, alongside his doctoral student Nick Diakopoulos, with coining the term computational journalism in 2006, when they taught the first class on the subject.^[8]

More recently, Essa has worked as a researcher and consultant with Google, developing a video stabilization algorithm with two of his doctoral students, Matthias Grundmann and Vivek Kwatra. The algorithm runs on YouTube and allows users to stabilize their uploaded videos in real time.^[9]

Selected bibliography

Kwatra, Vivek, Arno Schödl, Irfan Essa, Greg Turk, and Aaron Bobick. "Graphcut textures: image and video synthesis using graph cuts." In ACM Transactions on Graphics, vol. 22, no. 3, pp. 277–286. ACM, 2003.
Kidd, Cory D., Robert Orr, Gregory D. Abowd, Christopher G. Atkeson, Irfan A. Essa, Blair MacIntyre, Elizabeth Mynatt, Thad E. Starner, and Wendy Newstetter. "The aware home: A living laboratory for ubiquitous computing research." In Cooperative buildings. Integrating information, organizations, and architecture, pp. 191–198. Springer Berlin Heidelberg, 1999.
Essa, Irfan A., and Alex Paul Pentland. "Coding, analysis, interpretation, and recognition of facial expressions." IEEE Transactions on Pattern Analysis and Machine Intelligence 19, no. 7 (1997): 757–763.
Schödl, Arno, Richard Szeliski, David H. Salesin, and Irfan Essa. "Video textures." In Proceedings of the 27th annual conference on Computer graphics and interactive techniques, pp. 489–498. ACM Press/Addison-Wesley Publishing Co., 2000.
Kwatra, Vivek, Irfan Essa, Aaron Bobick, and Nipun Kwatra. "Texture optimization for example-based synthesis." In ACM Transactions on Graphics, vol. 24, no. 3, pp. 795–802. ACM, 2005.
Essa, Irfan Aziz, and Alex P. Pentland. "Facial expression recognition using a dynamic model and motion energy." In Computer Vision, 1995. Proceedings., Fifth International Conference on, pp. 360–367. IEEE, 1995.
Moore, Darnell J., Irfan A. Essa, and Monson H. Hayes III. "Exploiting human actions and object context for recognition tasks." In Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on, vol. 1, pp. 80–86. IEEE, 1999.
Basu, Sumit, Irfan Essa, and Alex Pentland. "Motion regularization for model-based head tracking." In Pattern Recognition, 1996., Proceedings of the 13th International Conference on, vol. 3, pp. 611–616. IEEE, 1996.
Haro, Antonio, Myron Flickner, and Irfan Essa. "Detecting and tracking eyes by using their physiological properties, dynamics, and appearance." In Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on, vol. 1, pp. 163–168. IEEE, 2000.
Mynatt, Elizabeth D., Irfan Essa, and Wendy Rogers. "Increasing the opportunities for aging in place." In Proceedings on the 2000 conference on Universal Usability, pp. 65–71. ACM, 2000.
Grundmann, Matthias, Vivek Kwatra, Mei Han, and Irfan Essa. "Efficient hierarchical graph-based video segmentation." In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on, pp. 2141–2148. IEEE, 2010.
Grundmann, Matthias, Vivek Kwatra, and Irfan Essa. "Auto-directed video stabilization with robust L1 optimal camera paths." In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pp. 225–232. IEEE, 2011.

References

[1] Georgia Tech Directory for Irfan Essa
[2] "Georgia Tech Machine Learning Center Directory". Archived from the original on 2017-02-02. Retrieved 2017-01-25.
[3] Essa's MIT Alumni Page
[4] Goleman, Daniel (7 January 1997). "Laugh and Your Computer Will Laugh With You, Someday (Published 1997)". The New York Times. Archived from the original on 2015-10-16.
[5] Classes Taught by Professor Essa
[6] Computational Photography by Irfan Essa
[7] Georgia Tech Explores the Digital Future of Journalism
[8] "Computational Journalism Seminar". Archived from the original on 2013-06-26. Retrieved 2013-04-29.
[9] Video Stabilization - Google Research

