Irfan Essa

Alma mater MIT
Known for facial recognition, video stabilization, computational photography, computational journalism
Scientific career
Fields Computer vision, computational journalism, machine learning, computer graphics, robotics
Institutions Georgia Tech
GVU Center
Thesis Analysis, interpretation and synthesis of facial expressions  (1995)
Doctoral advisor Alex Pentland
Website prof.irfanessa.com

Irfan Aziz Essa is a professor in the School of Interactive Computing of the College of Computing and an adjunct professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology (Georgia Tech). He is an associate dean in Georgia Tech's College of Computing [1] and the director of the Interdisciplinary Research Center for Machine Learning at Georgia Tech (ML@GT). [2]

Education

Essa obtained his undergraduate degree in engineering at the Illinois Institute of Technology in 1988. [3] He then attended the Massachusetts Institute of Technology, where he received his Master of Science in 1990 and his Ph.D. in 1995 at the MIT Media Lab. His doctoral research focused on implementing a system to detect emotions from changes in facial expression; the work was later featured in The New York Times. [4] He held a position as a research scientist at MIT from 1994 to 1996 before accepting a position at Georgia Tech.

Professional career

Essa's work focuses mainly on computer vision, computational photography, computer graphics and animation, robotics, computational perception, human-computer interaction, machine learning, computational journalism and artificial intelligence.

After leaving MIT, Essa accepted a position as an assistant professor in the College of Computing at Georgia Tech. He is now a professor there and continues his research alongside his teaching.

Essa has taught courses on digital video special effects, computer vision, computational journalism and computational photography. [5] In the spring of 2013, he taught a free online course on computational photography on the MOOC platform Coursera. [6] He is affiliated with the GVU Center and RIM@GT, and is one of the faculty members of the Computational Perception Laboratory at Georgia Tech.

Essa also organized the Computational Journalism Symposium in both 2008 and 2013. [7] He is credited, alongside his doctoral student Nick Diakopoulos, with coining the term computational journalism in 2006, when they taught the first class on the subject. [8]

Essa has also worked as a researcher and consultant with Google, developing a video stabilization algorithm alongside two of his doctoral students, Matthias Grundmann and Vivek Kwatra. The algorithm runs on YouTube and allows users to stabilize their uploaded videos in real time. [9]

References

  1. Georgia Tech Directory for Irfan Essa
  2. "Georgia Tech Machine Learning Center Directory". Archived from the original on 2017-02-02. Retrieved 2017-01-25.
  3. Essa's MIT Alumni Page
  4. Goleman, Daniel (7 January 1997). "Laugh and Your Computer Will Laugh With You, Someday". The New York Times. Archived from the original on 2015-10-16.
  5. Classes Taught by Professor Essa
  6. Computational Photography by Irfan Essa
  7. Georgia Tech Explores the Digital Future of Journalism
  8. "Computational Journalism Seminar". Archived from the original on 2013-06-26. Retrieved 2013-04-29.
  9. Video Stabilization - Google Research