Paul Viola | |
---|---|
Born | |
Nationality | American |
Alma mater | Massachusetts Institute of Technology |
Known for | Computer Vision and Facial Recognition |
Awards | Marr Prize (2003), Helmholtz Prize (2013) |
Scientific career | |
Fields | Computer Science |
Institutions | Massachusetts Institute of Technology, Microsoft, Amazon, Zoox |
Thesis | Alignment by Maximization of Mutual Information (1995) |
Doctoral advisors | Christopher G. Atkeson, Tomas Lozano-Perez |
Paul Viola is a computer vision researcher and Distinguished Engineer at Zoox. He is a former MIT professor, a former vice president of science for Amazon Prime Air, and a former Distinguished Engineer at Microsoft. [1] [2] He is best known for his seminal work in facial recognition and machine learning, and he co-invented the Viola–Jones object detection framework with Michael Jones. [3] [4] He won the Marr Prize in 2003 and the Helmholtz Prize from the International Conference on Computer Vision in 2013. [5] He holds at least 57 patents in advanced machine learning, web search, data mining, and image processing. [6] He has authored more than 50 academic research papers, which have received over 56,000 citations. [7]
Computer vision tasks include methods for acquiring, processing, analyzing, and understanding digital images, and extraction of high-dimensional data from the real world in order to produce numerical or symbolic information, e.g. in the form of decisions. "Understanding" in this context signifies the transformation of visual images into descriptions of the world that make sense to thought processes and can elicit appropriate action. This image understanding can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and learning theory.
In machine learning (ML), boosting is an ensemble metaheuristic used primarily to reduce bias. It can also improve the stability and accuracy of ML classification and regression algorithms. It is therefore widely used in supervised learning to convert weak learners into strong learners.
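The idea of combining weak learners into a strong one can be illustrated with a minimal sketch of AdaBoost, one specific boosting algorithm chosen here for illustration. The weak learner is assumed to be a one-dimensional threshold rule (a decision stump), and the toy data and all function names are hypothetical.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: a 1-D threshold rule that returns +1 or -1."""
    return polarity if x >= threshold else -polarity


def best_stump(xs, ys, weights):
    """Return (error, threshold, polarity) with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train a weighted ensemble of stumps (illustrative AdaBoost sketch)."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy labels that no single threshold rule can separate.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
single_errors = sum(predict(single, x) != y for x, y in zip(xs, ys))
boosted_errors = sum(predict(boosted, x) != y for x, y in zip(xs, ys))
```

On this toy set a single stump misclassifies three of the ten points, while the three-round ensemble classifies all of them correctly, because each round reweights the samples the previous stumps got wrong.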
Haar-like features are digital image features used in object recognition. They owe their name to their intuitive similarity with Haar wavelets and were used in the first real-time face detector.
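As a rough sketch of how such a feature might be computed, the example below evaluates a two-rectangle Haar-like feature with an integral image (summed-area table), the technique that makes rectangle sums cheap enough for real-time use. The function names and the 4x4 toy image are assumptions made for illustration, not part of any particular library.

```python
def integral_image(img):
    """Return a summed-area table padded with a zero row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle feature: left-half sum minus right-half sum."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


# Hypothetical toy image: bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature_value = two_rect_horizontal(ii, 0, 0, 4, 4)  # 72 - 8 = 64
```

A large positive value indicates a bright-to-dark vertical edge inside the window. Once the table is built, every rectangle sum costs the same four lookups regardless of the rectangle's size.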
The Viola–Jones object detection framework is a machine learning object detection framework proposed in 2001 by Paul Viola and Michael Jones. It was motivated primarily by the problem of face detection, although it can be adapted to the detection of other object classes.
Caltech 101 is a data set of digital images created in September 2003 and compiled by Fei-Fei Li, Marco Andreetto, Marc'Aurelio Ranzato and Pietro Perona at the California Institute of Technology. It is intended to facilitate computer vision research and is most applicable to techniques involving image recognition, classification, and categorization. Caltech 101 contains a total of 9,146 images, split between 101 distinct object categories and a background category. Provided with the images is a set of annotations describing the outlines of each image, along with a Matlab script for viewing.
Object detection is a computer technology related to computer vision and image processing that deals with detecting instances of semantic objects of a certain class in digital images and videos. Well-researched domains of object detection include face detection and pedestrian detection. Object detection has applications in many areas of computer vision, including image retrieval and video surveillance.
Tomaso Armando Poggio is the Eugene McDermott professor in the Department of Brain and Cognitive Sciences, an investigator at the McGovern Institute for Brain Research, a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and director of both the Center for Biological and Computational Learning at MIT and the Center for Brains, Minds, and Machines, a multi-institutional collaboration headquartered at the McGovern Institute since 2013.
The International Conference on Computer Vision (ICCV) is a research conference sponsored by the Institute of Electrical and Electronics Engineers (IEEE) and held every other year. It is considered one of the top conferences in computer vision, alongside CVPR and ECCV, and it is held in the years in which ECCV is not.
Jitendra Malik is an Indian-American academic who is the Arthur J. Chick Professor of Electrical Engineering and Computer Sciences at the University of California, Berkeley. He is known for his research in computer vision.
Demetri Terzopoulos is a Greek-Canadian-American computer scientist and entrepreneur. He is currently a Distinguished Professor and Chancellor's Professor of Computer Science in the Henry Samueli School of Engineering and Applied Science at the University of California, Los Angeles, where he directs the UCLA Computer Graphics & Vision Laboratory.
Dorin Comaniciu is a Romanian-American computer scientist. He is Senior Vice President of Artificial Intelligence and Digital Innovation at Siemens Healthcare.
Alan Yuille is a Bloomberg Distinguished Professor of Computational Cognitive Science with appointments in the departments of Cognitive Science and Computer Science at Johns Hopkins University. Yuille develops models of vision and cognition for computers, intended for creating artificial vision systems. He studied under Stephen Hawking at Cambridge University, where he completed a PhD in theoretical physics in 1981.
Michael J. Jones is an American computer scientist and inventor working as a computer vision researcher at Mitsubishi Electric Research Laboratories.
Michael Kass is an American computer scientist best known for his work in computer graphics and computer vision. He has won an Academy Award and the SIGGRAPH Computer Graphics Achievement Award and is an ACM Fellow.
Kristen Lorraine Grauman is a professor of computer science at the University of Texas at Austin on leave as a research scientist at Facebook AI Research (FAIR). She works on computer vision and machine learning.
Serge Belongie is a professor of Computer Science at the University of Copenhagen, where he also serves as the head of the Danish Pioneer Centre for Artificial Intelligence. Previously, he was the Andrew H. and Ann R. Tisch Professor of Computer Science at Cornell Tech, where he also served as Associate Dean. He has also been a member of the Visiting Faculty program at Google. He is known for his contributions to the fields of computer vision and machine learning, specifically object recognition and image segmentation, with his scientific research in these areas cited over 150,000 times according to Google Scholar. Along with Jitendra Malik, Belongie proposed the concept of shape context, a widely used feature descriptor in object recognition. He has co-founded several startups in the areas of computer vision and object recognition.
Olga Russakovsky is an associate professor of computer science at Princeton University. Her research investigates computer vision and machine learning. She was one of the leaders of the ImageNet Large Scale Visual Recognition challenge and has been recognised by MIT Technology Review as one of the world's top young innovators.
Michael J. Black is an American-born computer scientist working in Tübingen, Germany. He is a founding director at the Max Planck Institute for Intelligent Systems where he leads the Perceiving Systems Department in research focused on computer vision, machine learning, and computer graphics. He is also an Honorary Professor at the University of Tübingen.
Xiaoming Liu is a Chinese-American computer scientist and academic. He is a Professor in the Department of Computer Science and Engineering, an MSU Foundation Professor, and the Anil K. and Nandita Jain Endowed Professor of Engineering at Michigan State University.
Gérard G. Medioni is a computer scientist, author, academic and inventor. He is a vice president and distinguished scientist at Amazon and serves as emeritus professor of Computer Science at the University of Southern California.