Scene statistics is a discipline within the field of perception. It is concerned with the statistical regularities related to scenes. It is based on the premise that a perceptual system is designed to interpret scenes.
Biological perceptual systems have evolved in response to the physical properties of natural environments. [1] Natural scenes therefore receive a great deal of attention. [2]
Natural scene statistics are useful for defining the behavior of an ideal observer in a natural task, typically by incorporating signal detection theory, information theory, or estimation theory.
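As an illustration of how signal detection theory can define ideal performance, the following minimal sketch assumes a yes/no detection task in which the noise and signal-plus-noise responses are equal-variance Gaussians with equal prior probability. The function names are hypothetical and the Gaussian model is an assumption made for the example, not something the text above specifies.

```python
import math

def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def d_prime(mu_noise, mu_signal, sigma):
    """Separation of the two response distributions in units of their
    shared standard deviation (equal-variance assumption)."""
    return (mu_signal - mu_noise) / sigma

def ideal_proportion_correct(dp):
    """Accuracy of an observer that places its criterion midway between
    the two means, which is optimal when both cases are equally likely."""
    return normal_cdf(dp / 2.0)

dp = d_prime(mu_noise=0.0, mu_signal=2.0, sigma=1.0)
print(dp, ideal_proportion_correct(dp))
```

Under these assumptions, human accuracy on the same task can be compared against `ideal_proportion_correct` to estimate how efficiently the available information is used.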
One of the most successful applications of natural scene statistics models has been perceptual picture and video quality prediction. For example, the Visual Information Fidelity (VIF) algorithm, which measures the degree of distortion of pictures and videos, is used extensively by the image and video processing communities to assess perceptual quality, often after processing such as compression, which can degrade the appearance of a visual signal. The premise is that distortion changes the scene statistics and that the visual system is sensitive to those changes. VIF is heavily used in the streaming television industry. Other popular picture quality models that use natural scene statistics include BRISQUE [3] and NIQE [4], both of which are no-reference models: they do not require a reference picture to measure quality against.
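No-reference models such as BRISQUE and NIQE build their features on mean-subtracted, contrast-normalized (MSCN) coefficients. The sketch below is a simplified illustration rather than either model's implementation: it assumes a uniform square window in place of the Gaussian weighting those models use, and the function name, `radius`, and `c` are hypothetical choices.

```python
import math

def mscn(image, radius=1, c=1.0):
    """Subtract the local mean and divide by the local standard deviation.

    `image` is a list of rows of grey levels. A uniform window is an
    assumption of this sketch; `c` keeps the division stable in flat regions.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Gather the neighbourhood, clipped at the image border.
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            var = sum((v - mean) ** 2 for v in window) / len(window)
            out[y][x] = (image[y][x] - mean) / (math.sqrt(var) + c)
    return out

flat = [[5.0] * 4 for _ in range(4)]
print(mscn(flat)[0])  # a flat patch has no local contrast
```

Quality models of this family then summarize the distribution of such coefficients, on the premise that distortion shifts it away from the shape seen in undistorted natural images.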
Geisler (2008) [6] distinguishes between four kinds of domains: (1) Physical environments, (2) Images/Scenes, (3) Neural responses, and (4) Behavior.
Within the domain of images/scenes, one can study the characteristics of information related to redundancy and efficient coding.
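One simple within-domain measure of redundancy is the correlation between neighbouring pixels, which tends to be high in natural images. The following sketch computes it for horizontally adjacent pixels; the function name and the toy images are hypothetical, chosen only to show the two extremes.

```python
import math

def neighbor_correlation(image):
    """Pearson correlation between each pixel and its right-hand neighbour."""
    left = [row[i] for row in image for i in range(len(row) - 1)]
    right = [row[i + 1] for row in image for i in range(len(row) - 1)]
    n = len(left)
    mean_l = sum(left) / n
    mean_r = sum(right) / n
    cov = sum((a - mean_l) * (b - mean_r) for a, b in zip(left, right))
    var_l = sum((a - mean_l) ** 2 for a in left)
    var_r = sum((b - mean_r) ** 2 for b in right)
    if var_l == 0 or var_r == 0:
        # Correlation is undefined for a constant image; 0.0 is a placeholder.
        return 0.0
    return cov / math.sqrt(var_l * var_r)

smooth = [[0, 1, 2, 3], [1, 2, 3, 4]]   # gradual ramp: neighbours are predictable
checker = [[0, 1, 0, 1], [1, 0, 1, 0]]  # alternating pattern: neighbours are opposed
print(neighbor_correlation(smooth), neighbor_correlation(checker))
```

A correlation far from zero means one pixel carries information about the next, which is the kind of redundancy an efficient code can exploit.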
Across-domain statistics determine how an autonomous system should make inferences about its environment, process information, and control its behavior. To study these statistics, it is necessary to sample or register information in multiple domains simultaneously.
Within visual perception, an optical illusion is an illusion caused by the visual system and characterized by a visual percept that arguably appears to differ from reality. Illusions come in a wide variety. Categorizing them is difficult because the underlying cause is often unclear, but a classification proposed by Richard Gregory is useful as an orientation. According to it, there are three main classes (physical, physiological, and cognitive illusions), and within each class there are four kinds: ambiguities, distortions, paradoxes, and fictions. A classical example of a physical distortion is the apparent bending of a stick half immersed in water; an example of a physiological paradox is the motion aftereffect; and an example of a physiological fiction is an afterimage. Three typical cognitive distortions are the Ponzo, Poggendorff, and Müller-Lyer illusions. Physical illusions are caused by the physical environment, e.g. by the optical properties of water. Physiological illusions arise in the eye or the visual pathway, e.g. from the effects of excessive stimulation of a specific receptor type. Cognitive visual illusions are the result of unconscious inferences and are perhaps the most widely known.
The Müller-Lyer illusion is an optical illusion consisting of three stylized arrows. When viewers are asked to place a mark on the figure at the midpoint, they tend to place it more towards the "tail" end. The illusion was devised by Franz Carl Müller-Lyer (1857–1916), a German sociologist, in 1889.
Color vision, a feature of visual perception, is an ability to perceive differences between light composed of different wavelengths independently of light intensity. Color perception is a part of the larger visual system and is mediated by a complex process between neurons that begins with differential stimulation of different types of photoreceptors by light entering the eye. Those photoreceptors then emit outputs that are propagated through many layers of neurons and then ultimately to the brain. Color vision is found in many animals and is mediated by similar underlying mechanisms with common types of biological molecules and a complex history of evolution in different animal taxa. In primates, color vision may have evolved under selective pressure for a variety of visual tasks including the foraging for nutritious young leaves, ripe fruit, and flowers, as well as detecting predator camouflage and emotional states in other primates.
The opponent process is a color theory that states that the human visual system interprets information about color by processing signals from cone cells and rod cells in an antagonistic manner. There is some overlap in the wavelengths of light to which the three types of cones respond, so it is more efficient for the visual system to record differences between the responses of cones than each type of cone's individual response. The opponent color theory suggests that there are three opponent channels, in which the cone photoreceptors are linked together to form three opposing color pairs: red versus green, blue versus yellow, and black versus white. It was first proposed in 1892 by the German physiologist Ewald Hering.
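The idea of recording differences between cone responses can be sketched as follows. The weights are a deliberately simplified, illustrative assumption (equal weighting of the long- and medium-wavelength cones), not measured physiological values, and the function name is hypothetical.

```python
def opponent_channels(l, m, s):
    """Map long-, medium- and short-wavelength cone responses to three
    opponent signals using simple illustrative weights."""
    red_green = l - m                # positive leans red, negative leans green
    blue_yellow = s - (l + m) / 2.0  # positive leans blue, negative leans yellow
    black_white = l + m              # achromatic (light versus dark) signal
    return red_green, blue_yellow, black_white

# Equal stimulation of all three cone types gives no chromatic signal.
print(opponent_channels(1.0, 1.0, 1.0))
```

Because the cone responses overlap and are strongly correlated, the two difference signals are typically small compared with the raw responses, which is the efficiency argument made above.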
Tone mapping is a technique used in image processing and computer graphics to map one set of colors to another to approximate the appearance of high-dynamic-range images in a medium that has a more limited dynamic range. Print-outs, CRT or LCD monitors, and projectors all have a limited dynamic range that is inadequate to reproduce the full range of light intensities present in natural scenes. Tone mapping addresses the problem of strong contrast reduction from the scene radiance to the displayable range while preserving the image details and color appearance important to appreciate the original scene content.
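As one concrete example, the simplest form of the global operator described by Reinhard and colleagues compresses a scene luminance L to L / (1 + L). The sketch below applies it to a list of luminances; it is only one of many tone mapping operators, the function name is hypothetical, and the input is assumed to be non-negative and already scaled to a suitable exposure.

```python
def reinhard_tone_map(luminances):
    """Compress non-negative scene luminances into the range [0, 1).

    Small values are left almost unchanged, while very large values
    approach 1 without ever reaching it.
    """
    return [lum / (1.0 + lum) for lum in luminances]

scene = [0.0, 0.1, 1.0, 9.0, 999.0]  # spans several orders of magnitude
print(reinhard_tone_map(scene))
```

The mapping is monotonic, so the ordering of brightness in the scene is preserved even though contrast in the highlights is strongly reduced.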
The efficient coding hypothesis was proposed by Horace Barlow in 1961 as a theoretical model of sensory coding in the brain. Within the brain, neurons communicate with one another by sending electrical impulses referred to as action potentials or spikes. One goal of sensory neuroscience is to decipher the meaning of these spikes in order to understand how the brain represents and processes information about the outside world. Barlow hypothesized that the spikes in the sensory system form a neural code for efficiently representing sensory information. By efficient, Barlow meant that the code minimizes the number of spikes needed to transmit a given signal. This is somewhat analogous to transmitting information across the internet, where different file formats can be used to transmit a given image. Different file formats require different numbers of bits to represent the same image at a given distortion level, and some are better suited to representing certain classes of images than others. According to this model, the brain is thought to use a code that is suited to representing visual and audio information typical of an organism's natural environment.
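The file-format analogy can be made concrete with Shannon entropy, which gives the smallest average number of bits per symbol that any lossless code can achieve for a source with known symbol probabilities. The sketch below is an information-theoretic illustration of that analogy, not a model of neural spiking; the function name and example distributions are hypothetical.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy in bits per symbol; zero-probability symbols add nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]  # every symbol equally likely
skewed = [0.7, 0.1, 0.1, 0.1]       # one symbol dominates, so the source is more predictable
print(entropy_bits(uniform), entropy_bits(skewed))
```

A code matched to the skewed source needs fewer bits on average than one designed for the uniform source, which mirrors the claim that a code suited to the statistics of natural signals can represent them more cheaply.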
Celeste McCollough Howard is an American psychologist who conducts research in human visual perception. She is best known for her discovery in 1965 of the first contingent aftereffect, known soon after as the McCollough effect.
The Chubb illusion is an optical illusion or error in visual perception in which the apparent contrast of an object varies substantially to most viewers depending on its relative contrast to the field on which it is displayed. These visual illusions are of particular interest to researchers because they may provide valuable insights in regard to the workings of human visual systems.
Lightness is the visual perception of the luminance of an object. It is often judged relative to a similarly lit object. In colorimetry and color appearance models, lightness is a prediction of how an illuminated color will appear to a standard observer. While luminance is a linear measurement of light, lightness is a prediction of the human perception of that light: it is intended to be uniform in perceptual terms and is therefore a nonlinear function of luminance.
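One widely used lightness prediction is CIE 1976 L*, which maps relative luminance through a cube-root curve with a short linear segment near black. The sketch below assumes luminance is expressed relative to a reference white with Y = 1; the function name is hypothetical.

```python
def cie_lightness(y, y_white=1.0):
    """CIE 1976 L* for luminance `y` relative to the reference white `y_white`."""
    t = y / y_white
    delta = 6.0 / 29.0
    if t > delta ** 3:
        f = t ** (1.0 / 3.0)                     # cube-root segment
    else:
        f = t / (3.0 * delta ** 2) + 4.0 / 29.0  # linear segment near black
    return 116.0 * f - 16.0

# A surface reflecting 18% of the light appears roughly half as light as white.
print(cie_lightness(0.18))
```

The example shows why lightness is not proportional to luminance: a surface with 18% of the luminance of white is predicted to sit near the middle of the lightness scale.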
Image quality can refer to the level of accuracy with which different imaging systems capture, process, store, compress, transmit and display the signals that form an image. Another definition refers to image quality as "the weighted combination of all of the visually significant attributes of an image". The difference between the two definitions is that the former focuses on the characteristics of signal processing in different imaging systems and the latter on the perceptual assessments that make an image pleasant for human viewers.
Dale Purves is Geller Professor of Neurobiology Emeritus in the Duke Institute for Brain Sciences where he remains Research Professor with additional appointments in the department of Psychology and Brain Sciences, and the department of Philosophy at Duke University. He earned a B.A. from Yale University in 1960 and an M.D. from Harvard Medical School in 1964. After further clinical training as a surgical resident at the Massachusetts General Hospital, service as a Peace Corps physician, and postdoctoral training at Harvard and University College London, he was appointed to the faculty at Washington University School of Medicine in 1973. He came to Duke in 1990 as the founding chair of the Department of Neurobiology at Duke Medical Center, and was subsequently Director of Duke's Center for Cognitive Neuroscience (2003-2009) and also served as the Director of the Neuroscience and Behavioral Disorders Program at the Duke-NUS Graduate Medical School in Singapore (2009-2013).
Foveated imaging is a digital image processing technique in which the image resolution, or amount of detail, varies across the image according to one or more "fixation points". A fixation point indicates the highest resolution region of the image and corresponds to the center of the eye's retina, the fovea.
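A minimal sketch of the idea, assuming a single fixation point and a box blur whose radius grows linearly with distance from it. Practical systems use more careful resolution falloff models; the function name and the `rate` parameter here are hypothetical.

```python
import math

def foveate(image, fx, fy, rate=0.5):
    """Blur each pixel of `image` (a list of rows) more strongly the
    farther it lies from the fixation point at column `fx`, row `fy`."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Blur radius is zero at fixation and increases with eccentricity.
            r = int(rate * math.hypot(x - fx, y - fy))
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            out[y][x] = sum(window) / len(window)
    return out

# Alternating columns: fine detail that survives only near fixation.
stripes = [[float((x % 2) * 10) for x in range(9)] for _ in range(9)]
result = foveate(stripes, fx=4, fy=4)
print(result[4])
```

In the output row, the pixels nearest the fixation point keep their original values, while the stripes toward the edges are averaged toward a uniform grey.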
Alan Conrad Bovik is an American engineer and vision scientist. He is a Professor at The University of Texas at Austin (UT-Austin), where he holds the Cockrell Family Regents Endowed Chair and is Director of the Laboratory for Image and Video Engineering. He is a faculty member in the UT-Austin Department of Electrical and Computer Engineering, the Institute for Neuroscience, and the Wireless Networking and Communications Group.
Visual perception is the ability to interpret the surrounding environment through color vision, scotopic vision, and mesopic vision, using light in the visible spectrum reflected by the objects in the environment. This is different from visual acuity, which refers to how clearly a person sees. A person can have problems with visual perceptual processing even if they have 20/20 vision.
Ideal observer analysis is a method for investigating how information is processed in a perceptual system. It is also a basic principle that guides modern research in perception.
A perceptual system is a computational system designed to make inferences about properties of a physical environment based on senses. Other definitions may exist.
In the field of perception, a scene is information that can flow from a physical environment into a perceptual system via sensory transduction.
Due to the effect of a spatial or temporal context, the perceived orientation of a test line or grating pattern can appear tilted away from its physical orientation. The tilt illusion (TI) is the phenomenon in which the perceived orientation of a test line or grating is altered by the presence of surrounding lines or gratings with a different orientation. The tilt aftereffect (TAE) is the phenomenon in which the perceived orientation is changed after prolonged inspection of another oriented line or grating.
Russell L. De Valois was an American scientist recognized for his pioneering research on spatial and color vision.
Beau Lotto is a professor of neuroscience and an author. He is a professor at the University of London, as well as a visiting scholar at New York University. His research explores how the brain adapts to uncertainty at the cellular, computational and perceptual levels, with the aim of understanding the fundamental principles of biologically inspired innovation.