Visual perception

Last updated February 04, 2026

Visual perception is the ability to detect light and use it to form an image of the surrounding environment.^[1] Photodetection without image formation is classified as light sensing. In vertebrates, visual perception can be enabled by photopic vision (daytime vision) or scotopic vision (night vision); most vertebrates have both. Visual perception detects light (photons) in the visible spectrum reflected by objects in the environment or emitted by light sources. The visible range of light is defined by what is readily perceptible to humans, though the visual perception of non-humans often extends beyond the visible spectrum. The resulting perception is also known as vision, sight, or eyesight (adjectives visual, optical, and ocular, respectively). The various physiological components involved in vision are referred to collectively as the visual system, and are the focus of much research in linguistics, psychology, cognitive science, neuroscience, and molecular biology, collectively referred to as vision science.

Visual perception involves not only what we see but also how the brain processes that information, a process that is adaptive and influenced by both lifelong experience and changing cognitive capacities.^[2]^[3]

Visual system

Most vertebrates achieve vision through similar visual systems. Generally, light enters the eye through the cornea and is focused by the lens onto the retina, a light-sensitive membrane at the back of the eye. Specialized photoreceptive cells in the retina act as transducers, converting the light into neural impulses. The photoreceptors are broadly classed into cone cells and rod cells, which enable photopic and scotopic vision, respectively. These photoreceptors' signals are transmitted by the optic nerve from the retina upstream to central ganglia in the brain, chiefly the lateral geniculate nucleus, which transmits the information to the visual cortex. Signals also travel directly from the retina to the superior colliculus.^[4]

The lateral geniculate nucleus sends signals to the primary visual cortex, also called striate cortex. Extrastriate cortex, also called visual association cortex, is a set of cortical structures that receive information from striate cortex as well as from each other.^[5] Recent descriptions of visual association cortex describe a division into two functional pathways, a ventral and a dorsal pathway. This conjecture is known as the two streams hypothesis.

Study

The major problem in visual perception is that what people see is not simply a translation of retinal stimuli (i.e., the image on the retina), with the brain altering the basic information taken in. Thus people interested in perception have long struggled to explain what visual processing does to create what is actually seen.

Early studies

There were two major ancient Greek schools, each providing a primitive explanation of how vision works.

The first was the "emission theory" of vision, which maintained that vision occurs when rays emanate from the eyes and are intercepted by visual objects. If an object was seen directly, it was by "means of rays" coming out of the eyes and falling on the object. A refracted image was also seen by "means of rays", which came out of the eyes, traversed the air, and after refraction fell on the visible object, which was sighted as the result of the movement of the rays from the eye. This theory was championed by scholars who were followers of Euclid's Optics and Ptolemy's Optics.

The second school advocated the so-called "intromission" approach, which sees vision as coming from something representative of the object entering the eyes. Propagated mainly by Aristotle (De Sensu)^[6] and his followers,^[6] this theory seems to have some contact with modern theories of what vision really is, but it remained only a speculation lacking any experimental foundation.

The most decisive development of the intromission theory came from the work of the 11th-century scholar Ibn al-Haytham (Alhazen). In his Book of Optics (Kitāb al-Manāẓir, c. 1021), he rejected both the extramission theory of Euclid and Ptolemy and the purely speculative account of Aristotle. Through systematic experimentation, he demonstrated that vision occurs when light rays reflected from objects enter the eye, where they are focused by the lens onto the retina. This empirical approach marked a turning point: Alhazen not only provided the first correct explanation of vision in terms of intromission^[7] but also introduced experimental methods that influenced later European scholars such as Roger Bacon, Kepler, and eventually Newton.^[8]^[9]

Both Greek schools of thought relied upon the principle that "like is only known by like", and thus upon the notion that the eye was composed of some "internal fire" that interacted with the "external fire" of visible light and made vision possible. Plato makes this assertion in his dialogue Timaeus (45b and 46b), as does Empedocles (as reported by Aristotle in his De Sensu, DK frag. B17).^[6]

Alhazen (965 – c. 1040) carried out many investigations and experiments on visual perception, extended the work of Ptolemy on binocular vision, and commented on the anatomical works of Galen.^[10]^[11] He was the first person to explain that vision occurs when light bounces off an object and is then directed to one's eyes.^[12]

Leonardo da Vinci (1452–1519) is believed to be the first to recognize the special optical qualities of the eye. He wrote "The function of the human eye ... was described by a large number of authors in a certain way. But I found it to be completely different." His main experimental finding was that vision is distinct and clear only at the line of sight, the optical line that ends at the fovea. Although he did not use these terms, he is in effect the father of the modern distinction between foveal and peripheral vision.^[13]

Isaac Newton (1642–1726/27) was the first to discover through experimentation, by isolating individual colors of the spectrum of light passing through a prism, that the visually perceived color of objects appeared due to the character of light the objects reflected, and that these divided colors could not be changed into any other color, which was contrary to scientific expectation of the day.^[14]

Unconscious inference

Hermann von Helmholtz is often credited with the first modern study of visual perception. Helmholtz examined the human eye and concluded that it was incapable of producing a high-quality image. Insufficient information seemed to make vision impossible. He, therefore, concluded that vision could only be the result of some form of "unconscious inference", coining that term in 1867. He proposed the brain was making assumptions and conclusions from incomplete data, based on previous experiences.^[15]

Inference requires prior experience of the world.

Examples of well-known assumptions, based on visual experience, are:

light comes from above;
objects are normally not viewed from below;
faces are seen (and recognized) upright;^[16]
closer objects can block the view of more distant objects, but not vice versa; and
figures (i.e., foreground objects) tend to have convex borders.

The study of visual illusions (cases when the inference process goes wrong) has yielded much insight into what sort of assumptions the visual system makes.

Another type of unconscious inference hypothesis (based on probabilities) has recently been revived in so-called Bayesian studies of visual perception.^[17] Proponents of this approach consider that the visual system performs some form of Bayesian inference to derive a perception from sensory data. However, it is not clear how proponents of this view derive, in principle, the relevant probabilities required by the Bayesian equation. Models based on this idea have been used to describe various visual perceptual functions, such as the perception of motion, the perception of depth, and figure-ground perception.^[18]^[19] The "wholly empirical theory of perception" is a related and newer approach that rationalizes visual perception without explicitly invoking Bayesian formalisms.
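As a rough illustration of the Bayesian idea, the following sketch applies Bayes' rule to a shaded disc that is bright at the top. The hypothesis names and all probabilities are invented for the example (the article itself notes that it is unclear how such probabilities would be derived); the prior simply encodes the "light comes from above" assumption listed earlier.

```python
def posterior(prior, likelihood):
    """Combine prior and likelihood with Bayes' rule and normalize."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

# Hypothetical prior: light from above is assumed far more common.
prior = {"bump lit from above": 0.8, "dent lit from below": 0.2}

# Hypothetical likelihood: the image is equally consistent with both,
# so the sensory data alone cannot decide between them.
likelihood = {"bump lit from above": 0.5, "dent lit from below": 0.5}

percept = posterior(prior, likelihood)
print(percept)
```

Because the image is ambiguous, the posterior equals the prior, and the "bump" interpretation wins purely on the strength of the assumed prior experience.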

Gestalt theory

Gestalt psychologists working primarily in the 1930s and 1940s raised many of the research questions that are studied by vision scientists today.^[20]

The Gestalt Laws of Organization have guided the study of how people perceive visual components as organized patterns or wholes, instead of many different parts. "Gestalt" is a German word that partially translates to "configuration or pattern" along with "whole or emergent structure". According to this theory, there are eight main factors that determine how the visual system automatically groups elements into patterns: Proximity, Similarity, Closure, Symmetry, Common Fate (i.e. common motion), Continuity as well as Good Gestalt (pattern that is regular, simple, and orderly) and Past Experience.^[21]
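The proximity factor alone can be sketched in a few lines. This is only an illustration, not a model from the Gestalt literature: the function name, the one-dimensional dot positions, and the gap threshold are all assumptions made for the example.

```python
def group_by_proximity(positions, max_gap):
    """Split 1-D positions into groups wherever the gap exceeds max_gap."""
    groups = []
    for x in sorted(positions):
        if groups and x - groups[-1][-1] <= max_gap:
            groups[-1].append(x)   # close to the previous dot: same group
        else:
            groups.append([x])     # large gap: start a new group
    return groups

dots = [0, 1, 2, 10, 11, 20]
print(group_by_proximity(dots, max_gap=3))
# [[0, 1, 2], [10, 11], [20]]
```

Dots that lie near each other fall into one group, which mirrors how nearby elements tend to be perceived as belonging together.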

Language model

Following in the footsteps of George Berkeley, the Australian philosopher Colin Murray Turbayne argued in favor of an alternative to the classical "geometric model" of visual perception, asserting that aspects of it have needlessly clouded our understanding of vision since the time of Euclid. Quoting the sculptor Naum Gabo, he notes: "Lines, shapes, color and movement have a language of their own, but reading takes time. It is not enough to look. You must see, and 'see' means 'read'."^[22] Turbayne argued that a "language model peculiarly illuminates this ancient problem of how we see, shedding a bright light on dark areas dimly light by its great rival."^[23] Specifically, he highlighted the limitations of a purely mechanistic explanation of vision by arguing that several cases of "visual illusion" can be explained more adequately using the terms of such a language model. With this in mind, he presented a comparative analysis of specific examples of visual distortion, including the "Barrovian Case", the case of the "Horizontal Moon", and the case of the "Inverted Retinal Image".^[24]^[25]^[26]

Analysis of eye movement

During the 1960s, technical development permitted the continuous registration of eye movement during reading,^[27] in picture viewing,^[28] and later, in visual problem solving,^[29] and when headset-cameras became available, also during driving.^[30]

An example eye-movement record illustrates what may happen during the first two seconds of visual inspection. With the background out of focus to represent peripheral vision, the first eye movement goes to the boots of the man in the picture (simply because they are very near the starting fixation and have a reasonable contrast). Eye movements serve the function of attentional selection, i.e., they select a fraction of all visual inputs for deeper processing by the brain.^[31]

The following fixations jump from face to face. They might even permit comparisons between faces.^[32]

It may be concluded that a face is a very attractive search target within the peripheral field of vision. Foveal vision then adds detailed information to the peripheral first impression.

There are different types of eye movements: fixational eye movements (microsaccades, ocular drift, and tremor), vergence movements, saccadic movements, and pursuit movements. Fixations are comparatively static points where the eye rests. However, the eye is never completely still, and gaze position will drift. These drifts are in turn corrected by microsaccades, very small fixational eye movements. Vergence movements involve the cooperation of both eyes to allow an image to fall on the same area of both retinas, which results in a single focused image. Saccadic movements jump from one position to another and are used to rapidly scan a particular scene or image. Lastly, pursuit movements are smooth eye movements used to follow objects in motion.^[33]
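One common way to separate these movement types in recorded gaze data is by eye speed. The sketch below is a simplified velocity-threshold heuristic, not a method described in the article: the function name, the one-dimensional gaze trace, and both thresholds (in degrees per second) are assumptions chosen for illustration.

```python
def label_samples(positions_deg, sample_rate_hz,
                  pursuit_min=5.0, saccade_min=100.0):
    """Label each interval between gaze samples by its angular speed."""
    labels = []
    for prev, curr in zip(positions_deg, positions_deg[1:]):
        speed = abs(curr - prev) * sample_rate_hz  # degrees per second
        if speed >= saccade_min:
            labels.append("saccade")    # rapid jump between positions
        elif speed >= pursuit_min:
            labels.append("pursuit")    # smooth tracking of a moving target
        else:
            labels.append("fixation")   # nearly still, apart from slow drift
    return labels

# Hypothetical horizontal gaze trace sampled at 500 Hz.
trace = [0.0, 0.001, 0.002, 2.0, 2.02, 2.04]
print(label_samples(trace, sample_rate_hz=500))
```

With these assumed thresholds, the slow initial drift is labeled as fixation, the two-degree jump as a saccade, and the steady movement afterwards as pursuit. Vergence and microsaccades would need binocular data and finer thresholds, which this sketch leaves out.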

Face and object recognition

There is considerable evidence that face and object recognition are accomplished by distinct systems. For example, prosopagnosic patients show deficits in face, but not object processing, while object agnosic patients (most notably, patient C.K.) show deficits in object processing with spared face processing.^[34] Behaviorally, it has been shown that faces, but not objects, are subject to inversion effects, leading to the claim that faces are "special".^[34]^[35] Further, face and object processing recruit distinct neural systems.^[36] Notably, some have argued that the apparent specialization of the human brain for face processing does not reflect true domain specificity, but rather a more general process of expert-level discrimination within a given class of stimulus,^[37] though this latter claim is the subject of substantial debate. Using fMRI and electrophysiology Doris Tsao and colleagues described brain regions and a mechanism for face recognition in macaque monkeys.^[38]

The inferotemporal (IT) cortex has a key role in recognizing and differentiating objects. A study by MIT researchers shows that subregions of the IT cortex are in charge of different objects.^[39] When neural activity in many small areas of the cortex is selectively shut off in turn, the animal becomes unable to distinguish between particular pairs of objects. This shows that the IT cortex is divided into regions that respond to different, particular visual features. Similarly, certain patches and regions of the cortex are more involved in face recognition than in the recognition of other objects.

Some studies suggest that particular features and regions of interest of an object, rather than the uniform global image, are the key elements when the brain needs to recognise an object in an image.^[40]^[41] Human vision is therefore vulnerable to small, specific changes to the image, such as disrupting the edges of the object, modifying its texture, or making any small change in a crucial region of the image.^[42]

Studies of people whose sight has been restored after a long blindness reveal that they cannot necessarily recognize objects and faces (as opposed to color, motion, and simple geometric shapes). Some hypothesize that being blind during childhood prevents some part of the visual system necessary for these higher-level tasks from developing properly.^[43] The general belief that a critical period lasts until age 5 or 6 was challenged by a 2007 study that found that older patients could improve these abilities with years of exposure.^[44]

Cognitive and computational approaches

In the 1970s, David Marr developed a multi-level theory of vision, which analyzed the process of vision at different levels of abstraction. In order to focus on the understanding of specific problems in vision, he identified three levels of analysis: the computational, algorithmic and implementational levels. Many vision scientists, including Tomaso Poggio, have embraced these levels of analysis and employed them to further characterize vision from a computational perspective.^[45]

The computational level addresses, at a high level of abstraction, the problems that the visual system must overcome. The algorithmic level attempts to identify the strategy that may be used to solve these problems. Finally, the implementational level attempts to explain how solutions to these problems are realized in neural circuitry.

Marr suggested that it is possible to investigate vision at any of these levels independently. Marr described vision as proceeding from a two-dimensional visual array (on the retina) to a three-dimensional description of the world as output. His stages of vision include:

A 2D or primal sketch of the scene, based on feature extraction of fundamental components of the scene, including edges, regions, etc. Note the similarity in concept to a pencil sketch drawn quickly by an artist as an impression.
A 2½D sketch of the scene, where textures are acknowledged, etc. Note the similarity in concept to the stage in drawing where an artist highlights or shades areas of a scene, to provide depth.
A 3D model, where the scene is visualized in a continuous, 3-dimensional map.^[46]
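The first of these stages, extracting edges from a two-dimensional intensity array, can be illustrated with a toy example. This sketch substitutes a plain difference between horizontally adjacent pixels for Marr's own edge-detection proposal (zero-crossings of a smoothed second derivative); the function name, image, and threshold are assumptions made for the example.

```python
def horizontal_edges(image, threshold):
    """Mark positions where neighboring pixels in a row differ sharply."""
    return [
        [abs(right - left) >= threshold for left, right in zip(row, row[1:])]
        for row in image
    ]

# Hypothetical 2x4 intensity image: dark on the left, bright on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
print(horizontal_edges(image, threshold=5))
# [[False, True, False], [False, True, False]]
```

Each output row has one fewer entry than the input row, because an edge is defined between two neighboring pixels. The vertical boundary between the dark and bright regions shows up as a column of True values.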

Marr's 2½D sketch assumes that a depth map is constructed, and that this map is the basis of 3D shape perception. However, both stereoscopic and pictorial perception, as well as monocular viewing, make clear that the perception of 3D shape precedes, and does not rely on, the perception of the depth of points. It is not clear how a preliminary depth map could, in principle, be constructed, nor how this would address the question of figure-ground organization, or grouping. The role of perceptual organizing constraints, overlooked by Marr, in the production of 3D shape percepts from binocularly viewed 3D objects has been demonstrated empirically for the case of 3D wire objects.^[47]^[48] For a more detailed discussion, see Pizlo (2008).^[49]

A more recent, alternative framework proposes that vision is composed instead of the following three stages: encoding, selection, and decoding.^[50] Encoding is to sample and represent visual inputs (e.g., to represent visual inputs as neural activities in the retina). Selection, or attentional selection, is to select a tiny fraction of input information for further processing, e.g., by shifting gaze to an object or visual location to better process the visual signals at that location. Decoding is to infer or recognize the selected input signals, e.g., to recognize the object at the center of gaze as somebody's face. In this framework,^[51] attentional selection starts at the primary visual cortex along the visual pathway, and the attentional constraints impose a dichotomy between the central and peripheral visual fields for visual recognition or decoding.

Transduction

Transduction is the process through which energy from environmental stimuli is converted to neural activity. The retina contains three different cell layers: the photoreceptor layer, the bipolar cell layer, and the ganglion cell layer. The photoreceptor layer, where transduction occurs, is farthest from the lens. It contains photoreceptors with different sensitivities called rods and cones. The cones are responsible for color perception and are of three distinct types labeled red, green, and blue. Rods are responsible for the perception of objects in low light.^[52] Photoreceptors contain a special chemical called a photopigment, which is embedded in the membrane of the lamellae; a single human rod contains approximately 10 million photopigment molecules. Each photopigment molecule consists of two parts: an opsin (a protein) and retinal (a lipid).^[53] There are 3 specific photopigments (each with its own wavelength sensitivity) that respond across the spectrum of visible light. When the appropriate wavelengths (those that the specific photopigment is sensitive to) hit the photoreceptor, the photopigment splits into two, which sends a signal to the bipolar cell layer, which in turn sends a signal to the ganglion cells, the axons of which form the optic nerve and transmit the information to the brain. If a particular cone type is missing or abnormal due to a genetic anomaly, a color vision deficiency, sometimes called color blindness, will occur.^[54]

Opponent process

Transduction involves chemical messages sent from the photoreceptors to the bipolar cells to the ganglion cells. Several photoreceptors may send their information to one ganglion cell. There are two types of ganglion cells: red/green and yellow/blue. These neurons fire constantly, even when not stimulated. The brain interprets different colors (and, with a lot of information, an image) when the rate of firing of these neurons changes. Red light stimulates the red cone, which in turn signals the red/green ganglion cell. Likewise, green light stimulates the green cone, which also signals the red/green ganglion cell, and blue light stimulates the blue cone, which signals the yellow/blue ganglion cell. The firing rate of a ganglion cell increases when it is signaled by one cone and decreases (is inhibited) when it is signaled by the other. The first color in the name of the ganglion cell is the color that excites it, and the second is the color that inhibits it. For example, a red cone would excite the red/green ganglion cell and a green cone would inhibit it. This is an opponent process. If the firing rate of a red/green ganglion cell increased, the brain would know that the light was red; if the rate decreased, the brain would know that the light was green.^[54]
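The red/green case can be sketched as a toy firing-rate model. The baseline rate, gain, and function names below are invented for illustration and are not physiological values; the only things taken from the description above are the constant baseline firing, excitation by the red cone, and inhibition by the green cone.

```python
BASELINE_HZ = 50.0  # hypothetical resting firing rate of the ganglion cell

def red_green_rate(red_cone, green_cone, gain=40.0):
    """Firing rate of a red/green cell: red excites, green inhibits."""
    return max(0.0, BASELINE_HZ + gain * (red_cone - green_cone))

def interpret(rate):
    """Read the color from the change in rate relative to baseline."""
    if rate > BASELINE_HZ:
        return "red"
    if rate < BASELINE_HZ:
        return "green"
    return "neither"

print(interpret(red_green_rate(red_cone=1.0, green_cone=0.0)))  # red
print(interpret(red_green_rate(red_cone=0.0, green_cone=1.0)))  # green
print(interpret(red_green_rate(red_cone=0.5, green_cone=0.5)))  # neither
```

A single cell can therefore signal two colors, because the information is carried by whether its rate rises above or falls below the resting level.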

Artificial visual perception

Research on artificial visual perception aims to teach machines to understand whole scenes rather than merely to detect objects, giving them a form of perceptual common sense.^[55]

Theories and observations of visual perception have been the main source of inspiration for computer vision (also called machine vision, or computational vision). Special hardware structures and software algorithms provide machines with the capability to interpret the images coming from a camera or a sensor.

References

↑ "Vision". University of Hawaii. August 22, 2012. Retrieved February 15, 2025.
↑ Ballesteros, Soledad (1994). Human Perception: Cognitive Approaches. Psychology Press (published October 10, 2016).
↑ Sekuler, Allison; Palmer, Stephen (March 1992). "Perception of Partly Occluded Objects: A Microgenetic Analysis". ProQuest. ProQuest 213789725.
↑ Sadun, Alfredo A.; Johnson, Betty M.; Smith, Lois E. H. (1986). "Neuroanatomy of the human visual system: Part II Retinal projections to the superior colliculus and pulvinar" . Neuro-Ophthalmology. 6 (6): 363–370. doi:10.3109/01658108609016476. ISSN 0165-8107.
↑ Carlson, Neil R. (2013). "6". Physiology of Behaviour (11th ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. pp. 187–189. ISBN 978-0-205-23939-9.
1 2 3 Finger, Stanley (1994). Origins of neuroscience: a history of explorations into brain function. Oxford [Oxfordshire]: Oxford University Press. pp. 67–69. ISBN 978-0-19-506503-9. OCLC 27151391.
↑ Ibn al-Hayṯam, al-Ḥasan ibn al-Ḥasan; Ṣabrah, ʿAbd al-Ḥamīd (1989). The Optics of Ibn Al-Haytham. Studies of the Warburg Institute. London: Warburg Institute, University of London. ISBN 978-0-85481-072-7.
↑ Sarton, George (1931). Introduction to the History of Science, Volume II: From Rabbi Ben Ezra to Roger Bacon. Carnegie Institution of Washington Publication. Baltimore: The Williams & Wilkins Company Baltimore. pp. 509, 762.
↑ Lindberg, David Charles (1981). Theories of vision from al-Kindi to Kepler. University of Chicago history of science and medicine. Chicago London: University of Chicago press. ISBN 978-0-226-48235-4.
↑ Howard, I (1996). "Alhazen's neglected discoveries of visual phenomena". Perception. 25 (10): 1203–1217. doi:10.1068/p251203. PMID 9027923. S2CID 20880413.
↑ Khaleefa, Omar (1999). "Who Is the Founder of Psychophysics and Experimental Psychology?". American Journal of Islamic Social Sciences. 16 (2): 1–26. doi: 10.35632/ajis.v16i2.2126 .
↑ Adamson, Peter (July 7, 2016). Philosophy in the Islamic World: A History of Philosophy Without Any Gaps. Oxford University Press. p. 77. ISBN 978-0-19-957749-1.
↑ Keele, K. D. (1955). "Leonardo da Vinci on vision". Proceedings of the Royal Society of Medicine. 48 (5): 384–390. doi:10.1177/003591575504800512. ISSN 0035-9157. PMC 1918888. PMID 14395232.
↑ Livingstone, Margaret (2008). Vision and art: the biology of seeing. Hubel, David H. New York: Abrams. ISBN 978-0-8109-9554-3. OCLC 192082768.
↑ von Helmholtz, Hermann (1925). Handbuch der physiologischen Optik. Vol. 3. Leipzig: Voss. Archived from the original on September 27, 2018. Retrieved December 14, 2016.
↑ Hunziker, Hans-Werner (2006). Im Auge des Lesers: foveale und periphere Wahrnehmung – vom Buchstabieren zur Lesefreude [In the eye of the reader: foveal and peripheral perception – from letter recognition to the joy of reading]. Zürich: Transmedia Stäubli Verlag. ISBN 978-3-7266-0068-6.^{[ page needed ]}
↑ Stone, JV (2011). "Footprints sticking out of the sand. Part 2: children's Bayesian priors for shape and lighting direction" (PDF). Perception. 40 (2): 175–90. doi:10.1068/p6776. PMID 21650091. S2CID 32868278.
↑ Mamassian, Pascal; Landy, Michael; Maloney, Laurence T. (2002). "Bayesian Modelling of Visual Perception". In Rao, Rajesh P. N.; Olshausen, Bruno A.; Lewicki, Michael S. (eds.). Probabilistic Models of the Brain: Perception and Neural Function. Neural Information Processing. MIT Press. pp. 13–36. ISBN 978-0-262-26432-7.
↑ "A Primer on Probabilistic Approaches to Visual Perception". Archived from the original on July 10, 2006. Retrieved October 14, 2010.
↑ Wagemans, Johan (November 2012). "A Century of Gestalt Psychology in Visual Perception". Psychological Bulletin. 138 (6): 1172–1217. CiteSeerX 10.1.1.452.8394 . doi:10.1037/a0029333. PMC 3482144 . PMID 22845751.
↑ Wagemans, Johan; Elder, James H.; Kubovy, Michael; Palmer, Stephen E.; Peterson, Mary A.; Singh, Manish; von der Heydt, Rüdiger (November 2012). "A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization". Psychological Bulletin. 138 (6): 1172–1217. Bibcode:2012PsycB.138.1172W. doi:10.1037/a0029333. ISSN 1939-1455. PMC 3482144 . PMID 22845751.
↑ ETC: A Review of General Semantics. " Visual Language" by Colin Murray Turbayne, Vol. 28. No. 1 (March 1971) p. 1 International Society for General Semantics on JSTOR.org
↑ The Myth of Metaphor. Turbayne, Colin Murray. Yale University Press, London (1962) Introduction P. 6, pp.101-129 on hathitrust.org
↑ The Myth of Metaphor. Turbayne, Colin Murray. Yale University Press London (1962) Part III pp. 141-219 on hathitrust.org
↑ ETC: A Review of General Semantics. "Visual Language" by Colin Murray Turbayne Vol. 28. No. 1 (March 1971) p.51-58 International Society for General Semantics on JSTOR.org
↑ Myers, C. Mason (1964). "Reviewed work: The Myth of Metaphor, Colin Murray Turbayne" . The Philosophical Review. 73 (4): 549–552. doi:10.2307/2183311. JSTOR 2183311.
↑ Taylor, Stanford E. (November 1965). "Eye Movements in Reading: Facts and Fallacies". American Educational Research Journal. 2 (4): 187–202. doi:10.2307/1161646. JSTOR 1161646.
↑ Yarbus, A. L. (1967). Eye movements and vision, Plenum Press, New York^{[ page needed ]}
↑ Hunziker, H. W. (1970). "Visuelle Informationsaufnahme und Intelligenz: Eine Untersuchung über die Augenfixationen beim Problemlösen" [Visual information acquisition and intelligence: A study of the eye fixations in problem solving]. Schweizerische Zeitschrift für Psychologie und Ihre Anwendungen (in German). 29 (1/2).^{[ page needed ]}
↑ Cohen, A. S. (1983). "Informationsaufnahme beim Befahren von Kurven, Psychologie für die Praxis 2/83" [Information recording when driving on curves, psychology in practice 2/83]. Bulletin der Schweizerischen Stiftung für Angewandte Psychologie.^{[ page needed ]}
↑ Purves, Dale; Augustine, George J.; Fitzpatrick, David; Katz, Lawrence C.; LaMantia, Anthony-Samuel; McNamara, James O.; Williams, S. Mark (2001), "Types of Eye Movements and Their Functions", Neuroscience. 2nd edition, Sinauer Associates, retrieved July 1, 2025
↑ Liu, Meng; Zhan, Jiayu; Wang, Lihui (September 20, 2024). "Specified functions of the first two fixations in face recognition: Sampling the general-to-specific facial information". iScience. 27 (9) 110686. Bibcode:2024iSci...27k0686L. doi:10.1016/j.isci.2024.110686. ISSN 2589-0042. PMC 11378928 . PMID 39246447.
↑ Carlson, Neil R.; Heth, C. Donald; Miller, Harold; Donahoe, John W.; Buskist, William; Martin, G. Neil; Schmaltz, Rodney M. (2009). Psychology the Science of Behaviour . Toronto Ontario: Pearson Canada. pp. 140–1. ISBN 978-0-205-70286-2.
1 2 Moscovitch, Morris; Winocur, Gordon; Behrmann, Marlene (1997). "What Is Special about Face Recognition? Nineteen Experiments on a Person with Visual Object Agnosia and Dyslexia but Normal Face Recognition". Journal of Cognitive Neuroscience. 9 (5): 555–604. doi:10.1162/jocn.1997.9.5.555. PMID 23965118. S2CID 207550378.
↑ Yin, Robert K. (1969). "Looking at upside-down faces". Journal of Experimental Psychology. 81 (1): 141–5. doi:10.1037/h0027474.
↑ Kanwisher, Nancy; McDermott, Josh; Chun, Marvin M. (June 1997). "The fusiform face area: a module in human extrastriate cortex specialized for face perception". The Journal of Neuroscience. 17 (11): 4302–11. doi:10.1523/JNEUROSCI.17-11-04302.1997. PMC 6573547 . PMID 9151747.
↑ Gauthier, Isabel; Skudlarski, Pawel; Gore, John C.; Anderson, Adam W. (February 2000). "Expertise for cars and birds recruits brain areas involved in face recognition". Nature Neuroscience . 3 (2): 191–7. doi:10.1038/72140. PMID 10649576. S2CID 15752722.
↑ Chang, Le; Tsao, Doris Y. (June 1, 2017). "The Code for Facial Identity in the Primate Brain". Cell. 169 (6): 1013–1028.e14. doi: 10.1016/j.cell.2017.05.011 . ISSN 0092-8674. PMC 8088389 . PMID 28575666.
↑ "How the brain distinguishes between objects". MIT News. March 13, 2019. Retrieved October 10, 2019.
↑ Srivastava, Sanjana; Ben-Yosef, Guy; Boix, Xavier (February 8, 2019). Minimal Images in Deep Neural Networks: Fragile Object Recognition in Natural Images. arXiv: 1902.03227 . OCLC 1106329907.
↑ Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon (February 2018). "Full interpretation of minimal images". Cognition. 171: 65–84. doi:10.1016/j.cognition.2017.10.006. hdl: 1721.1/106887 . ISSN 0010-0277. PMID 29107889. S2CID 3372558.
↑ Elsayed, Gamaleldin F.; Shankar, Shreya; Cheung, Brian; Papernot, Nicolas; Kurakin, Alex; Goodfellow, Ian; Sohl-Dickstein, Jascha (February 22, 2018). "Adversarial Examples that Fool both Computer Vision and Time-Limited Humans" (PDF). Advances in Neural Information Processing Systems 31 (NeurIPS 2018). arXiv: 1802.08195 . OCLC 1106289156.
↑ Man with restored sight provides new insight into how vision develops
↑ Out Of Darkness, Sight: Rare Cases Of Restored Vision Reveal How The Brain Learns To See
↑ Poggio, Tomaso (1981). "Marr's Computational Approach to Vision". Trends in Neurosciences. 4: 258–262. doi:10.1016/0166-2236(81)90081-3. S2CID 53163190.
↑ Marr, D (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. MIT Press.^{[ page needed ]}
↑ Rock, Irvin; DiVita, Joseph (1987). "A case of viewer-centered object perception" . Cognitive Psychology. 19 (2): 280–293. doi:10.1016/0010-0285(87)90013-2. PMID 3581759. S2CID 40154873.
↑ Pizlo, Zygmunt; Stevenson, Adam K. (1999). "Shape constancy from novel views". Perception & Psychophysics. 61 (7): 1299–1307. doi: 10.3758/BF03206181 . ISSN 0031-5117. PMID 10572459. S2CID 8041318.
↑ 3D Shape, Z. Pizlo (2008) MIT Press
↑ Zhaoping, Li (2014). Understanding vision: theory, models, and data. United Kingdom: Oxford University Press. ISBN 978-0199564668.
↑ Zhaoping, L (2019). "A new framework for understanding vision from the perspective of the primary visual cortex" . Current Opinion in Neurobiology. 58: 1–10. doi:10.1016/j.conb.2019.06.001. PMID 31271931. S2CID 195806018.
↑ Hecht, Selig (April 1, 1937). "Rods, Cones, and the Chemical Basis of Vision". Physiological Reviews. 17 (2): 239–290. doi:10.1152/physrev.1937.17.2.239. ISSN 0031-9333.
↑ Carlson, Neil R. (2013). "6". Physiology of Behaviour (11th ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. p. 170. ISBN 978-0-205-23939-9.
↑ Carlson, Neil R.; Heth, C. Donald (2010). "5". Psychology the science of behaviour (2nd ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. pp. 138–145. ISBN 978-0-205-64524-4.
↑ Bhatt, Mehul; Suchan, Jakob (September 2022). "Artificial Visual Intelligence: Perceptual Commonsense for Human-Centred Cognitive Technologies". ResearchGate.

External links

The Organization of the Retina and Visual System
Effect of Detail on Visual Perception by Jon McLoone, the Wolfram Demonstrations Project
The Joy of Visual Perception, resource on the eye's perception abilities.
VisionScience: Resource for Research in Human and Animal Vision, a collection of resources in vision science and perception
Vision and Psychophysics
Vision, Scholarpedia: expert articles about vision
What are the limits of human vision?

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "Vision". University of Hawaii. August 22, 2012. Retrieved February 15, 2025.

[2] Ballesteros, Soledad (1994). Human Perception: Cognitive Approaches. Psychology Press (published October 10, 2016).

[3] Sekuler, Allison; Palmer, Stephen (March 1992). "Perception of Partly Occluded Objects: A Microgenetic Analysis". ProQuest. ProQuest 213789725.

[4] Sadun, Alfredo A.; Johnson, Betty M.; Smith, Lois E. H. (1986). "Neuroanatomy of the human visual system: Part II Retinal projections to the superior colliculus and pulvinar" . Neuro-Ophthalmology. 6 (6): 363–370. doi:10.3109/01658108609016476. ISSN 0165-8107.

[Carlson_2013_187-189-5] Carlson, Neil R. (2013). "6". Physiology of Behaviour (11th ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. pp. 187–189. ISBN 978-0-205-23939-9.

[Finger-6] 1 2 3 Finger, Stanley (1994). Origins of neuroscience: a history of explorations into brain function. Oxford [Oxfordshire]: Oxford University Press. pp. 67–69. ISBN 978-0-19-506503-9. OCLC 27151391.

[7] Ibn al-Hayṯam, al-Ḥasan ibn al-Ḥasan; Ṣabrah, ʿAbd al-Ḥamīd (1989). The Optics of Ibn Al-Haytham. Studies of the Warburg Institute. London: Warburg Institute, University of London. ISBN 978-0-85481-072-7.

[8] Sarton, George (1931). Introduction to the History of Science, Volume II: From Rabbi Ben Ezra to Roger Bacon. Carnegie Institution of Washington Publication. Baltimore: The Williams & Wilkins Company Baltimore. pp. 509, 762.

[9] Lindberg, David Charles (1981). Theories of vision from al-Kindi to Kepler. University of Chicago history of science and medicine. Chicago London: University of Chicago press. ISBN 978-0-226-48235-4.

[Howard-10] Howard, I (1996). "Alhazen's neglected discoveries of visual phenomena". Perception. 25 (10): 1203–1217. doi:10.1068/p251203. PMID 9027923. S2CID 20880413.

[Khaleefa-11] Khaleefa, Omar (1999). "Who Is the Founder of Psychophysics and Experimental Psychology?". American Journal of Islamic Social Sciences. 16 (2): 1–26. doi: 10.35632/ajis.v16i2.2126 .

[12] Adamson, Peter (July 7, 2016). Philosophy in the Islamic World: A History of Philosophy Without Any Gaps. Oxford University Press. p. 77. ISBN 978-0-19-957749-1.

[13] Keele, Kd (1955). "Leonardo da Vinci on vision". Proceedings of the Royal Society of Medicine. 48 (5): 384–390. doi:10.1177/003591575504800512. ISSN 0035-9157. PMC 1918888 . PMID 14395232.

[Margaret._2008-14] Margaret, Livingstone (2008). Vision and art : the biology of seeing. Hubel, David H. New York: Abrams. ISBN 978-0-8109-9554-3. OCLC 192082768.

[vonHelmholtz1867-15] von Helmholtz, Hermann (1925). Handbuch der physiologischen Optik. Vol. 3. Leipzig: Voss. Archived from the original on September 27, 2018. Retrieved December 14, 2016.

[16] Hunziker, Hans-Werner (2006). Im Auge des Lesers: foveale und periphere Wahrnehmung – vom Buchstabieren zur Lesefreude [In the eye of the reader: foveal and peripheral perception – from letter recognition to the joy of reading]. Zürich: Transmedia Stäubli Verlag. ISBN 978-3-7266-0068-6.^{[ page needed ]}

[17] Stone, JV (2011). "Footprints sticking out of the sand. Part 2: children's Bayesian priors for shape and lighting direction" (PDF). Perception. 40 (2): 175–90. doi:10.1068/p6776. PMID 21650091. S2CID 32868278.

[18] Mamassian, Pascal; Landy, Michael; Maloney, Laurence T. (2002). "Bayesian Modelling of Visual Perception". In Rao, Rajesh P. N.; Olshausen, Bruno A.; Lewicki, Michael S. (eds.). Probabilistic Models of the Brain: Perception and Neural Function. Neural Information Processing. MIT Press. pp. 13–36. ISBN 978-0-262-26432-7.

[19] "A Primer on Probabilistic Approaches to Visual Perception". Archived from the original on July 10, 2006. Retrieved October 14, 2010.

[Gestalt_and_Vision-20] Wagemans, Johan (November 2012). "A Century of Gestalt Psychology in Visual Perception". Psychological Bulletin. 138 (6): 1172–1217. CiteSeerX 10.1.1.452.8394 . doi:10.1037/a0029333. PMC 3482144 . PMID 22845751.

[21] Wagemans, Johan; Elder, James H.; Kubovy, Michael; Palmer, Stephen E.; Peterson, Mary A.; Singh, Manish; von der Heydt, Rüdiger (November 2012). "A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure-ground organization". Psychological Bulletin. 138 (6): 1172–1217. Bibcode:2012PsycB.138.1172W. doi:10.1037/a0029333. ISSN 1939-1455. PMC 3482144 . PMID 22845751.

[22] ETC: A Review of General Semantics. " Visual Language" by Colin Murray Turbayne, Vol. 28. No. 1 (March 1971) p. 1 International Society for General Semantics on JSTOR.org

[23] The Myth of Metaphor. Turbayne, Colin Murray. Yale University Press, London (1962) Introduction P. 6, pp.101-129 on hathitrust.org

[24] The Myth of Metaphor. Turbayne, Colin Murray. Yale University Press London (1962) Part III pp. 141-219 on hathitrust.org

[25] ETC: A Review of General Semantics. "Visual Language" by Colin Murray Turbayne Vol. 28. No. 1 (March 1971) p.51-58 International Society for General Semantics on JSTOR.org

[26] Myers, C. Mason (1964). "Reviewed work: The Myth of Metaphor, Colin Murray Turbayne" . The Philosophical Review. 73 (4): 549–552. doi:10.2307/2183311. JSTOR 2183311.

[Taylor,_1965-27] Taylor, Stanford E. (November 1965). "Eye Movements in Reading: Facts and Fallacies". American Educational Research Journal. 2 (4): 187–202. doi:10.2307/1161646. JSTOR 1161646.

[28] Yarbus, A. L. (1967). Eye movements and vision, Plenum Press, New York^{[ page needed ]}

[29] Hunziker, H. W. (1970). "Visuelle Informationsaufnahme und Intelligenz: Eine Untersuchung über die Augenfixationen beim Problemlösen" [Visual information acquisition and intelligence: A study of the eye fixations in problem solving]. Schweizerische Zeitschrift für Psychologie und Ihre Anwendungen (in German). 29 (1/2).^{[ page needed ]}

[30] Cohen, A. S. (1983). "Informationsaufnahme beim Befahren von Kurven, Psychologie für die Praxis 2/83" [Information recording when driving on curves, psychology in practice 2/83]. Bulletin der Schweizerischen Stiftung für Angewandte Psychologie.^{[ page needed ]}

[31] Purves, Dale; Augustine, George J.; Fitzpatrick, David; Katz, Lawrence C.; LaMantia, Anthony-Samuel; McNamara, James O.; Williams, S. Mark (2001), "Types of Eye Movements and Their Functions", Neuroscience. 2nd edition, Sinauer Associates, retrieved July 1, 2025

[32] Liu, Meng; Zhan, Jiayu; Wang, Lihui (September 20, 2024). "Specified functions of the first two fixations in face recognition: Sampling the general-to-specific facial information". iScience. 27 (9) 110686. Bibcode:2024iSci...27k0686L. doi:10.1016/j.isci.2024.110686. ISSN 2589-0042. PMC 11378928 . PMID 39246447.

[33] Carlson, Neil R.; Heth, C. Donald; Miller, Harold; Donahoe, John W.; Buskist, William; Martin, G. Neil; Schmaltz, Rodney M. (2009). Psychology the Science of Behaviour . Toronto Ontario: Pearson Canada. pp. 140–1. ISBN 978-0-205-70286-2.

[PMID_23965118-34] 1 2 Moscovitch, Morris; Winocur, Gordon; Behrmann, Marlene (1997). "What Is Special about Face Recognition? Nineteen Experiments on a Person with Visual Object Agnosia and Dyslexia but Normal Face Recognition". Journal of Cognitive Neuroscience. 9 (5): 555–604. doi:10.1162/jocn.1997.9.5.555. PMID 23965118. S2CID 207550378.

[35] Yin, Robert K. (1969). "Looking at upside-down faces". Journal of Experimental Psychology. 81 (1): 141–5. doi:10.1037/h0027474.

[36] Kanwisher, Nancy; McDermott, Josh; Chun, Marvin M. (June 1997). "The fusiform face area: a module in human extrastriate cortex specialized for face perception". The Journal of Neuroscience. 17 (11): 4302–11. doi:10.1523/JNEUROSCI.17-11-04302.1997. PMC 6573547 . PMID 9151747.

[37] Gauthier, Isabel; Skudlarski, Pawel; Gore, John C.; Anderson, Adam W. (February 2000). "Expertise for cars and birds recruits brain areas involved in face recognition". Nature Neuroscience . 3 (2): 191–7. doi:10.1038/72140. PMID 10649576. S2CID 15752722.

[38] Chang, Le; Tsao, Doris Y. (June 1, 2017). "The Code for Facial Identity in the Primate Brain". Cell. 169 (6): 1013–1028.e14. doi: 10.1016/j.cell.2017.05.011 . ISSN 0092-8674. PMC 8088389 . PMID 28575666.

[39] "How the brain distinguishes between objects". MIT News. March 13, 2019. Retrieved October 10, 2019.

[40] Srivastava, Sanjana; Ben-Yosef, Guy; Boix, Xavier (February 8, 2019). Minimal Images in Deep Neural Networks: Fragile Object Recognition in Natural Images. arXiv: 1902.03227 . OCLC 1106329907.

[41] Ben-Yosef, Guy; Assif, Liav; Ullman, Shimon (February 2018). "Full interpretation of minimal images". Cognition. 171: 65–84. doi:10.1016/j.cognition.2017.10.006. hdl: 1721.1/106887 . ISSN 0010-0277. PMID 29107889. S2CID 3372558.

[42] Elsayed, Gamaleldin F.; Shankar, Shreya; Cheung, Brian; Papernot, Nicolas; Kurakin, Alex; Goodfellow, Ian; Sohl-Dickstein, Jascha (February 22, 2018). "Adversarial Examples that Fool both Computer Vision and Time-Limited Humans" (PDF). Advances in Neural Information Processing Systems 31 (NeurIPS 2018). arXiv: 1802.08195 . OCLC 1106289156.

[43] Man with restored sight provides new insight into how vision develops

[44] Out Of Darkness, Sight: Rare Cases Of Restored Vision Reveal How The Brain Learns To See

[45] Poggio, Tomaso (1981). "Marr's Computational Approach to Vision". Trends in Neurosciences. 4: 258–262. doi:10.1016/0166-2236(81)90081-3. S2CID 53163190.

[Marr-46] Marr, D (1982). Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. MIT Press.^{[ page needed ]}

[47] Rock, Irvin; DiVita, Joseph (1987). "A case of viewer-centered object perception" . Cognitive Psychology. 19 (2): 280–293. doi:10.1016/0010-0285(87)90013-2. PMID 3581759. S2CID 40154873.

[48] Pizlo, Zygmunt; Stevenson, Adam K. (1999). "Shape constancy from novel views". Perception & Psychophysics. 61 (7): 1299–1307. doi: 10.3758/BF03206181 . ISSN 0031-5117. PMID 10572459. S2CID 8041318.

[49] 3D Shape, Z. Pizlo (2008) MIT Press

[50] Zhaoping, Li (2014). Understanding vision: theory, models, and data. United Kingdom: Oxford University Press. ISBN 978-0199564668.

[51] Zhaoping, L (2019). "A new framework for understanding vision from the perspective of the primary visual cortex" . Current Opinion in Neurobiology. 58: 1–10. doi:10.1016/j.conb.2019.06.001. PMID 31271931. S2CID 195806018.

[52] Hecht, Selig (April 1, 1937). "Rods, Cones, and the Chemical Basis of Vision". Physiological Reviews. 17 (2): 239–290. doi:10.1152/physrev.1937.17.2.239. ISSN 0031-9333.

[Carlson_2013_170-53] Carlson, Neil R. (2013). "6". Physiology of Behaviour (11th ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. p. 170. ISBN 978-0-205-23939-9.

[Carlson_2010_138–145-54] 1 2 Carlson, Neil R.; Heth, C. Donald (2010). "5". Psychology the science of behaviour (2nd ed.). Upper Saddle River, New Jersey, US: Pearson Education Inc. pp. 138–145. ISBN 978-0-205-64524-4.

[55] Bhatt, Mehul; Suchan, Jakob (September 2022). "Artificial Visual Intelligence: Perceptual Commonsense for Human-Centred Cognitive Technologies". ResearchGate.
