Ensemble coding

Ensemble coding, also known as ensemble perception or summary representation, is a theory in cognitive neuroscience about the internal representation of groups of objects in the human mind. Ensemble coding proposes that such information is recorded via summary statistics, particularly the average or variance. Experimental evidence tends to support the theory for low-level visual information, such as shapes and sizes, as well as some high-level features such as face gender. Nonetheless, the extent to which ensemble coding applies to high-level or non-visual stimuli remains unclear, and the theory remains the subject of active research.

Theory

Extensive amounts of information are available to the visual system. Ensemble coding is a theory that suggests that people process the general gist of their complex visual surroundings by grouping objects together based on shared properties. The world is filled with redundant information, to which the human visual system has become particularly sensitive. [1] [2] The brain exploits this redundancy and condenses the information. For example, the leaves of a tree or blades of grass give rise to the percept of 'tree-ness' and 'lawn-ness'. [3] It has been demonstrated that individuals can quickly and accurately encode ensembles of objects, like leaves on a tree, and gather summary statistical information (like the mean and variance) from groups of stimuli. [4] [5] Some research suggests that this process provides rough visual information from the entire visual field, giving rise to a complete and accurate picture of the visual world. [6] [7] Although the individual details of this accurate picture might be inaccessible, the 'gist' of the scene remains accessible. [3] Ensemble coding is an adaptive process that lightens the cognitive load in the processing and storing of visual representations through the use of heuristics. [7] [8]
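As an illustration of what a summary representation keeps and discards, the following minimal Python sketch reduces a set of item sizes to a mean and a variance. The function name `summarize_ensemble` and the leaf sizes are hypothetical values chosen for the example. The sketch shows only the statistics themselves and makes no claim about how the visual system computes them.

```python
from statistics import fmean, pvariance


def summarize_ensemble(feature_values):
    """Return (mean, variance) of one feature across an ensemble of items.

    The individual values are discarded after summarizing, mirroring the
    idea that the 'gist' survives while item-level detail does not.
    """
    values = list(feature_values)
    if not values:
        raise ValueError("ensemble must contain at least one item")
    # Population variance: the ensemble is treated as the whole set of
    # interest, not as a sample from a larger population (an assumption).
    return fmean(values), pvariance(values)


# Hypothetical leaf sizes in centimetres, for illustration only.
leaf_sizes_cm = [4.0, 5.0, 6.0, 5.0]
mean_size, size_variance = summarize_ensemble(leaf_sizes_cm)
print(mean_size, size_variance)
```

Two numbers stand in for the whole set here, which is the sense in which a summary representation is said to lighten cognitive load.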

Operational definition

David Whitney and Allison Yamanashi Leib have developed an operational and flexible definition stating that ensemble coding should cover the following five concepts: [1]

Opposing theories

Some research has found evidence that runs counter to the theory of ensemble coding.

Limited visual capacity

Vision science has noted that although humans take in large amounts of visual information, adults are only able to process, attend to, and retain up to roughly four items from the visual environment. [9] [10] Furthermore, scientists have found that this upper limit on visual capacity exists across various phenomena including change blindness, [11] [12] object tracking, [13] and feature representation. [10]

Low resolution representations and limited capacity

Additional theories in vision science propose that stimuli are represented in the brain individually as small, low-resolution icons that are stored in templates with limited capacities and organized through associative links. [14] [15]

History

Throughout its history, ensemble coding has been known by many names. Interest in the theory began to emerge in the early 20th century. [8] In its earliest years, ensemble coding was known as Gestalt grouping. [8] In 1923, Max Wertheimer, a Gestalt psychology theorist, addressed how humans perceive their visual world holistically rather than individually. [16] Gestaltists argued that in object perception, the individual object features were either lost or difficult to perceive and therefore the grouped object was the favored percept. [17] Although Gestaltists helped define some of the central principles of object perception, research into modern ensemble coding did not occur until many years later. [citation needed]

In 1971, Norman Anderson was one of the earliest to conduct explicit ensemble coding research. [3] [18] Anderson's research into social ensemble coding showed that individuals described by two positive terms were rated more favorably than individuals described by two positive terms and two negative terms. [19] This research on impression formation demonstrated that a weighted mean, or average, captures how information is integrated better than summation does. [19] Additional research during this time explored ensemble coding in group attractiveness, [20] shopping preferences, [21] and the perceived badness of criminals. [22]
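The difference between an averaging rule and a summation rule can be made concrete with a minimal Python sketch. The function names, the numeric scale values, and the equal default weights are all assumptions made for illustration and are not taken from Anderson's data. The two rules disagree most clearly when mildly positive terms are added to strongly positive ones, a hypothetical contrast used here because a sum then rises while an average falls.

```python
def adding_model(scale_values):
    """Summation rule: the impression is the total of all term values."""
    return float(sum(scale_values))


def averaging_model(scale_values, weights=None):
    """Weighted-mean rule: the impression is the weighted average value."""
    if weights is None:
        weights = [1.0] * len(scale_values)  # equal weights (assumption)
    if len(weights) != len(scale_values):
        raise ValueError("weights and scale_values must have equal length")
    total_weight = sum(weights)
    return sum(w * s for w, s in zip(weights, scale_values)) / total_weight


# Hypothetical favorability values on an arbitrary scale.
strong_only = [3.0, 3.0]                # two strongly positive terms
strong_plus_mild = [3.0, 3.0, 1.0, 1.0]  # plus two mildly positive terms

print(adding_model(strong_only), adding_model(strong_plus_mild))
print(averaging_model(strong_only), averaging_model(strong_plus_mild))
```

Under the summation rule the extra mild terms raise the predicted impression, whereas under the averaging rule they lower it, so ratings of such descriptions can in principle distinguish the two accounts.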

The current era

Findings by Dan Ariely in 2001 were the first data to support the modern theories of ensemble coding. Ariely used novel experimental paradigms, which he labeled "mean discrimination" and "member identification", to examine how sets of objects are perceived. He conducted three studies involving shape ensembles that varied in size. Across all studies, participants were able to accurately encode the mean size of the ensemble of objects, but they were inaccurate when asked if a certain object was a part of the set. Ariely's findings were the first to show that statistical summary information emerges in the visual perception of grouped objects. [23]

Consistent with Ariely's findings, [23] follow-up research conducted by Sang Chul Chong and Anne Treisman in 2003 provided evidence that participants engage in summary statistical processes. Their research revealed that participants maintained high accuracy in encoding the mean size of the stimuli despite stimulus presentations as short as 50 milliseconds, memory delays, and differences in object distribution. [24]

Additional research has demonstrated that ensemble coding is not limited to the mean size of objects in the ensemble, [23] but that additional content is extracted, such as average line orientation, [25] average spatial location, [26] and average number, [27] as well as statistical summaries such as variance. [28] Observers are also able to extract accurate perceptual summaries of high-level features such as the average direction of eye gaze of grouped faces [29] and the average walking direction of a crowd. [30]

Levels of ensemble coding

People have the ability to encode ensembles of objects along various dimensions. [1] These dimensions have been divided into levels that vary from low-level to high-level feature information.

Low-level feature information

Low-level ensemble coding has been observed in various psychophysical areas of research. For example, people accurately perceive the average size of objects, [24] motion direction of grouped dots, [31] [32] number, [27] line orientation, [25] and spatial location. [26] [1]

High-level feature information

High-level ensemble coding extends to more complex, higher-level objects, including faces. [1] [3]

Independence of low- and high-level information

Some findings suggest that lower-level and higher-level information may be processed by independent cognitive mechanisms. [33] [34]

Social vision and ensemble coding

Based on the early work of Anderson, [18] it appears that humans integrate semantic as well as social information into memory using ensemble coding. These findings suggest that social processes may hinge on the same sort of underlying mechanisms that allow people to perceive average object orientation [25] and average object direction of motion. [31] [32] [3]

In recent years, ensemble coding has emerged in the field of social vision. Social vision is a field of research that examines how people perceive one another. With the addition of ensemble coding, the field is able to explore people perception, or how people perceive groups of other people. This specific research area focuses on how observers accurately perceive and extract social information from groups and how that extracted information influences downstream judgments and behaviors. [35] In 2018, seminal research introducing the use of ensemble coding in the field of social vision was conducted by Brianna Goodale. Goodale's research found that humans can accurately extract sex ratio summaries from ensembles of faces and that this sex ratio provides an early visual cue signaling a sense of belonging and fit within a group. [35] Specifically, this research found that participants felt a stronger sense of belonging to a given ensemble as the number of members of their own sex in the perceived ensemble increased. [35]

Additional research has found that participants are able to derive the average sex ratio of an ensemble of faces in as little as 75 milliseconds. [4] Furthermore, within those 75 milliseconds, participants were able to form impressions based on the perceived sex ratio and make inferences about the group's perceived threat. [4] Specifically, this research found that groups were judged as more threatening as the ratio of men to women increased. [4]

In 2023, researchers found that people can accurately gauge the average trustworthiness of multiple faces presented together, even at very brief exposure times (as short as 250 ms). The findings suggest that our brains efficiently extract a summary statistic of facial features from crowds, enabling quick social judgments that may influence behavior. [36]

References

  1. Whitney D, Yamanashi Leib A (January 2018). "Ensemble Perception". Annual Review of Psychology. 69 (1): 105–129. doi:10.1146/annurev-psych-010416-044232. PMID 28892638. S2CID 39630841.
  2. Whitney D, Haberman J, Sweeny T (2014). "From textures to crowds: multiple levels of summary statistical perception". In Werner JS, Chalupa LM (eds.). The New Visual Neuroscience. Cambridge, MA: MIT Press. pp. 695–710.
  3. Haberman J, Whitney D (May 2012). "Ensemble Perception". In Wolfe J, Robertson L (eds.). From Perception to Consciousness. Oxford University Press. pp. 339–349. doi:10.1093/acprof:osobl/9780199734337.003.0030. ISBN 978-0-19-973433-7.
  4. Alt NP, Goodale B, Lick DJ, Johnson KL (March 2019). "Threat in the Company of Men: Ensemble Perception and Threat Evaluations of Groups Varying in Sex Ratio". Social Psychological and Personality Science. 10 (2): 152–159. doi:10.1177/1948550617731498. S2CID 149407595.
  5. Alvarez GA (March 2011). "Representing multiple objects as an ensemble enhances visual cognition". Trends in Cognitive Sciences. 15 (3): 122–31. doi:10.1016/j.tics.2011.01.003. PMID 21292539. S2CID 2752461.
  6. Chong SC, Treisman A (February 2003). "Representation of statistical properties". Vision Research. 43 (4): 393–404. doi:10.1016/S0042-6989(02)00596-5. PMID 12535996.
  7. Haberman J, Whitney D (June 2009). "Seeing the mean: ensemble coding for sets of faces". Journal of Experimental Psychology. Human Perception and Performance. 35 (3): 718–34. doi:10.1037/a0013899. PMC 2696629. PMID 19485687.
  8. Wolfe J, Robertson L (December 2011). From Perception to Consciousness: Searching with Anne Treisman. Oxford University Press. ISBN 978-0-19-990984-1.
  9. Alvarez GA, Cavanagh P (February 2004). "The capacity of visual short-term memory is set both by visual information load and by number of objects". Psychological Science. 15 (2): 106–11. doi:10.1111/j.0963-7214.2004.01502006.x. PMID 14738517. S2CID 2286443.
  10. Luck SJ, Vogel EK (November 1997). "The capacity of visual working memory for features and conjunctions". Nature. 390 (6657): 279–81. Bibcode:1997Natur.390..279L. doi:10.1038/36846. PMID 9384378. S2CID 205025290.
  11. O'Regan JK, Deubel H, Clark JJ, Rensink RA (2000-01-01). "Picture Changes During Blinks: Looking Without Seeing and Seeing Without Looking". Visual Cognition. 7 (1–3): 191–211. doi:10.1080/135062800394766. ISSN 1350-6285. S2CID 18034759.
  12. Simons DJ, Chabris CF (1999-09-01). "Gorillas in our midst: sustained inattentional blindness for dynamic events". Perception. 28 (9): 1059–74. doi:10.1068/p281059. PMID 10694957. S2CID 1073781.
  13. Scholl BJ, Pylyshyn ZW (March 1999). "Tracking multiple items through occlusion: clues to visual objecthood". Cognitive Psychology. 38 (2): 259–90. doi:10.1006/cogp.1998.0698. PMID 10090804. S2CID 17447994.
  14. Nakayama K (1993-05-13). "The iconic bottleneck and the tenuous link between early visual processing and perception". In Adler K, Pointon M (eds.). Vision: Coding and efficiency. Cambridge University Press. ISBN 978-0-521-44769-0.
  15. Neisser U (1967). Cognitive Psychology. New York: Appleton-Cent.
  16. Wertheimer M (January 1923). "Untersuchungen zur Lehre von der Gestalt. II" [Investigations into the teaching of the form]. Psychological Research (in German). 4 (1): 301–50. doi:10.1007/BF00410640. S2CID 143510308.
  17. Koffka, K. (1935). The Principles of Gestalt Psychology. London: Routledge and Kegan Paul Ltd.
  18. Anderson, Norman H. (1971). "Integration theory and attitude change". Psychological Review. 78 (3): 171–206. doi:10.1037/h0030834. ISSN 0033-295X.
  19. Anderson, Norman H. (1965). "Averaging versus adding as a stimulus-combination rule in impression formation". Journal of Experimental Psychology. 70 (4): 394–400. doi:10.1037/h0022280. ISSN 0022-1015. PMID 5826027.
  20. Anderson, N. H., Lindner, R., & Lopes, L. L. (1973). Integration Theory Applied to Judgments of Group Attractiveness. Journal of Personality and Social Psychology, 26(3), 400-408.
  21. Levin, I. P. (1974). Averaging Processes in Ratings and Choices Based on Numerical Information. Memory & Cognition, 2(4), 786-790.
  22. Leon, M., Oden, G. C., & Anderson, N. H. (1973). Functional Measurement of Social Values. Journal of Personality and Social Psychology, 27(3), 301-310.
  23. Ariely D (March 2001). "Seeing sets: representation by statistical properties". Psychological Science. 12 (2): 157–62. doi:10.1111/1467-9280.00327. JSTOR 40063604. PMID 11340926. S2CID 6435925.
  24. Chong SC, Treisman A (February 2003). "Representation of statistical properties". Vision Research. 43 (4): 393–404. doi:10.1016/S0042-6989(02)00596-5. PMID 12535996.
  25. Dakin SC, Watt RJ (November 1997). "The computation of orientation statistics from visual texture". Vision Research. 37 (22): 3181–92. doi:10.1016/S0042-6989(97)00133-8. PMID 9463699.
  26. Alvarez GA, Oliva A (April 2008). "The representation of simple ensemble visual features outside the focus of attention". Psychological Science. 19 (4): 392–8. doi:10.1111/j.1467-9280.2008.02098.x. PMC 2587223. PMID 18399893.
  27. Halberda J, Sires SF, Feigenson L (July 2006). "Multiple spatially overlapping sets can be enumerated in parallel". Psychological Science. 17 (7): 572–6. doi:10.1111/j.1467-9280.2006.01746.x. PMID 16866741. S2CID 18182572.
  28. Solomon JA, Morgan M, Chubb C (October 2011). "Efficiencies for the statistics of size discrimination". Journal of Vision. 11 (12): 13. doi:10.1167/11.12.13. PMC 4135075. PMID 22011381.
  29. Sweeny, Timothy D.; Whitney, David (October 2014). "Perceiving Crowd Attention: Ensemble Perception of a Crowd's Gaze". Psychological Science. 25 (10): 1903–1913. doi:10.1177/0956797614544510. ISSN 0956-7976. PMC 4192023. PMID 25125428.
  30. Sweeny, Timothy D.; Haroz, Steve; Whitney, David (2013). "Perceiving group behavior: Sensitive ensemble coding mechanisms for biological motion of human crowds". Journal of Experimental Psychology: Human Perception and Performance. 39 (2): 329–337. doi:10.1037/a0028712. ISSN 1939-1277. PMID 22708744.
  31. Watamaniuk SN, Sekuler R, Williams DW (1989-01-01). "Direction perception in complex dynamic displays: the integration of direction information". Vision Research. 29 (1): 47–59. doi:10.1016/0042-6989(89)90173-9. PMID 2773336. S2CID 11379304.
  32. Watamaniuk SN, McKee SP (February 1998). "Simultaneous encoding of direction at a local and global scale". Perception & Psychophysics. 60 (2): 191–200. doi:10.3758/BF03206028. PMID 9529903.
  33. Haberman, Jason; Brady, Timothy F; Alvarez, George A (August 2014). "Independent ensemble processing mechanisms for high-level and low-level perceptual features". Journal of Vision. 14 (10): 1322. doi:10.1167/14.10.1322.
  34. Sama, Marco A; Nestor, Adrian; Cant, Jonathan S (May 2019). "Independence of viewpoint and identity in face ensemble processing". Journal of Vision. 19 (5): 2. doi:10.1167/19.5.2. S2CID 145822839.
  35. Goodale, Brianna M.; Alt, Nicholas P.; Lick, David J.; Johnson, Kerri L. (November 2018). "Groups at a glance: Perceivers infer social belonging in a group based on perceptual summaries of sex ratio". Journal of Experimental Psychology: General. 147 (11): 1660–1676. doi:10.1037/xge0000450. ISSN 1939-2222. PMID 30372114.
  36. Dolan, Eric W. (2024-02-24). "Ensemble perception: Trust judgments of crowds of faces happen at the blink of an eye". PsyPost - Psychology News. Retrieved 2024-02-29.