Chernoff face

This example shows Chernoff faces for lawyers' ratings of twelve judges

Chernoff faces, invented by applied mathematician, statistician and physicist Herman Chernoff in 1973, display multivariate data in the shape of a human face. The individual parts, such as eyes, ears, mouth and nose, represent values of the variables by their shape, size, placement and orientation. The idea behind using faces is that humans easily recognize faces and notice small changes in them without difficulty. Chernoff faces handle each variable differently. Because the features of the faces vary in perceived importance, the way in which variables are mapped to the features should be chosen carefully (e.g. eye size and eyebrow slant have been found to carry significant weight). [1]
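The mapping from variables to facial features can be illustrated with a minimal sketch. It assumes each variable is min-max scaled to the range 0 to 1 and then interpolated linearly into a feature's drawing range. The feature names, their ranges and the function names below are hypothetical and are not taken from Chernoff's original specification.

```python
from typing import Dict, List, Tuple

# Hypothetical feature ranges (min, max) in arbitrary drawing units.
FEATURE_RANGES: Dict[str, Tuple[float, float]] = {
    "eye_size": (0.1, 0.5),
    "eyebrow_slant": (-30.0, 30.0),   # degrees
    "mouth_curvature": (-1.0, 1.0),   # -1 is a frown, +1 is a smile
    "nose_length": (0.2, 0.6),
}


def normalize_columns(rows: List[List[float]]) -> List[List[float]]:
    """Min-max scale each column to [0, 1]; constant columns map to 0.5."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [0.5 if hi == lo else (v - lo) / (hi - lo)
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def face_parameters(rows: List[List[float]],
                    mapping: List[str]) -> List[Dict[str, float]]:
    """Turn each data row into feature values.

    mapping[i] names the facial feature that encodes column i.
    """
    if len(set(mapping)) != len(mapping):
        raise ValueError("each feature can encode only one variable")
    faces = []
    for row in normalize_columns(rows):
        face = {}
        for value, feature in zip(row, mapping):
            lo, hi = FEATURE_RANGES[feature]
            face[feature] = lo + value * (hi - lo)  # linear interpolation
        faces.append(face)
    return faces
```

In this sketch, the order of `mapping` is where the analyst's choice enters: assigning the variable of greatest interest to a salient feature such as `eye_size` follows the advice above about perceived importance.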

Detail

Chernoff faces can themselves be plotted on a standard XY graph: each face is positioned according to the two most important variables, and the face itself represents the remaining dimensions for that item. Edward Tufte, presenting such a diagram, says that this kind of Chernoff-face graph would "reduce well, maintaining legibility even with individual areas of 0.05 square inches as shown ... with cartoon faces and even numbers becoming data measures, we would appear to have reached the limit of graphical economy of presentation, imagination, and let it be admitted, eccentricity". [2]
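The split between position and face can be sketched as follows. This is only an illustration: the record keys are hypothetical, and which two variables count as "most important" is left to the analyst.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]


def split_position_and_features(
    record: Record, x_key: str, y_key: str
) -> Tuple[Tuple[float, float], Record]:
    """Use two variables as plot coordinates; the rest drive the face."""
    position = (record[x_key], record[y_key])
    features = {k: v for k, v in record.items() if k not in (x_key, y_key)}
    return position, features


def layout(records: List[Record], x_key: str, y_key: str):
    """Return one (position, face_features) pair per record."""
    return [split_position_and_features(r, x_key, y_key) for r in records]


# Hypothetical records with four variables each.
items = [
    {"a": 1.0, "b": 2.0, "c": 0.3, "d": 0.9},
    {"a": 4.0, "b": 1.0, "c": 0.7, "d": 0.1},
]
placed = layout(items, x_key="a", y_key="b")
```

Each entry of `placed` pairs the coordinates at which a face would be drawn with the variables left over for its features.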

Extensions

Asymmetrical faces

In 1981, Bernhard Flury and Hans Riedwyl suggested "asymmetrical" Chernoff faces. [3] Since a face has vertical symmetry (around the y axis), the left side of the face is identical to the right and is basically wasted space, a point also made by Tufte. [4] The 18 variables that specify the left side could represent one set of data and a different set of data could specify the right side, allowing one face to depict 36 different measurements. They present results showing that such asymmetrical faces are useful in visualizing databases of identical twins, for example, and are as useful for grouping as pairs of Chernoff faces would be. [3]
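A minimal sketch of the asymmetrical idea, assuming each half of the face is driven by its own vector of 18 values already scaled to the range 0 to 1. The data structure and the `asymmetry` score are illustrative assumptions, not the construction published by Flury and Riedwyl.

```python
from typing import Dict, List, Sequence

N_PARAMS_PER_SIDE = 18  # parameters per half-face, as stated in the text


def asymmetric_face(left: Sequence[float],
                    right: Sequence[float]) -> Dict[str, List[float]]:
    """Combine two parameter vectors into one face, one vector per half."""
    if len(left) != N_PARAMS_PER_SIDE or len(right) != N_PARAMS_PER_SIDE:
        raise ValueError(f"each side needs {N_PARAMS_PER_SIDE} values")
    return {"left": list(left), "right": list(right)}


def asymmetry(face: Dict[str, List[float]]) -> float:
    """Mean absolute difference between the two halves (illustrative only).

    0.0 means the halves are identical, i.e. an ordinary symmetric face.
    """
    total = sum(abs(l - r) for l, r in zip(face["left"], face["right"]))
    return total / N_PARAMS_PER_SIDE
```

For the twin example, one twin's measurements would fill `left` and the other's `right`, so a pair that differs little yields a nearly symmetric face.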

Chernoff fish

Julie Rodriguez and Piotr Kaczmarek use "Chernoff fish", where various parts of a cartoon fish are used to encode different financial details. [5]

In literature

In Peter Watts' novel Blindsight (2006), a transhuman character is seen using a variant of Chernoff faces. This is explained by the character as a more efficient method of representing data, as a large portion of the human brain is devoted to facial recognition. [6]

In the 2014 sci-fi short story "Degrees of Freedom" by Karl Schroeder, Chernoff faces make a prominent appearance as a future technology, supporting the communication of aggregate sentiment and perspective. [7]

References

  1. Morris CJ, Ebert DS, Rheingans P (1999). "An Experimental Analysis of the Pre-Attentiveness of Features in Chernoff Faces". Published by the conference "Applied Imagery Pattern Recognition: 3D Visualization for Data Exploration and Decision Making". http://www.research.ibm.com/people/c/cjmorris/publications/Chernoff_990402.pdf doi:10.1117/12.384865
  2. Edward R. Tufte, The Visual Display of Quantitative Information, p. 142.
  3. Flury, Bernhard; Riedwyl, Hans (December 1981). "Graphical Representation of Multivariate Data by Means of Asymmetrical Faces". Journal of the American Statistical Association. 76 (376). American Statistical Association: 757–765. doi:10.2307/2287565. JSTOR 2287565.
  4. Edward R. Tufte, The Visual Display of Quantitative Information, p. 97: "Halves may be easier to sort (by matching the right half of an unsorted face to the left half of a sorted face) than full faces. Or else an asymmetrical full face can be used to report additional variables (Flury & Riedwyl 1981). Bilateral symmetry doubles the space consumed by the design in a graph, without adding new information."
  5. Visualizing Financial Data. John Wiley & Sons. 2 May 2016. ISBN 9781118907856.
  6. Peter Watts. "Blindsight". Retrieved 2021-04-19.
  7. Karl Schroeder. "End notes from Karl Schroeder's 'Degrees of Freedom'". Archived from the original on 2018-03-24. Retrieved 2018-03-23.
