# Statistical graphics

Last updated

Statistical graphics, also known as statistical graphical techniques, are graphics used in the field of statistics for data visualization.

## Overview

Whereas statistics and data analysis procedures generally yield their output in numeric or tabular form, graphical techniques allow such results to be displayed in some sort of pictorial form. They include plots such as scatter plots, histograms, probability plots, spaghetti plots, residual plots, box plots, block plots and biplots. [1]

Exploratory data analysis (EDA) relies heavily on such techniques. They can also provide insight into a data set to help with testing assumptions, model selection and regression model validation, estimator selection, relationship identification, factor effect determination, and outlier detection. In addition, the choice of appropriate statistical graphics can provide a convincing means of communicating the underlying message that is present in the data to others. [1]

Graphical statistical methods have four objectives: [2]

• The exploration of the content of a data set
• The use to find structure in data
• Checking assumptions in statistical models
• Communicate the results of an analysis.

If one is not using statistical graphics, then one is forfeiting insight into one or more aspects of the underlying structure of the data.

## History

Statistical graphics have been central to the development of science and date to the earliest attempts to analyse data. Many familiar forms, including bivariate plots, statistical maps, bar charts, and coordinate paper were used in the 18th century. Statistical graphics developed through attention to four problems: [3]

• Spatial organization in the 17th and 18th century
• Discrete comparison in the 18th and early 19th century
• Continuous distribution in the 19th century and
• Multivariate distribution and correlation in the late 19th and 20th century.

Since the 1970s statistical graphics have been re-emerging as an important analytic tool with the revitalisation of computer graphics and related technologies. [3]

## Examples

Famous graphics were designed by:

See the plots page for many more examples of statistical graphics.

## Related Research Articles

A chart is a graphical representation for data visualization, in which "the data is represented by symbols, such as bars in a bar chart, lines in a line chart, or slices in a pie chart". A chart can represent tabular numeric data, functions or some kinds of quality structure and provides different info.

Information design is the practice of presenting information in a way that fosters an efficient and effective understanding of the information. The term has come to be used for a specific area of graphic design related to displaying information effectively, rather than just attractively or for artistic expression. Information design is closely related to the field of data visualization and is often taught as part of graphic design courses. The broad applications of information design along with its close connections to other fields of design and communication practices have created some overlap in the definitions of communication design, data visualization, and information architecture.

Edward Rolf Tufte, sometimes known as "ET", is an American statistician and professor emeritus of political science, statistics, and computer science at Yale University. He is noted for his writings on information design and as a pioneer in the field of data visualization.

A small multiple is a series of similar graphs or charts using the same scale and axes, allowing them to be easily compared. It uses multiple views to show different partitions of a dataset. The term was popularized by Edward Tufte.

Scientific visualization is an interdisciplinary branch of science concerned with the visualization of scientific phenomena. It is also considered a subset of computer graphics, a branch of computer science. The purpose of scientific visualization is to graphically illustrate scientific data to enable scientists to understand, illustrate, and glean insight from their data.

Visualization or visualisation is any technique for creating images, diagrams, or animations to communicate a message. Visualization through visual imagery has been an effective way to communicate both abstract and concrete ideas since the dawn of humanity. Examples from history include cave paintings, Egyptian hieroglyphs, Greek geometry, and Leonardo da Vinci's revolutionary methods of technical drawing for engineering and scientific purposes.

A pie chart is a circular statistical graphic, which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice, is proportional to the quantity it represents. While it is named for its resemblance to a pie which has been sliced, there are variations on the way it can be presented. The earliest known pie chart is generally credited to William Playfair's Statistical Breviary of 1801.

Chartjunk refers to all visual elements in charts and graphs that are not necessary to comprehend the information represented on the graph, or that distract the viewer from this information.

Infographics are graphic visual representations of information, data, or knowledge intended to present information quickly and clearly. They can improve cognition by utilizing graphics to enhance the human visual system's ability to see patterns and trends. Similar pursuits are information visualization, data visualization, statistical graphics, information design, or information architecture. Infographics have evolved in recent years to be for mass communication, and thus are designed with fewer assumptions about the readers' knowledge base than other types of visualizations. Isotypes are an early example of infographics conveying information quickly and easily to the masses.

Data visualization is an interdisciplinary field that deals with the graphic representation of data. It is a particularly efficient way of communicating when the data is numerous as for example a Time Series. From an academic point of view, this representation can be considered as a mapping between the original data and graphic elements. The mapping determines how the attributes of these elements vary according to the data. In this light, a bar chart is a mapping of the length of a bar to a magnitude of a variable. Since the graphic design of the mapping can adversely affect the readability of a chart, mapping is a core competency of Data visualization. Data visualization has its roots in the field of Statistics and is therefore generally considered a branch of Descriptive Statistics. However, because both design skills and statistical and computing skills are required to visualize effectively, it is argued by some authors that it is both an Art and a Science.

Charles Joseph Minard was a French civil engineer recognized for his significant contribution in the field of information graphics in civil engineering and statistics. Minard was, among other things, noted for his representation of numerical data on geographic maps, especially his flow maps.

A thematic map is a type of map that portrays the geographic pattern of a particular subject matter (theme) in a geographic area. This usually involves the use of map symbols to visualize selected properties of geographic features that are not naturally visible, such as temperature, language, or population. In this, they contrast with general reference maps, which focus on the location of a diverse set of physical features, such as rivers, roads, and buildings. Alternative names have been suggested for this class, such as special-subject or special-purpose maps, statistical maps, or distribution maps, but these have generally fallen out of common usage. Thematic mapping is closely allied with the field of Geovisualization.

A bivariate or multivariate map is a type of thematic map that displays two or more variables on a single map by combining different sets of symbols. Each of the variables is represented using a standard thematic map technique, such as choropleth, cartogram, or proportional symbols. They may be the same type or different types, and they may be on separate layers of the map, or they may be combined into a single multivariate symbol.

Michael Louis Friendly is an American psychologist, Professor of Psychology at York University in Ontario, Canada, and director of its Statistical Consulting Service, especially known for his contributions to graphical methods for categorical and multivariate data, and on the history of data and information visualisation.

A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. The plot can be drawn by hand or by a computer. In the past, sometimes mechanical or electronic plotters were used. Graphs are a visual representation of the relationship between variables, which are very useful for humans who can then quickly derive an understanding which may not have come from lists of values. Given a scale or ruler, graphs can also be used to read off the value of an unknown variable plotted as a function of a known one, but this can also be done with data presented in tabular form. Graphs of functions are used in mathematics, sciences, engineering, technology, finance, and other areas.

Seasonal subseries plots are a graphical tool to visualize and detect seasonality in a time series. Seasonal subseries plots involves the extraction of the seasons from a time series into a subseries. Based on a selected periodicity, it is an alternative plot that emphasizes the seasonal patterns are where the data for each season are collected together in separate mini time plots.

A motion chart is a dynamic bubble chart which allows efficient and interactive exploration and visualization of longitudinal multivariate Data. Motion charts provide mechanisms for mapping ordinal, nominal and quantitative variables onto time, 2D coordinate axes, size, colors, glyphs and appearance characteristics, which facilitate the interactive display of multidimensional and temporal data.

Howard Gray Funkhouser was an American mathematician, historian and Associate Professor of Mathematics at the Washington and Lee University and later at the Phillips Exeter Academy, particularly known for his early work on the history of graphical methods.

James Ralph Beniger was an American historian and sociologist and Professor of Communications and Sociology at the Annenberg School for Communication at the University of Southern California, particularly known his early work on the history of quantitative graphics in statistics, and his later work on the technological and economic origins of the information society.

Graphical perception is the human capacity for visually interpreting information on graphs and charts. Both quantitative and qualitative information can be said to be encoded into the image, and the human capacity to interpret it is sometimes called decoding. The importance of human graphical perception, what we discern easily versus what our brains have more difficulty decoding, is fundamental to good statistical graphics design, where clarity, transparency, accuracy and precision in data display and interpretation are essential for understanding the translation of data in a graph to clarify and interpret the science.

## References

Citations
1. "The Role of Graphics". NIST/SEMATECH e-Handbook of Statistical Methods. 2003–2010. Retrieved May 5, 2011.
2. Jacoby, William G. (1997). Statistical Graphics for Univariate and Bivariate Data: Statistical Graphics. pp. 2–4.
3. James R. Beniger and Dorothy L. Robyn (1978). "Quantitative graphics in statistics: A brief history". In: The American Statistician. 32: pp. 1–11.
4. Tufte, Edward (1983). . Cheshire, Connecticut: Graphics Press. ISBN   0961392142.
5. Small, Hugh. "Florence Nightingale's statistical diagrams".
6. Crosier, Scott. "John Snow: The London Cholera Epidemic of 1854". University of California, Santa Barbara.
7. Corbett, John. "Charles Joseph Minard: Mapping Napoleon's March, 1861". Center for Spatially Integrated Social Science. Retrieved 21 September 2014.