Data visualization

Data visualization (often abbreviated data viz [1]) is an interdisciplinary field that deals with the graphic representation of data. It is a particularly efficient way of communicating when the data are numerous, as for example in a time series. [2]

From an academic point of view, this representation can be considered as a mapping between the original data (usually numerical) and graphic elements [3] (for example, lines or points in a chart). The mapping determines how the attributes of these elements vary according to the data. In this light, a bar chart maps the magnitude of a variable to the length of a bar. Since the graphic design of the mapping can adversely affect the readability of a chart, [2] mapping is a core competency of data visualization. [4]
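
The idea of a mapping from data to graphic attributes can be illustrated with a minimal Python sketch. It assumes a linear scale and a text-only rendering in which the number of `#` characters stands in for bar length; the names `linear_scale` and `render_bars` and the sample figures are hypothetical and chosen only for illustration.

```python
def linear_scale(value, domain_max, range_max):
    """Map a data value in [0, domain_max] to a length in [0, range_max]."""
    if domain_max <= 0:
        raise ValueError("domain_max must be positive")
    return value / domain_max * range_max


def render_bars(data, width=40):
    """Return text lines in which bar length encodes each value's magnitude."""
    largest = max(data.values())
    label_width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        # The visual attribute (bar length) is a function of the data value.
        length = round(linear_scale(value, largest, width))
        lines.append(f"{label.ljust(label_width)} | {'#' * length} {value}")
    return lines


# Hypothetical sample data, not taken from the article.
sales = {"North": 120, "South": 60, "East": 30}
for line in render_bars(sales):
    print(line)
```

Changing only `linear_scale` (for example, to a logarithmic scale) would change how the same data reads, which is one way to see why the choice of mapping affects readability.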

Data visualization has its roots in the field of statistics and is therefore generally considered a branch of descriptive statistics. However, because design skills as well as statistical and computing skills are required to visualize effectively, authors such as Gershon and Page argue that it is both an art and a science. [4]

Research into how people read and misread various types of visualizations is helping to determine what types and features of visualizations are most understandable and effective in conveying information. [5] [6]

Overview

Data visualization is one of the steps in analyzing data and presenting it to users.

To communicate information clearly and efficiently, data visualization uses statistical graphics, plots, information graphics and other tools. Numerical data may be encoded using dots, lines, or bars, to visually communicate a quantitative message. [7] Effective visualization helps users analyze and reason about data and evidence. [8] It makes complex data more accessible, understandable, and usable, but can also be reductive. [9] Users may have particular analytical tasks, such as making comparisons or understanding causality, and the design principle of the graphic (i.e., showing comparisons or showing causality) follows the task. Tables are generally used where users will look up a specific measurement, while charts of various types are used to show patterns or relationships in the data for one or more variables.

Data visualization refers to the techniques used to communicate data or information by encoding it as visual objects (e.g., points, lines, or bars) contained in graphics. The goal is to communicate information clearly and efficiently to users. It is one of the steps in data analysis or data science. According to Vitaly Friedman (2008) the "main goal of data visualization is to communicate information clearly and effectively through graphical means. It doesn't mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex data set by communicating its key aspects in a more intuitive way. Yet designers often fail to achieve a balance between form and function, creating gorgeous data visualizations which fail to serve their main purpose — to communicate information". [10]

Indeed, Fernanda Viegas and Martin M. Wattenberg suggested that an ideal visualization should not only communicate clearly, but stimulate viewer engagement and attention. [11]

Data visualization is closely related to information graphics, information visualization, scientific visualization, exploratory data analysis and statistical graphics. In the new millennium, data visualization has become an active area of research, teaching and development. According to Post et al. (2002), it has united scientific and information visualization. [12]

In the commercial environment data visualization is often referred to as dashboards. Infographics are another very common form of data visualization.

Underpinnings

Characteristics of effective graphical displays

Charles Joseph Minard's 1869 diagram of Napoleonic France's invasion of Russia, an early example of an information graphic

The greatest value of a picture is when it forces us to notice what we never expected to see.

John Tukey [13]

Edward Tufte has explained that users of information displays are executing particular analytical tasks such as making comparisons. The design principle of the information graphic should support the analytical task. [14] As William Cleveland and Robert McGill show, different graphical elements accomplish this more or less effectively. For example, dot plots and bar charts outperform pie charts. [15]

In his 1983 book The Visual Display of Quantitative Information, Edward Tufte defines 'graphical displays' and principles for effective graphical display in the following passage: "Excellence in statistical graphics consists of complex ideas communicated with clarity, precision, and efficiency. Graphical displays should:

Graphics reveal data. Indeed graphics can be more precise and revealing than conventional statistical computations." [16]

For example, the Minard diagram shows the losses suffered by Napoleon's army in the 1812–1813 period. Six variables are plotted: the size of the army, its location on a two-dimensional surface (x and y), time, the direction of movement, and temperature. The line width illustrates a comparison (size of the army at points in time), while the temperature axis suggests a cause of the change in army size. This multivariate display on a two-dimensional surface tells a story that can be grasped immediately while identifying the source data to build credibility. Tufte wrote in 1983 that: "It may well be the best statistical graphic ever drawn." [16]

Not applying these principles may result in misleading graphs, distorting the message, or supporting an erroneous conclusion. According to Tufte, chartjunk refers to the extraneous interior decoration of the graphic that does not enhance the message or gratuitous three-dimensional or perspective effects. Needlessly separating the explanatory key from the image itself, requiring the eye to travel back and forth from the image to the key, is a form of "administrative debris." The ratio of "data to ink" should be maximized, erasing non-data ink where feasible. [16]

The Congressional Budget Office summarized several best practices for graphical displays in a June 2014 presentation. These included: a) Knowing your audience; b) Designing graphics that can stand alone outside the report's context; and c) Designing graphics that communicate the key messages in the report. [17]

Quantitative messages

A time series illustrated with a line chart demonstrating trends in U.S. federal spending and revenue over time
A scatterplot illustrating negative correlation between two variables (inflation and unemployment) measured at points in time

Author Stephen Few described eight types of quantitative messages that users may attempt to understand or communicate from a set of data and the associated graphs used to help communicate the message:

  1. Time-series: A single variable is captured over a period of time, such as the unemployment rate over a 10-year period. A line chart may be used to demonstrate the trend.
  2. Ranking: Categorical subdivisions are ranked in ascending or descending order, such as a ranking of sales performance (the measure) by sales persons (the category, with each sales person a categorical subdivision) during a single period. A bar chart may be used to show the comparison across the sales persons.
  3. Part-to-whole: Categorical subdivisions are measured as a ratio to the whole (i.e., a percentage out of 100%). A pie chart or bar chart can show the comparison of ratios, such as the market share represented by competitors in a market.
  4. Deviation: Categorical subdivisions are compared against a reference, such as a comparison of actual vs. budget expenses for several departments of a business for a given time period. A bar chart can show comparison of the actual versus the reference amount.
  5. Frequency distribution: Shows the number of observations of a particular variable for a given interval, such as the number of years in which the stock market return falls within intervals such as 0-10%, 11-20%, etc. A histogram, a type of bar chart, may be used for this analysis. A boxplot helps visualize key statistics about the distribution, such as median, quartiles, outliers, etc.
  6. Correlation: Comparison between observations represented by two variables (X,Y) to determine if they tend to move in the same or opposite directions. For example, plotting unemployment (X) and inflation (Y) for a sample of months. A scatter plot is typically used for this message.
  7. Nominal comparison: Comparing categorical subdivisions in no particular order, such as the sales volume by product code. A bar chart may be used for this comparison.
  8. Geographic or geospatial: Comparison of a variable across a map or layout, such as the unemployment rate by state or the number of persons on the various floors of a building. A cartogram is a typical graphic used. [7] [18]
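
The pairing of message types and graphics in the list above can be captured as a simple lookup table. The following Python sketch only transcribes that list; the names `CHART_FOR_MESSAGE` and `suggest_charts` are hypothetical, and the table is a starting point for exploration rather than a rule.

```python
# Message types and suggested graphics, transcribed from the list above.
CHART_FOR_MESSAGE = {
    "time-series": ["line chart"],
    "ranking": ["bar chart"],
    "part-to-whole": ["pie chart", "bar chart"],
    "deviation": ["bar chart"],
    "frequency distribution": ["histogram", "boxplot"],
    "correlation": ["scatter plot"],
    "nominal comparison": ["bar chart"],
    "geographic or geospatial": ["cartogram"],
}


def suggest_charts(message):
    """Return the graphics the list associates with a quantitative message."""
    key = message.strip().lower()
    if key not in CHART_FOR_MESSAGE:
        raise KeyError(f"unknown message type: {message!r}")
    return CHART_FOR_MESSAGE[key]


print(suggest_charts("Correlation"))
print(suggest_charts("Part-to-whole"))
```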

Analysts reviewing a set of data may consider whether some or all of the messages and graphic types above are applicable to their task and audience. The process of trial and error to identify meaningful relationships and messages in the data is part of exploratory data analysis.

Visual perception and data visualization

A human can distinguish differences in line length, shape, orientation, distances, and color (hue) readily without significant processing effort; these are referred to as "pre-attentive attributes". For example, it may require significant time and effort ("attentive processing") to identify the number of times the digit "5" appears in a series of numbers; but if that digit is different in size, orientation, or color, instances of the digit can be noted quickly through pre-attentive processing. [19]

Compelling graphics take advantage of pre-attentive processing and attributes and the relative strength of these attributes. For example, since humans can more easily process differences in line length than surface area, it may be more effective to use a bar chart (which takes advantage of line length to show comparison) rather than pie charts (which use surface area to show comparison). [19]
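
The geometric difference between length and area encodings can be shown with a short calculation. The Python sketch below is an illustration of the arithmetic only, not of the perceptual findings cited above. It assumes circles as the area-encoded mark for simplicity, and the function names are hypothetical. When a value doubles, a bar doubles in length, but a circle whose area encodes the value grows by only a factor of √2 in radius.

```python
import math


def bar_length(value, units_per_value=1.0):
    """Length encoding: bar length is directly proportional to the value."""
    return value * units_per_value


def circle_radius(value, area_per_value=1.0):
    """Area encoding: circle area is proportional to the value."""
    return math.sqrt(value * area_per_value / math.pi)


base = 1.0
for value in (1.0, 2.0, 4.0):
    length_ratio = bar_length(value) / bar_length(base)
    radius_ratio = circle_radius(value) / circle_radius(base)
    print(f"value x{value:g}: bar length x{length_ratio:.2f}, "
          f"circle radius x{radius_ratio:.2f}")
```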

Human perception/cognition and data visualization

Almost all data visualizations are created for human consumption. Knowledge of human perception and cognition is necessary when designing intuitive visualizations. [20] Cognition refers to processes in human beings like perception, attention, learning, memory, thought, concept formation, reading, and problem solving. [21] Human visual processing is efficient in detecting changes and making comparisons between quantities, sizes, shapes and variations in lightness. When properties of symbolic data are mapped to visual properties, humans can browse through large amounts of data efficiently. It is estimated that 2/3 of the brain's neurons can be involved in visual processing. Proper visualization provides a different approach to show potential connections, relationships, etc. which are not as obvious in non-visualized quantitative data. Visualization can become a means of data exploration.

Studies have shown that individuals used on average 19% fewer cognitive resources, and were 4.5% better able to recall details, when working with data visualizations than with text. [22]

History

Selected milestones and inventions

There is no comprehensive 'history' of data visualization. There are no accounts that span the entire development of visual thinking and the visual representation of data, and which collate the contributions of disparate disciplines. [23] Michael Friendly and Daniel J. Denis of York University are engaged in a project that attempts to provide a comprehensive history of visualization. Contrary to general belief, data visualization is not a modern development. Stellar data, such as the locations of stars, have been visualized on the walls of caves (such as those found in Lascaux Cave in Southern France) since the Pleistocene era. [24] Physical artefacts such as Mesopotamian clay tokens (5500 BC), Inca quipus (2600 BC) and Marshall Islands stick charts (n.d.) can also be considered as visualizing quantitative information. [25] [26]

The first documented data visualization can be traced back to 1160 BC with the Turin Papyrus Map, which accurately illustrates the distribution of geological resources and provides information about the quarrying of those resources. [27] Such maps can be categorized as thematic cartography, which is a type of data visualization that presents and communicates specific data and information through a geographical illustration designed to show a particular theme connected with a specific geographic area. The earliest documented forms of data visualization were various thematic maps from different cultures, and ideograms and hieroglyphs that provided and allowed interpretation of the information illustrated. For example, Linear B tablets of Mycenae provided a visualization of information regarding Late Bronze Age trade in the Mediterranean. The idea of coordinates was used by ancient Egyptian surveyors in laying out towns; earthly and heavenly positions were located by something akin to latitude and longitude at least by 200 BC; and the map projection of a spherical earth into latitude and longitude by Claudius Ptolemy [c. 85–c. 165] in Alexandria would serve as a reference standard until the 14th century. [27]

The invention of paper and parchment allowed further development of visualizations throughout history. The accompanying figure shows a graph from the 10th or possibly 11th century that is intended to illustrate planetary movement and was used in an appendix of a textbook in monastery schools. [28] The graph apparently was meant to represent a plot of the inclinations of the planetary orbits as a function of time. For this purpose, the zone of the zodiac was represented on a plane with a horizontal line divided into thirty parts as the time or longitudinal axis. The vertical axis designates the width of the zodiac. The horizontal scale appears to have been chosen for each planet individually, since the periods cannot be reconciled. The accompanying text refers only to the amplitudes. The curves are apparently not related in time.

Planetary movements

By the 16th century, techniques and instruments for precise observation and measurement of physical quantities, and of geographic and celestial position, were well developed (for example, a “wall quadrant” constructed by Tycho Brahe [1546–1601], covering an entire wall in his observatory). Particularly important was the development of triangulation and other methods to determine mapping locations accurately. [23] Very early, the measurement of time led scholars to develop innovative ways of visualizing the data (e.g. Lorenz Codomann in 1596, Johannes Temporarius in 1596 [29]).

The French philosopher and mathematician René Descartes and Pierre de Fermat developed analytic geometry and the two-dimensional coordinate system, which heavily influenced the practical methods of displaying and calculating values. Fermat and Blaise Pascal's work on statistics and probability theory laid the groundwork for what we now conceptualize as data. [23] According to the Interaction Design Foundation, these developments helped William Playfair, who saw potential for graphical communication of quantitative data, to generate and develop graphical methods of statistics. [20]

Playfair TimeSeries

In the second half of the 20th century, Jacques Bertin used quantitative graphs to represent information "intuitively, clearly, accurately, and efficiently". [20]

John Tukey and Edward Tufte pushed the bounds of data visualization: Tukey with his new statistical approach of exploratory data analysis, and Tufte with his book "The Visual Display of Quantitative Information", which paved the way for refining data visualization techniques for a wider audience than statisticians. With the progression of technology came the progression of data visualization, starting with hand-drawn visualizations and evolving into more technical applications, including interactive designs leading to software visualization. [30]

Programs like SAS, SOFA, R, Minitab, Cornerstone and more allow for data visualization in the field of statistics. Other, more specialized data visualization tools, such as the D3 library and programming languages such as Python and JavaScript, help to make the visualization of quantitative data possible. Private schools have also developed programs to meet the demand for learning data visualization and associated programming libraries, including free programs like The Data Incubator or paid programs like General Assembly. [31]

Beginning with the symposium "Data to Discovery" in 2013, ArtCenter College of Design, Caltech and JPL in Pasadena have run an annual program on interactive data visualization. [32] The program asks: How can interactive data visualization help scientists and engineers explore their data more effectively? How can computing, design, and design thinking help maximize research results? What methodologies are most effective for leveraging knowledge from these fields? By encoding relational information with appropriate visual and interactive characteristics to help interrogate, and ultimately gain new insight into data, the program develops new interdisciplinary approaches to complex science problems, combining design thinking and the latest methods from computing, user-centered design, interaction design and 3D graphics.

Terminology

Data visualization involves specific terminology, some of which is derived from statistics. For example, author Stephen Few defines two types of data, which are used in combination to support a meaningful analysis or visualization:

The distinction between quantitative and categorical variables is important because the two types require different methods of visualization.

Two primary types of information displays are tables and graphs.

Eppler and Lengler have developed the "Periodic Table of Visualization Methods," an interactive chart displaying various data visualization methods. It includes six types of data visualization methods: data, information, concept, strategy, metaphor and compound. [35]

Techniques

Name | Visual dimensions | Description / Example usages
Bar chart of tips by day of week
Bar chart
  • length/count
  • category
  • color
  • Presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.
  • A bar graph shows comparisons among discrete categories. One axis of the chart shows the specific categories being compared, and the other axis represents a measured value.
  • Some bar graphs present bars clustered in groups of more than one, showing the values of more than one measured variable. These clustered groups can be differentiated using color.
  • For example, comparison of values, such as sales performance for several persons or businesses in a single time period.
Variable-width bar chart relating (1) population, (2) per capita greenhouse gas emissions, and (3) total greenhouse gas emissions

Variable-width ("variwide") bar chart

  • category (size/count/extent in first dimension)
  • size/count/extent in second dimension
  • size/count/extent as area of bar
  • color
  • Includes most features of basic bar chart, above
  • Area of non-uniform-width bar explicitly conveys information of a third quantity that is implicitly related to first and second quantities from horizontal and vertical axes
Projected (1) frequency and (2) intensity of extreme "10-year heat waves" are connected in pairs of horizontal and vertical bars, respectively. Bars are distinguished by (3) color-coded primary category (degree of global warming).

Orthogonal (orthogonal composite) bar chart

  • numerical value of first variable (extent in first dimension; superimposed horizontal bars)
  • numerical value of second variable (extent in second dimension; like conventional vertical bar chart)
  • category for first and second variables (e.g., color-coded)
  • Includes most features of basic bar chart, above
  • Pairs of numeric variables, usually color-coded, rendered by category
  • Variables need not be directly related in the way they are in "variwide" charts
Histogram of housing prices
Histogram
  • bin limits
  • count/length
  • color
  • An approximate representation of the distribution of numerical data. Divide the entire range of values into a series of intervals and then count how many values fall into each interval; this is called binning. The bins are usually specified as consecutive, non-overlapping intervals of a variable. The bins (intervals) must be adjacent, and are often (but not required to be) of equal size.
  • For example, determining frequency of annual stock market percentage returns within particular ranges (bins) such as 0-10%, 11-20%, etc. The height of the bar represents the number of observations (years) with a return % in the range represented by the respective bin.
Basic scatterplot of two variables
Scatter plot
  • x position
  • y position
  • symbol/glyph
  • color
  • size
  • Uses Cartesian coordinates to display values for typically two variables for a set of data.
  • Points can be coded via color, shape and/or size to display additional variables.
  • Each point on the plot has an associated x and y term that determines its location on the cartesian plane.
  • Scatter plots are often used to highlight the correlation between variables (x and y).
Scatter plot
Scatter plot (3D)
  • position x
  • position y
  • position z
  • color
  • symbol
  • size
  • Similar to the 2-dimensional scatter plot above, the 3-dimensional scatter plot visualizes the relationship between typically 3 variables from a set of data.
  • Again, points can be coded via color, shape and/or size to display additional variables.
Network analysis
Network
  • Finding clusters in the network (e.g. grouping Facebook friends into different clusters).
  • Discovering bridges (information brokers or boundary spanners) between clusters in the network
  • Determining the most influential nodes in the network (e.g. A company wants to target a small group of people on Twitter for a marketing campaign).
  • Finding outlier actors who do not fit into any cluster or are in the periphery of a network.
Pie chart
  • color
  • Represents one categorical variable which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents.
  • For example, as shown in the graph to the right, the proportion of English native speakers worldwide
Line chart
  • x position
  • y position
  • symbol/glyph
  • color
  • size
  • Represents information as a series of data points called 'markers' connected by straight line segments.
  • Similar to a scatter plot except that the measurement points are ordered (typically by their x-axis value) and joined with straight line segments.
  • Often used to visualize a trend in data over intervals of time – a time series – thus the line is often drawn chronologically.
Streamgraph
  • width
  • color
  • time (flow)
  • A type of stacked area graph which is displaced around a central axis, resulting in a flowing shape.
  • Unlike a traditional stacked area graph in which the layers are stacked on top of an axis, in a streamgraph the layers are positioned to minimize their "wiggle".
  • Streamgraphs display data with only positive values, and are not able to represent both negative and positive values.
  • For example, the right visual shows the music listened to by a user over the start of the year 2012
Treemap
  • size
  • color
  • Is a method for displaying hierarchical data using nested figures, usually rectangles.
  • For example, disk space by location / file type
Gantt chart
  • color
  • time (flow)
Heat map
  • color
  • categorical variable
  • Represents the magnitude of a phenomenon as color in two dimensions.
  • There are two categories of heat maps:
    • cluster heat map: where magnitudes are laid out into a matrix of fixed cell size whose rows and columns are categorical data. For example, the graph to the right.
    • spatial heat map: where there is no matrix of fixed cell size. For example, a heat map showing population densities displayed on a geographical map.
Stripe graphic
  • x position
  • color
  • A sequence of colored stripes visually portrays trend of a data series.
  • Portrays a single variable—prototypically temperature over time to portray global warming
  • Deliberately minimalist—with no technical indicia—to communicate intuitively with non-scientists [36]
  • Can be "stacked" to represent plural series (example)
Animated spiral graphic
  • radial distance (dependent variable)
  • rotating angle (cycling through months)
  • color (passing years)
  • Portrays a single dependent variable—prototypically temperature over time to portray global warming
  • Dependent variable is progressively plotted along a continuous "spiral" determined as a function of (a) constantly rotating angle (twelve months per revolution) and (b) evolving color (color changes over passing years) [37]
Box and Whisker Plot
  • x axis
  • y axis
  • A method for graphically depicting groups of numerical data through their quartiles.
  • Box plots may also have lines extending from the boxes (whiskers) indicating variability outside the upper and lower quartiles.
  • Outliers may be plotted as individual points.
  • The two boxes graphed on top of each other represent the middle 50% of the data: the line separating the two boxes identifies the median data value, and the top and bottom edges of the boxes represent the 75th and 25th percentile data points respectively.
  • Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution, thus are useful for getting an initial understanding of a data set. For example, comparing the distribution of ages between a group of people (e.g., male and females).
Flowchart
  • Represents a workflow, process or a step-by-step approach to solving a task.
  • The flowchart shows the steps as boxes of various kinds, and their order by connecting the boxes with arrows.
  • For example, outlining the actions to undertake if a lamp is not working, as shown in the diagram to the right.
Radar chart
  • attributes
  • value assigned to attributes
  • Displays multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point.
  • The relative position and angle of the axes is typically uninformative, but various heuristics, such as algorithms that plot data as the maximal total area, can be applied to sort the variables (axes) into relative positions that reveal distinct correlations, trade-offs, and a multitude of other comparative measures.
  • For example, comparing attributes/skills (e.g., communication, analytical, IT skills) learnt across different university degrees (e.g., mathematics, economics, psychology)
Venn diagram
  • all possible logical relations between a finite collection of different sets.
  • Shows all possible logical relations between a finite collection of different sets.
  • These diagrams depict elements as points in the plane, and sets as regions inside closed curves.
  • A Venn diagram consists of multiple overlapping closed curves, usually circles, each representing a set.
  • The points inside a curve labelled S represent elements of the set S, while points outside the boundary represent elements not in the set S. This lends itself to intuitive visualizations; for example, the set of all elements that are members of both sets S and T, denoted S ∩ T and read "the intersection of S and T", is represented visually by the area of overlap of the regions S and T. In Venn diagrams, the curves are overlapped in every possible way, showing all possible relations between the sets.
Iconography of correlations
  • No axis
  • Solid line
  • dotted line
  • color
  • Exploratory data analysis.
  • Replace a correlation matrix by a diagram where the “remarkable” correlations are represented by a solid line (positive correlation), or a dotted line (negative correlation).
  • Points can be coded via color.
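
Two of the techniques listed above rest on simple computations: the binning behind a histogram and the quartiles behind a box plot. The following Python sketch shows one way to compute both using only the standard library. It assumes equal-width bins with the maximum value counted in the last bin, and the "inclusive" quantile method of `statistics.quantiles`; other binning rules and quartile conventions exist and give different results. The function names and sample data are hypothetical.

```python
import statistics


def bin_counts(values, bin_count):
    """Count values in consecutive, non-overlapping, equal-width bins."""
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("values must span a non-zero range")
    width = (hi - lo) / bin_count
    counts = [0] * bin_count
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        index = min(int((v - lo) / width), bin_count - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(bin_count + 1)]
    return edges, counts


def box_summary(values):
    """Return the five numbers a basic box plot is drawn from."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


# Hypothetical sample data, not taken from the article.
returns = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10]
edges, counts = bin_counts(returns, bin_count=2)
print(edges, counts)
print(box_summary([1, 2, 3, 4, 5]))
```

The sketch omits whiskers and outlier detection, since conventions for those vary between tools.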

Interactivity

Interactive data visualization enables direct actions on a graphical plot to change elements and link between multiple plots. [38]

Interactive data visualization has been a pursuit of statisticians since the late 1960s. Examples of the developments can be found on the American Statistical Association video lending library. [39]

Common interactions include:

Other perspectives

There are different approaches on the scope of data visualization. One common focus is on information presentation, such as Friedman (2008). Friendly (2008) presumes two main parts of data visualization: statistical graphics, and thematic cartography. [40] In this line the "Data Visualization: Modern Approaches" (2007) article gives an overview of seven subjects of data visualization: [41]

All these subjects are closely related to graphic design and information representation.

On the other hand, from a computer science perspective, Frits H. Post in 2002 categorized the field into sub-fields: [12] [42]

In the Harvard Business Review, Scott Berinato developed a framework for approaching data visualisation. [43] To start thinking visually, users must consider two questions: 1) what you have and 2) what you are doing. The first step is identifying the data you want visualised: it may be data-driven, like profit over the past ten years, or a conceptual idea, like how a specific organisation is structured. Once this question is answered, one can focus on whether one is trying to communicate information (declarative visualisation) or trying to figure something out (exploratory visualisation). Berinato combines these questions to give four types of visual communication, each with its own goals. [43]

These four types of visual communication are as follows:

Data presentation architecture

A data visualization from social media

Data presentation architecture (DPA) is a skill-set that seeks to identify, locate, manipulate, format and present data in such a way as to optimally communicate meaning and proper knowledge.

Historically, the term data presentation architecture is attributed to Kelly Lautt: [lower-alpha 1] "Data Presentation Architecture (DPA) is a rarely applied skill set critical for the success and value of Business Intelligence. Data presentation architecture weds the science of numbers, data and statistics in discovering valuable information from data and making it usable, relevant and actionable with the arts of data visualization, communications, organizational psychology and change management in order to provide business intelligence solutions with the data scope, delivery timing, format and visualizations that will most effectively support and drive operational, tactical and strategic behaviour toward understood business (or organizational) goals. DPA is neither an IT nor a business skill set but exists as a separate field of expertise. Often confused with data visualization, data presentation architecture is a much broader skill set that includes determining what data on what schedule and in what exact format is to be presented, not just the best way to present data that has already been chosen. Data visualization skills are one element of DPA."

Objectives

DPA has two main objectives:

Scope

With the above objectives in mind, the actual work of data presentation architecture consists of:

DPA work shares commonalities with several other fields, including:

See also

Notes

  1. The first formal, recorded, public usages of the term data presentation architecture were at the three formal Microsoft Office 2007 Launch events in Dec, Jan and Feb of 2007–08 in Edmonton, Calgary and Vancouver (Canada) in a presentation by Kelly Lautt describing a business intelligence system designed to improve service quality in a pulp and paper company. The term was further used and recorded in public usage on December 16, 2009 in a Microsoft Canada presentation on the value of merging Business Intelligence with corporate collaboration processes.

Related Research Articles

Chart Graphical representation of data

A chart is a graphical representation for data visualization, in which "the data is represented by symbols, such as bars in a bar chart, lines in a line chart, or slices in a pie chart". A chart can represent tabular numeric data, functions or some kinds of quality structure, and provides different information.

Information design Communication and graphic design

Information design is the practice of presenting information in a way that fosters an efficient and effective understanding of the information. The term has come to be used for a specific area of graphic design related to displaying information effectively, rather than just attractively or for artistic expression. Information design is closely related to the field of data visualization and is often taught as part of graphic design courses. The broad applications of information design along with its close connections to other fields of design and communication practices have created some overlap in the definitions of communication design, data visualization, and information architecture.

Edward Tufte American statistician (b.1942) noted for his writings on information design

Edward Rolf Tufte, sometimes known as "ET", is an American statistician and professor emeritus of political science, statistics, and computer science at Yale University. He is noted for his writings on information design and as a pioneer in the field of data visualization.

Scatter plot Plot using the dispersal of scattered dots to show the relationship between variables

A scatter plot is a type of plot or mathematical diagram using Cartesian coordinates to display values for typically two variables for a set of data. If the points are coded (color/shape/size), one additional variable can be displayed. The data are displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.

A small multiple is a series of similar graphs or charts using the same scale and axes, allowing them to be easily compared. It uses multiple views to show different partitions of a dataset. The term was popularized by Edward Tufte.

A diagram is a symbolic representation of information using visualization techniques. Diagrams have been used since prehistoric times on walls of caves, but became more prevalent during the Enlightenment. Sometimes, the technique uses a three-dimensional visualization which is then projected onto a two-dimensional surface. The word graph is sometimes used as a synonym for diagram.

Visualization (graphics) Set of techniques for creating images, diagrams, or animations to communicate a message

Visualization or visualisation is any technique for creating images, diagrams, or animations to communicate a message. Visualization through visual imagery has been an effective way to communicate both abstract and concrete ideas since the dawn of humanity. Examples from history include cave paintings, Egyptian hieroglyphs, Greek geometry, and Leonardo da Vinci's revolutionary methods of technical drawing for engineering and scientific purposes.

Pie chart Circular statistical graph that is divided into slices to illustrate numerical proportion

A pie chart is a circular statistical graphic, which is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice is proportional to the quantity it represents. While it is named for its resemblance to a pie which has been sliced, there are variations on the way it can be presented. The earliest known pie chart is generally credited to William Playfair's Statistical Breviary of 1801.

Chartjunk

Chartjunk refers to all visual elements in charts and graphs that are not necessary to comprehend the information represented on the graph, or that distract the viewer from this information.

Infographic Graphic visual representation of information

Infographics are graphic visual representations of information, data, or knowledge intended to present information quickly and clearly. They can improve cognition by utilizing graphics to enhance the human visual system's ability to see patterns and trends. Similar pursuits are information visualization, data visualization, statistical graphics, information design, or information architecture. Infographics have evolved in recent years to be for mass communication, and thus are designed with fewer assumptions about the readers' knowledge base than other types of visualizations. Isotypes are an early example of infographics conveying information quickly and easily to the masses.

Charles Joseph Minard

Charles Joseph Minard was a French civil engineer recognized for his significant contribution in the field of information graphics in civil engineering and statistics. Minard was, among other things, noted for his representation of numerical data on geographic maps, especially his flow maps.

Chernoff face

Chernoff faces, invented by applied mathematician, statistician and physicist Herman Chernoff in 1973, display multivariate data in the shape of a human face. The individual parts, such as eyes, ears, mouth and nose represent values of the variables by their shape, size, placement and orientation. The idea behind using faces is that humans easily recognize faces and notice small changes without difficulty. Chernoff faces handle each variable differently. Because the features of the faces vary in perceived importance, the way in which variables are mapped to the features should be carefully chosen.

Radar chart

A radar chart is a graphical method of displaying multivariate data in the form of a two-dimensional chart of three or more quantitative variables represented on axes starting from the same point. The relative position and angle of the axes is typically uninformative, but various heuristics, such as algorithms that plot data as the maximal total area, can be applied to sort the variables (axes) into relative positions that reveal distinct correlations, trade-offs, and a multitude of other comparative measures.

Statistical graphics, also known as statistical graphical techniques, are graphics used in the field of statistics for data visualization.

Michael Friendly

Michael Louis Friendly is an American-Canadian psychologist, Professor of Psychology at York University in Ontario, Canada, and director of its Statistical Consulting Service, especially known for his contributions to graphical methods for categorical and multivariate data, and on the history of data and information visualisation.

Plot (graphics)

A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables. The plot can be drawn by hand or by a computer. In the past, sometimes mechanical or electronic plotters were used. Graphs are a visual representation of the relationship between variables, which are very useful for humans who can then quickly derive an understanding which may not have come from lists of values. Given a scale or ruler, graphs can also be used to read off the value of an unknown variable plotted as a function of a known one, but this can also be done with data presented in tabular form. Graphs of functions are used in mathematics, sciences, engineering, technology, finance, and other areas.

Motion chart

A motion chart is a dynamic bubble chart which allows efficient and interactive exploration and visualization of longitudinal multivariate data. Motion charts provide mechanisms for mapping ordinal, nominal and quantitative variables onto time, 2D coordinate axes, size, colors, glyphs and appearance characteristics, which facilitate the interactive display of multidimensional and temporal data.

In statistics, a misleading graph, also known as a distorted graph, is a graph that misrepresents data, constituting a misuse of statistics and with the result that an incorrect conclusion may be derived from it.

Howard G. Funkhouser American mathematician and historian

Howard Gray Funkhouser was an American mathematician, historian and Associate Professor of Mathematics at the Washington and Lee University and later at the Phillips Exeter Academy, particularly known for his early work on the history of graphical methods.

Graphical perception

Graphical perception is the human capacity for visually interpreting information on graphs and charts. Both quantitative and qualitative information can be said to be encoded into the image, and the human capacity to interpret it is sometimes called decoding. The importance of human graphical perception, what we discern easily versus what our brains have more difficulty decoding, is fundamental to good statistical graphics design, where clarity, transparency, accuracy and precision in data display and interpretation are essential for understanding the translation of data in a graph to clarify and interpret the science.

References

  1. Shewan, Dan (5 October 2016). "Data is Beautiful: 7 Data Visualization Tools for Digital Marketers". Business2Community.com. Archived from the original on 12 November 2016.
  2. Nussbaumer Knaflic, Cole (2 November 2015). Storytelling with Data: A Data Visualization Guide for Business Professionals. ISBN 978-1-119-00225-3.
  3. "What is Data Visualization? - Whizlabs Blog".
  4. Gershon, Nahum; Page, Ward (1 August 2001). "What storytelling can do for information visualization". Communications of the ACM. 44 (8): 31–37. doi:10.1145/381641.381653. S2CID 7666107.
  5. Mason, Betsy (November 12, 2019). "Why scientists need to be better at data visualization". Knowable Magazine. doi:10.1146/knowable-110919-1.
  6. O'Donoghue, Seán I.; Baldi, Benedetta Frida; Clark, Susan J.; Darling, Aaron E.; Hogan, James M.; Kaur, Sandeep; Maier-Hein, Lena; McCarthy, Davis J.; Moore, William J.; Stenau, Esther; Swedlow, Jason R.; Vuong, Jenny; Procter, James B. (2018-07-20). "Visualization of Biomedical Data". Annual Review of Biomedical Data Science. 1 (1): 275–304. doi:10.1146/annurev-biodatasci-080917-013424. hdl:10453/125943. S2CID 199591321. Retrieved 25 June 2021.
  7. "Stephen Few-Perceptual Edge-Selecting the Right Graph for Your Message-2004" (PDF). Archived (PDF) from the original on 2014-10-05. Retrieved 2014-09-08.
  8. "10 Examples of Interactive Map Data Visualizations".
  9. Engebretsen, Martin; Kennedy, Helen, eds. (2020-04-16). Data Visualization in Society. Amsterdam: Amsterdam University Press. doi:10.5117/9789463722902_ch02. ISBN 978-90-485-4313-7.
  10. Vitaly Friedman (2008) "Data Visualization and Infographics" Archived 2008-07-22 at the Wayback Machine in: Graphics, Monday Inspiration, January 14th, 2008.
  11. Viegas, Fernanda; Wattenberg, Martin (April 19, 2011). "How To Make Data Look Sexy". CNN.com. Archived from the original on May 6, 2011. Retrieved May 7, 2017.
  12. Frits H. Post, Gregory M. Nielson and Georges-Pierre Bonneau (2002). Data Visualization: The State of the Art. Research paper TU Delft, 2002. Archived 2009-10-07 at the Wayback Machine.
  13. Tukey, John (1977). Exploratory Data Analysis. Addison-Wesley. ISBN   0-201-07616-0.
  14. techatstate (7 August 2013). "Tech@State: Data Visualization - Keynote by Dr Edward Tufte". Archived from the original on 29 March 2017. Retrieved 29 November 2016 via YouTube.
  15. Cleveland, W. S.; McGill, R. (1985). "Graphical perception and graphical methods for analyzing scientific data". Science. 229 (4716): 828–33. Bibcode:1985Sci...229..828C. doi:10.1126/science.229.4716.828. PMID   17777913. S2CID   16342041.
  16. Tufte, Edward (1983). The Visual Display of Quantitative Information. Cheshire, Connecticut: Graphics Press. ISBN 0-9613921-4-2. Archived from the original on 2013-01-14. Retrieved 2019-08-10.
  17. "Telling Visual Stories About Data - Congressional Budget Office". www.cbo.gov. Archived from the original on 2014-12-04. Retrieved 2014-11-27.
  18. "Stephen Few-Perceptual Edge-Graph Selection Matrix" (PDF). Archived (PDF) from the original on 2014-10-05. Retrieved 2014-09-08.
  19. "Steven Few-Tapping the Power of Visual Perception-September 2004" (PDF). Archived (PDF) from the original on 2014-10-05. Retrieved 2014-10-08.
  20. Data Visualization for Human Perception. The Interaction Design Foundation. Archived from the original on 2015-11-23. Retrieved 2015-11-23.
  21. "Visualization" (PDF). SFU. SFU lecture. Archived from the original (PDF) on 2016-01-22. Retrieved 2015-11-22.
  22. Graham, Fiona (2012-04-17). "Can images stop data overload?". BBC News. Retrieved 2020-07-30.
  23. Friendly, Michael (2006). "A Brief History of Data Visualization". Handbook of Data Visualization. Springer-Verlag. pp. 15–56. doi:10.1007/978-3-540-33037-0_2. ISBN 9783540330370.
  24. Whitehouse, D. (9 August 2000). "Ice Age star map discovered". BBC News. Archived from the original on 6 January 2018. Retrieved 20 January 2018.
  25. Dragicevic, Pierre; Jansen, Yvonne (2012). "List of Physical Visualizations and Related Artefacts". Archived from the original on 2018-01-13. Retrieved 2018-01-12.
  26. Jansen, Yvonne; Dragicevic, Pierre; Isenberg, Petra; Alexander, Jason; Karnik, Abhijit; Kildal, Johan; Subramanian, Sriram; Hornbæk, Kasper (2015). "Opportunities and challenges for data physicalization". Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems: 3227–3236. Archived from the original on 2018-01-13. Retrieved 2018-01-12.
  27. Friendly, Michael (2001). "Milestones in the history of thematic cartography, statistical graphics, and data visualization". Archived from the original on 2014-04-14.
  28. Funkhouser, Howard Gray (January 1936). "A Note on a Tenth Century Graph". Osiris. 1: 260–262. doi:10.1086/368425. JSTOR   301609. S2CID   144492131.
  29. "Data visualization: definition, examples, tools, advice [guide 2020]". Market research consulting. 2020-12-09. Retrieved 2020-12-09.
  30. Friendly, Michael (2006). "A Brief History of Data Visualization" (PDF). York University. Springer-Verlag. Archived (PDF) from the original on 2016-05-08. Retrieved 2015-11-22.
  31. "NY gets new boot camp for data scientists: It's free but harder to get into than Harvard". Venture Beat. Archived from the original on 2016-02-15. Retrieved 2016-02-21.
  32. Interactive Data Visualization
  33. Bulmer, Michael (2013). A Portable Introduction to Data Analysis. The University of Queensland: Publish on Demand Centre. pp. 4–5. ISBN 978-1-921723-10-0.
  34. "Steven Few-Selecting the Right Graph for Your Message-September 2004" (PDF). Archived (PDF) from the original on 2014-10-05. Retrieved 2014-09-08.
  35. Lengler, Ralph; Eppler, Martin. J. "Periodic Table of Visualization Methods". www.visual-literacy.org. Archived from the original on 16 March 2013. Retrieved 15 March 2013.
  36. Kahn, Brian (June 17, 2019). "This Striking Climate Change Visualization Is Now Customizable for Any Place on Earth". Gizmodo. Archived from the original on June 26, 2019. Developed in May 2018 by Ed Hawkins, University of Reading.
  37. Mooney, Chris (11 May 2016). "This scientist just changed how we think about climate change with one GIF". The Washington Post. Archived from the original on 6 February 2019. Ed Hawkins took these monthly temperature data and plotted them in the form of a spiral, so that for each year, there are twelve points, one for each month, around the center of a circle – with warmer temperatures farther outward and colder temperatures nearer inward.
  38. Swayne, Deborah (1999). "Introduction to the special issue on interactive graphical data analysis: What is interaction?". Computational Statistics. 14 (1): 1–6. doi:10.1007/PL00022700. S2CID   86788346.
  39. American Statistical Association, Statistical Graphics Section. "Video Lending Library".
  40. Michael Friendly (2008). "Milestones in the history of thematic cartography, statistical graphics, and data visualization". Archived 2008-09-11 at the Wayback Machine.
  41. "Data Visualization: Modern Approaches". Archived 2008-07-22 at the Wayback Machine. In: Graphics, August 2nd, 2007.
  42. Frits H. Post, Gregory M. Nielson and Georges-Pierre Bonneau (2002). Data Visualization: The State of the Art. Archived 2009-10-07 at the Wayback Machine.
  43. Berinato, Scott (June 2016). "Visualizations That Really Work". Harvard Business Review: 92–100.

Further reading