The Grammar of Graphics (GoG) is a grammar-based system for representing graphics to provide grammatical constraints on the composition of data and information visualizations. A graphical grammar differs from a graphics pipeline as it focuses on sematic components such as scales and guides, statistical functions, coordinate systems, marks and aesthetic attributes. [1] [2] For example, a bar chart can be converted into a pie chart by specifying a polar coordinate system without any other change in graphical specification.: [1] [3]
The grammar of graphics concept was launched by Leland Wilkinson in 2001 (Wilkinson et al., 2001; Wilkinson, 2005) and graphical grammars have since been written in a variety of languages with various parameterisations and extensions. [1] The major implementations of graphical grammars are nViZn created by a team at SPSS/IBM, followed by Polaris focusing on multidimensional relational databases which is commercialised as Tableau, a revised Layered Grammar of Graphics by Hadley Wickham in Ggplot2, and Vega-Lite which is a visualisation grammar with added interactivity. [4] The grammar of graphics continues to evolve with alternate parameterisations, extensions, or new specifications.
Wilkinson conceived the seven elements of a graphics to be [3] [1]
With these, Wilkinson hypothesised that [3] [1] [5]
Wilkinson wrote SYSTAT, a statistical software package, in the early 1980s. This program was noted for its comprehensive graphics, [6] including the first software implementation of the heatmap display now widely used among biologists. After his company grew to 50 employees, he sold it to SPSS in 1995. At SPSS, he assembled a team of graphics programmers who developed the nViZn platform [7] that produces the visualizations in SPSS, Clementine, and other analytics products.
While at Stanford, Tableau founders Hanrahan and Stolte, as well as Diane Tang, created the predecessor to Tableau, named Polaris. [8] Polaris was a data visualization software tool, built with the support of a United States Department of Energy defense program, the Accelerated Strategic Computing Initiative (ASCI). [9] [10] The main differences between Wilkinson's system and Polaris are the use of SQL relational algebra for database services and using shelves instead of cross and nest operators. [8]
Hadley Wickham conceived an alternate parameterisation of the syntax Wilkinson had derived, creating a layered grammar of graphics which he implemented as Ggplot2 for R (programming language) users. [11] [12] This added a hierarchy of defaults based around the idea of building up a graphic from multiple layers. [13] [14] Wickham conceived these elements to be:
Wilkinson is generally positive on Wickham's parameterisation and implementation of ggplot2, [4] praising its elegance and expressivity whilst claiming that his original Grammar of Graphics is capable of representing a wider range of statistical graphics. [5] [15]
Ggplot2 is the first implementation of a layered grammar of graphics in R and implementations in other programming languages have ensued. These include direct ports plotnine for Python, gramm for MATLAB, Lets-Plot for Kotlin and gadfly for Julia. Projects inspired by elements of Wickham's grammar include Vega-Lite which specifies plots in JSON and uses a JavaScript engine, and Seaborn for Python.
Vega-Lite combines ideas from Wilkinson's Grammar of Graphics and Wickham's Layered Grammar of Graphics with a composition algebra for layered and multi-view displays with a grammar of interaction. The Vega-Lite specification is instantiated in JSON and rendered by the lower-level Vega. [16] The graphical grammar implemented by Vega-Lite is composed of the following:
Whilst Vega-Lite is the sole implementation of this graphics grammar specification with compilation to Vega, other implementations do create JSON files which can be interpreted by Vega-Lite.