Vega and Vega-Lite visualisation grammars

Developer(s) Jeffrey Heer, Arvind Satyanarayan, Dominik Moritz, Kanit Wongsuphasawat, and community
Initial release 2 April 2013; 11 years ago (2013-04-02)
Stable release 5.25.0 / 27 April 2023; 17 months ago (2023-04-27) [1]
Written in JavaScript
Type Data visualization, JavaScript library
License BSD
Website vega.github.io

Vega and Vega-Lite are visualization tools implementing a grammar of graphics, similar to ggplot2. The Vega and Vega-Lite grammars extend Leland Wilkinson's Grammar of Graphics [2] by adding a novel grammar of interactivity to assist in the exploration of complex datasets.

Vega acts as a low-level language suited to explanatory figures (the same use case as D3.js), while Vega-Lite is a higher-level language suited to rapidly exploring data. [3] Vega is used in the back end of several data visualization systems, for example Voyager. [4] [5] Chart specifications are written in JSON and rendered in a browser or exported to either vector or bitmap images. Bindings for Vega-Lite have been written in several programming languages, such as the Python package Altair, [6] to make it easier to use. The grammars and associated tools are open source projects led by the University of Washington Interactive Data Lab and released under a BSD-3 license. [7]
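Because a chart specification is ordinary JSON, it can be assembled in any language and handed to a Vega-Lite renderer. The following is a minimal sketch, not taken from the Vega documentation, that builds a Vega-Lite (version 5) bar chart specification as a Python dictionary and serializes it. The data values, the field names "category" and "amount", and the selection name "pick" are illustrative assumptions; the point selection shows, in a small way, how interactivity is declared in the same specification as the visual encoding.

```python
import json

# Illustrative inline data; the field names "category" and "amount" are
# assumptions made for this sketch, not part of Vega-Lite itself.
rows = [
    {"category": "A", "amount": 28},
    {"category": "B", "amount": 55},
    {"category": "C", "amount": 43},
]


def bar_chart_spec(values):
    """Build a Vega-Lite v5 bar chart specification as a plain dict."""
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "description": "Bar chart with click-to-highlight selection",
        "data": {"values": values},
        # A point selection with the hypothetical name "pick":
        # clicking a bar selects it.
        "params": [{"name": "pick", "select": "point"}],
        "mark": "bar",
        "encoding": {
            "x": {"field": "category", "type": "nominal"},
            "y": {"field": "amount", "type": "quantitative"},
            # Selected bars are drawn fully opaque, all others faded.
            "opacity": {
                "condition": {"param": "pick", "value": 1.0},
                "value": 0.3,
            },
        },
    }


spec = bar_chart_spec(rows)
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

The printed JSON should be usable wherever a Vega-Lite specification is accepted, for example by pasting it into the online Vega Editor. Bindings such as Altair generate specifications of this kind from higher-level calls.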

Related Research Articles

Scientific visualization: Interdisciplinary branch of science concerned with presenting scientific data visually

Scientific visualization is an interdisciplinary branch of science concerned with the visualization of scientific phenomena. It is also considered a subset of computer graphics, a branch of computer science. The purpose of scientific visualization is to graphically illustrate scientific data to enable scientists to understand, illustrate, and glean insight from their data. Research into how people read and misread various types of visualizations is helping to determine what types and features of visualizations are most understandable and effective in conveying information.

Visualization (graphics): Set of techniques for creating images, diagrams, or animations to communicate a message

Visualization, also known as Graphics Visualization, is any technique for creating images, diagrams, or animations to communicate a message. Visualization through visual imagery has been an effective way to communicate both abstract and concrete ideas since the dawn of humanity. Examples from history include cave paintings, Egyptian hieroglyphs, Greek geometry, and Leonardo da Vinci's revolutionary methods of technical drawing for engineering purposes that actively involve scientific requirements.

Cartogram: Map distorting size to show another value

A cartogram is a thematic map of a set of features, in which their geographic size is altered to be directly proportional to a selected variable, such as travel time, population, or gross national income. Geographic space itself is thus warped, sometimes extremely, in order to visualize the distribution of the variable. It is one of the most abstract types of map; in fact, some forms may more properly be called diagrams. They are primarily used to display emphasis and for analysis as nomographs.

Treemapping: Visualisation method for hierarchical data

In information visualization and computing, treemapping is a method for displaying hierarchical data using nested figures, usually rectangles.

Data and information visualization: Visual representation of data

Data and information visualization is the practice of designing and creating easy-to-communicate and easy-to-understand graphic or visual representations of a large amount of complex quantitative and qualitative data and information with the help of static, dynamic or interactive visual items. Typically based on data and information collected from a certain domain of expertise, these visualizations are intended for a broader audience to help them visually explore and discover, quickly understand, interpret and gain important insights into otherwise difficult-to-identify structures, relationships, correlations, local and global patterns, trends, variations, constancy, clusters, outliers and unusual groupings within data. When intended for the general public to convey a concise version of known, specific information in a clear and engaging manner, it is typically called information graphics.

Heat map: Data visualization technique

A heat map is a 2-dimensional data visualization technique that represents the magnitude of individual values within a dataset as a color. The variation in color may be by hue or intensity.

Fernanda Viégas: Brazilian-American computer scientist (born 1971)

Fernanda Bertini Viégas is a Brazilian computer scientist and graphical designer, whose work focuses on the social, collaborative and artistic aspects of information visualization.

Tableau Software, LLC is an American interactive data visualization software company focused on business intelligence. It was founded in 2003 in Mountain View, California, and is currently headquartered in Seattle, Washington. In 2019, the company was acquired by Salesforce for $15.7 billion. At the time, this was the largest acquisition by Salesforce since its foundation. It was later surpassed by Salesforce's acquisition of Slack.

Leland Wilkinson: American statistician and computer scientist (1944–2021)

Leland Wilkinson was an American statistician and computer scientist at H2O.ai and Adjunct Professor of Computer Science at University of Illinois at Chicago. Wilkinson developed the SYSTAT statistical package in the early 1980s, sold it to SPSS in 1995, and worked at SPSS for 10 years recruiting and managing the visualization team. He left SPSS in 2008 and became Executive VP of SYSTAT Software Inc. in Chicago. He then served as the VP of Data Visualization at Skytree, Inc and VP of Statistics at Tableau Software before joining H2O.ai. His research focused on scientific visualization and statistical graphics. In these communities he was well known for his book The Grammar of Graphics, which was the foundation for the R package ggplot2.

Voreen: Volume visualization library and development platform

Voreen is an open-source volume visualization library and development platform. Through the use of GPU-based volume rendering techniques it allows high frame rates on standard graphics hardware to support interactive volume exploration.

MeVisLab

MeVisLab is a cross-platform application framework for medical image processing and scientific visualization. It includes advanced algorithms for image registration, segmentation, and quantitative morphological and functional image analysis. An IDE for graphical programming and rapid user interface prototyping is available.

ggplot2: Data visualization package for R

ggplot2 is an open-source data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, ggplot2 is an implementation of Leland Wilkinson's Grammar of Graphics—a general scheme for data visualization which breaks up graphs into semantic components such as scales and layers. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. Since 2005, ggplot2 has grown in use to become one of the most popular R packages.

D3.js is a JavaScript library for producing dynamic, interactive data visualizations in web browsers. It makes use of Scalable Vector Graphics (SVG), HTML5, and Cascading Style Sheets (CSS) standards. It is the successor to the earlier Protovis framework. Its development was noted in 2011, as version 2.0.0 was released in August 2011. With the release of version 4.0.0 in June 2016, D3 was changed from a single library into a collection of smaller, modular libraries that can be used independently.

t-distributed stochastic neighbor embedding: Technique for dimensionality reduction

t-distributed stochastic neighbor embedding (t-SNE) is a statistical method for visualizing high-dimensional data by giving each datapoint a location in a two or three-dimensional map. It is based on Stochastic Neighbor Embedding originally developed by Geoffrey Hinton and Sam Roweis, where Laurens van der Maaten and Hinton proposed the t-distributed variant. It is a nonlinear dimensionality reduction technique for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points with high probability.

Michael Bostock is an American computer scientist and data visualization specialist. He is one of the co-creators of Observable and a key developer of D3.js, a JavaScript library used to produce dynamic, interactive data visualizations for web browsers. He also contributed to the preceding Protovis framework.

Jeffrey Heer: American computer scientist

Jeffrey Michael Heer is an American computer scientist best known for his work on information visualization and interactive data analysis. He is a professor of computer science & engineering at the University of Washington, where he directs the UW Interactive Data Lab. He co-founded Trifacta with Joe Hellerstein and Sean Kandel in 2012.

The IEEE Visualization Conference (VIS) is an annual conference on scientific visualization, information visualization, and visual analytics administered by the IEEE Computer Society Technical Committee on Visualization and Graphics. As ranked by Google Scholar's h-index metric in 2016, VIS is the highest rated venue for visualization research and the second-highest rated conference for computer graphics overall. It has an 'A' rating from the Australian Ranking of ICT Conferences, an 'A' rating from the Brazilian ministry of education, and an 'A' rating from the China Computer Federation (CCF). The conference is highly selective, with acceptance rates generally below 25% for all papers.

Studierfenster

Studierfenster or StudierFenster (SF) is a free, non-commercial, open science, client/server-based online framework for medical image processing. It offers capabilities such as viewing medical data (computed tomography (CT), magnetic resonance imaging (MRI), etc.) in two- and three-dimensional space directly in standard web browsers such as Google Chrome, Mozilla Firefox, Safari, and Microsoft Edge. Other functions include the calculation of medical metrics (Dice score and Hausdorff distance), manual slice-by-slice outlining of structures in medical images (segmentation), manual placement of (anatomical) landmarks in medical image data, viewing medical data in virtual reality, facial reconstruction and registration of medical data for augmented reality, one-click showcases for COVID-19 and veterinary scans, and a Radiomics module.

Steven Mark Drucker is an American computer scientist who studies how to help people understand data, and communicate their insights to others. He is a partner at Microsoft Research, where he also serves as the research manager of the VIDA group. Drucker is an affiliate professor at the University of Washington Computer Science and Engineering Department.

UpSet plot: Data visualization method

UpSet plots are a data visualization method for showing set data with more than three intersecting sets. UpSet shows intersections in a matrix, with the rows of the matrix corresponding to the sets, and the columns to the intersections between these sets. The size of the sets and of the intersections are shown as bar charts.

References

  1. "vega Releases". Github.com.
  2. Wilkinson, Leland (1999). The Grammar of Graphics. New York: Springer. ISBN 9780387987743.
  3. Satyanarayan, Arvind; Moritz, Dominik; Wongsuphasawat, Kanit; Heer, Jeffrey (2017). "Vega-Lite: A Grammar of Interactive Graphics". IEEE Transactions on Visualization and Computer Graphics. 23 (1): 341–350. doi:10.1109/TVCG.2016.2599030. PMID 27875150. S2CID 206805969.
  4. Wongsuphasawat, Kanit; Moritz, Dominik; Anand, Anushka; MacKinlay, Jock; Howe, Bill; Heer, Jeffrey (2016). "Voyager: Exploratory Analysis via Faceted Browsing of Visualization Recommendations". IEEE Transactions on Visualization and Computer Graphics. 22 (1): 649–658. doi:10.1109/TVCG.2015.2467191. PMID 26390469. S2CID 2366653.
  5. Wongsuphasawat, Kanit; Qu, Zening; Moritz, Dominik; Chang, Riley; Ouk, Felix; Anand, Anushka; MacKinlay, Jock; Howe, Bill; Heer, Jeffrey (2017). "Voyager 2". Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems. pp. 2648–2659. doi:10.1145/3025453.3025768. ISBN 9781450346559. S2CID 14999239.
  6. Vanderplas, Jacob; Granger, Brian; Heer, Jeffrey; Moritz, Dominik; Wongsuphasawat, Kanit; Satyanarayan, Arvind; Lees, Eitan; Timofeev, Ilia; Welsh, Ben; Sievert, Scott (2018). "Altair: Interactive Statistical Visualizations for Python". Journal of Open Source Software. 3 (32): 1057. Bibcode:2018JOSS....3.1057V. doi:10.21105/joss.01057.
  7. "Vega: A Visualization Grammar". Vega.