GGobi

Developer Deborah F. Swayne, Michael Lawrence, Hadley Wickham, Duncan Temple Lang, Di Cook, Heike Hofmann and Andreas Buja
Stable release 2.1.10.a / 10 June 2012
OS Windows, OS X, Linux
License GNU GPL, BSD, CPL [1]
Website www.ggobi.org

GGobi is a free statistical software tool for interactive data visualization. It supports extensive exploration of data with interactive dynamic graphics and is designed for looking at multivariate data. R can be used in sync with GGobi (through rggobi). GGobi can also be embedded as a library in other programs and packages through an application programming interface (API), either integrated into a stand-alone application or as an add-on to existing languages and scripting environments, e.g., from the R command line or from Perl or Python scripts. A distinguishing feature of GGobi is its ability to link multiple graphs together. [2]

Overview

GGobi was created to look at data matrices; its designers were interested in exploring multi-dimensional data. The program went through several name changes before the developers settled on GGobi (a combination of the words GTK+ and the Gobi Desert). The original concept, Dataviewer, began in the mid-1980s, and a predecessor, XGobi, began in 1989. Work on the current version of GGobi began in 1999. The main reason for the different versions was the change in technology. [3] The current version for MS Windows is 2.1.10a (12 March 2010), with an update for 64-bit usage from 10 June 2012.

GGobi is free software, released under a combination of three free software licenses (GNU GPL, BSD, and CPL). [1]

GGobi Topics

Figure (Ggobi-flea1.png): A projection from a 2D tour of 6D data, in which three clusters are visible. Two points are highlighted in yellow and appear in the same color in other plots.
Figure (Ggobi-flea2.png): A parallel coordinate plot linked to the scatterplot, showing traces of the two points highlighted in the scatterplot.
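
The projection step behind a single frame of a 2D tour can be sketched as follows. This is a minimal illustration, not GGobi's implementation: it assumes a frame is a pair of orthonormal 6D vectors (built here with Gram-Schmidt from random vectors) and that each data row is projected by taking its dot product with each vector. The function names and the random data are hypothetical.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def random_frame(dim, rng):
    """Return two orthonormal vectors of length `dim` (one 2D projection frame)."""
    a = normalize([rng.gauss(0, 1) for _ in range(dim)])
    b = [rng.gauss(0, 1) for _ in range(dim)]
    # Gram-Schmidt: remove the component of b along a, then normalize.
    along_a = dot(a, b)
    b = normalize([y - along_a * x for x, y in zip(a, b)])
    return a, b


def project(rows, frame):
    """Project each high-dimensional row onto the two frame vectors."""
    a, b = frame
    return [(dot(row, a), dot(row, b)) for row in rows]


rng = random.Random(0)
frame = random_frame(6, rng)
# Hypothetical 6D data: five rows of standard normal values.
data = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(5)]
points_2d = project(data, frame)
```

A tour would show a sequence of such frames, so that structure hidden in one projection (such as the three clusters in the figure) can become visible in another.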

Importance of graphics

Looking at data through various graphs can reveal more about its distribution than the raw numbers or a summary of them. The tools within GGobi can uncover clusters, non-linear structure, outliers, and other important variation in the data. In this way GGobi supports exploratory data analysis of multi-dimensional data.

Supported data sources

GGobi can read CSV and XML file types. [4]
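
As a rough sketch of preparing data in one of these formats, the following writes a small data matrix as CSV text using Python's standard `csv` module. It assumes a layout with one header row of variable names followed by one row per case; the exact conventions GGobi expects are not described here, and the variable names and values are hypothetical.

```python
import csv
import io


def to_csv_text(variable_names, rows):
    """Serialize a data matrix as CSV: a header row, then one line per case."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(variable_names)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data: two numeric variables and one categorical variable.
csv_text = to_csv_text(
    ["length", "width", "species"],
    [[1.5, 0.4, "a"], [2.1, 0.7, "b"]],
)
```

The resulting text can be saved to a `.csv` file and opened from a visualization tool.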

Types of graphics

Interactions

These tools can be used to pick out special points or clusters of data.

Brushing: as the brush moves over a point, the point is highlighted. If "persistent" is selected, the points the brush has moved over remain "painted".
Identification: as the cursor moves over a point, a label or variable value appears at the top of the plot window.
Linking: multiple plots are linked, so identifying a point in one plot identifies the same point in all other plots, and brushing a group of points in one plot highlights the same points in the other plots. The linking can be one-to-one, or according to the values of a categorical variable in the data set.
Moving points: points in a plot can be moved interactively, e.g. to gauge results from multidimensional scaling.
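
The brushing and linking behavior described above can be sketched in a few lines. This is an illustrative model only, not GGobi's code: it assumes points are identified by row index, that every plot shows the same rows, and that a categorical variable is stored as a list with one value per row. All function and variable names are hypothetical.

```python
def update_painted(painted, under_brush, persistent):
    """Return the painted rows after the brush moves.

    With persistent brushing, previously painted rows stay painted;
    otherwise only the rows currently under the brush are highlighted.
    """
    return set(painted) | set(under_brush) if persistent else set(under_brush)


def linked_one_to_one(brushed):
    """One-to-one linking: the same rows are highlighted in every plot."""
    return set(brushed)


def linked_by_category(brushed, categories):
    """Categorical linking: highlight every row that shares a category
    with at least one brushed row."""
    selected = {categories[i] for i in brushed}
    return {i for i, value in enumerate(categories) if value in selected}


# Hypothetical categorical variable for five rows.
categories = ["a", "a", "b", "c", "b"]

transient = update_painted({0}, {2}, persistent=False)
persistent = update_painted({0}, {2}, persistent=True)
same_rows = linked_one_to_one({2})
same_group = linked_by_category({2}, categories)
```

Here brushing row 2 highlights only row 2 under one-to-one linking, but rows 2 and 4 under categorical linking, because both belong to category "b".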

References

  1. "GGobi licences page".
  2. XGobi is listed on Michael Friendly's Milestones of Statistical Graphics webpage (archived 2014-04-14 at the Wayback Machine).
  3. The history of GGobi
  4. XML - XML format for ggobi
