In information visualization and computing, treemapping is a method for displaying hierarchical data using nested figures, usually rectangles.
Treemaps display hierarchical (tree-structured) data as a set of nested rectangles. Each branch of the tree is given a rectangle, which is then tiled with smaller rectangles representing sub-branches. A leaf node's rectangle has an area proportional to a specified dimension of the data. [1] Often the leaf nodes are colored to show a separate dimension of the data.
When the color and size dimensions are correlated in some way with the tree structure, one can often easily see patterns that would be difficult to spot in other ways, such as whether a certain color is particularly prevalent. A second advantage of treemaps is that, by construction, they make efficient use of space. As a result, they can legibly display thousands of items on the screen simultaneously.
To create a treemap, one must define a tiling algorithm, that is, a way to divide a region into sub-regions of specified areas. Ideally, a treemap algorithm would create regions that satisfy the following criteria:
These properties have an inverse relationship. As the aspect ratio is optimized, the order of placement becomes less predictable. As the order becomes more stable, the aspect ratio is degraded.[ example needed ]
To date, fifteen primary rectangular treemap algorithms have been developed:
Algorithm | Order | Aspect ratios | Stability |
---|---|---|---|
BinaryTree | partially ordered | high | stable |
Slice And Dice [4] | ordered | very high | stable |
Strip [5] | ordered | medium | medium stability |
Pivot by middle [6] | ordered | medium | medium stability |
Pivot by split [6] | ordered | medium | low stability |
Pivot by size [6] | ordered | medium | medium stability |
Split [7] | ordered | medium | medium stability |
Spiral [8] | ordered | medium | medium stability |
Hilbert [9] | ordered | medium | medium stability |
Moore [9] | ordered | medium | medium stability |
Squarified [10] | ordered | low | low stability |
Mixed Treemaps [11] | unordered | low | medium stability |
Approximation [12] | unordered | low | medium stability |
Git [13] | unordered | medium | stable |
Local moves [14] | unordered | medium | stable |
Rectangular treemaps have the disadvantage that their aspect ratio might be arbitrarily high in the worst case. As a simple example, if the tree root has only two children, one with weight and one with weight , then the aspect ratio of the smaller child will be , which can be arbitrarily high. To cope with this problem, several algorithms have been proposed that use regions that are general convex polygons, not necessarily rectangular.
Convex treemaps were developed in several steps, each step improved the upper bound on the aspect ratio. The bounds are given as a function of - the total number of nodes in the tree, and - the total depth of the tree.
The latter two algorithms operate in two steps (greatly simplified for clarity):
In convex treemaps, the aspect ratio cannot be constant - it grows with the depth of the tree. To attain a constant aspect-ratio, Orthoconvex treemaps [17] can be used. There, all regions are orthoconvex rectilinear polygons with aspect ratio at most 64; and the leaves are either rectangles with aspect ratio at most 8, or L-shapes or S-shapes with aspect ratio at most 32.
For the special case where the depth is 1, they present an algorithm that uses only rectangles and L-shapes, and the aspect ratio is at most ; the internal nodes use only rectangles with aspect ratio at most .
Area-based visualizations have existed for decades. For example, mosaic plots (also known as Marimekko diagrams) use rectangular tilings to show joint distributions (i.e., most commonly they are essentially stacked column plots where the columns are of different widths). The main distinguishing feature of a treemap, however, is the recursive construction that allows it to be extended to hierarchical data with any number of levels. This idea was invented by professor Ben Shneiderman at the University of Maryland Human – Computer Interaction Lab in the early 1990s. [21] [22] Shneiderman and his collaborators then deepened the idea by introducing a variety of interactive techniques for filtering and adjusting treemaps.
These early treemaps all used the simple "slice-and-dice" tiling algorithm. Despite many desirable properties (it is stable, preserves ordering, and is easy to implement), the slice-and-dice method often produces tilings with many long, skinny rectangles. In 1994 Mountaz Hascoet and Michel Beaudouin-Lafon invented a "squarifying" algorithm, later popularized by Jarke van Wijk, that created tilings whose rectangles were closer to square. In 1999 Martin Wattenberg used a variation of the "squarifying" algorithm that he called "pivot and slice" to create the first Web-based treemap, the SmartMoney Map of the Market, which displayed data on hundreds of companies in the U.S. stock market. Following its launch, treemaps enjoyed a surge of interest, particularly in financial contexts.[ citation needed ]
A third wave of treemap innovation came around 2004, after Marcos Weskamp created the Newsmap, a treemap that displayed news headlines. This example of a non-analytical treemap inspired many imitators, and introduced treemaps to a new, broad audience.[ citation needed ] In recent years, treemaps have made their way into the mainstream media, including usage by the New York Times. [23] [24] The Treemap Art Project [25] produced 12 framed images for the National Academies (United States), shown at the Every AlgoRiThm has ART in It exhibit [26] in Washington, DC and another set for the collection of Museum of Modern Art in New York.
In computer science, a binary search tree (BST), also called an ordered or sorted binary tree, is a rooted binary tree data structure with the key of each internal node being greater than all the keys in the respective node's left subtree and less than the ones in its right subtree. The time complexity of operations on the binary search tree is linear with respect to the height of the tree.
A tree structure, tree diagram, or tree model is a way of representing the hierarchical nature of a structure in a graphical form. It is named a "tree structure" because the classic representation resembles a tree, although the chart is generally upside down compared to a biological tree, with the "stem" at the top and the "leaves" at the bottom.
Graph drawing is an area of mathematics and computer science combining methods from geometric graph theory and information visualization to derive two-dimensional depictions of graphs arising from applications such as social network analysis, cartography, linguistics, and bioinformatics.
R-trees are tree data structures used for spatial access methods, i.e., for indexing multi-dimensional information such as geographical coordinates, rectangles or polygons. The R-tree was proposed by Antonin Guttman in 1984 and has found significant use in both theoretical and applied contexts. A common real-world usage for an R-tree might be to store spatial objects such as restaurant locations or the polygons that typical maps are made of: streets, buildings, outlines of lakes, coastlines, etc. and then find answers quickly to queries such as "Find all museums within 2 km of my current location", "retrieve all road segments within 2 km of my location" or "find the nearest gas station". The R-tree can also accelerate nearest neighbor search for various distance metrics, including great-circle distance.
In computer science, a k-d tree is a space-partitioning data structure for organizing points in a k-dimensional space. K-dimensional is that which concerns exactly k orthogonal axes or a space of any number of dimensions. k-d trees are a useful data structure for several applications, such as:
Marching cubes is a computer graphics algorithm, published in the 1987 SIGGRAPH proceedings by Lorensen and Cline, for extracting a polygonal mesh of an isosurface from a three-dimensional discrete scalar field. The applications of this algorithm are mainly concerned with medical visualizations such as CT and MRI scan data images, and special effects or 3-D modelling with what is usually called metaballs or other metasurfaces. The marching cubes algorithm is meant to be used for 3-D; the 2-D version of this algorithm is called the marching squares algorithm.
Ben Shneiderman is an American computer scientist, a Distinguished University Professor in the University of Maryland Department of Computer Science, which is part of the University of Maryland College of Computer, Mathematical, and Natural Sciences at the University of Maryland, College Park, and the founding director (1983-2000) of the University of Maryland Human-Computer Interaction Lab. He conducted fundamental research in the field of human–computer interaction, developing new ideas, methods, and tools such as the direct manipulation interface, and his eight rules of design.
Fernanda Bertini Viégas is a Brazilian computer scientist and graphical designer, whose work focuses on the social, collaborative and artistic aspects of information visualization.
In fractal geometry, the H tree is a fractal tree structure constructed from perpendicular line segments, each smaller by a factor of the square root of 2 from the next larger adjacent segment. It is so called because its repeating pattern resembles the letter "H". It has Hausdorff dimension 2, and comes arbitrarily close to every point in a rectangle. Its applications include VLSI design and microwave engineering.
Martin M. Wattenberg is an American scientist and artist known for his work with data visualization. He is currently the Gordon McKay Professor of Computer Science at the Harvard University School of Engineering and Applied Sciences.
Jarke J. (Jack) van Wijk is a Dutch computer scientist, a professor in the Department of Mathematics and Computer Science at the Eindhoven University of Technology, and an expert in information visualization.
In the mathematical fields of numerical analysis and approximation theory, box splines are piecewise polynomial functions of several variables. Box splines are considered as a multivariate generalization of basis splines (B-splines) and are generally used for multivariate approximation/interpolation. Geometrically, a box spline is the shadow (X-ray) of a hypercube projected down to a lower-dimensional space. Box splines and simplex splines are well studied special cases of polyhedral splines which are defined as shadows of general polytopes.
Jean-Daniel Fekete is a French computer scientist.
An arc diagram is a style of graph drawing, in which the vertices of a graph are placed along a line in the Euclidean plane, with edges being drawn as semicircles in one or both of the two halfplanes bounded by the line, or as smooth curves formed by sequences of semicircles. In some cases, line segments of the line itself are also allowed as edges, as long as they connect only vertices that are consecutive along the line. Variations of this drawing style in which the semicircles are replaced by convex curves of some other type are also commonly called arc diagrams.
NodeXL is a network analysis and visualization software package for Microsoft Excel 2007/2010/2013/2016. The package is similar to other network visualization tools such as Pajek, UCINet, and Gephi. It is widely applied in ring, mapping of vertex and edge, and customizable visual attributes and tags. NodeXL enables researchers to undertake social network analysis work metrics such as centrality, degree, and clustering, as well as monitor relational data and describe the overall relational network structure. When applied to Twitter data analysis, it showed the total network of all users participating in public discussion and its internal structure through data mining. It allows social Network analysis (SNA) to emphasize the relationships rather than the isolated individuals or organizations, allowing interested parties to investigate the two-way dialogue between organizations and the public. SNA also provides a flexible measurement system and parameter selection to confirm the influential nodes in the network, such as in-degree and out-degree centrality. The software contains network visualization, social network analysis features, access to social media network data importers, advanced network metrics, and automation.
A software map represents static, dynamic, and evolutionary information of software systems and their software development processes by means of 2D or 3D map-oriented information visualization. It constitutes a fundamental concept and tool in software visualization, software analytics, and software diagnosis. Its primary applications include risk analysis for and monitoring of code quality, team activity, or software development progress and, generally, improving effectiveness of software engineering with respect to all related artifacts, processes, and stakeholders throughout the software engineering process and software maintenance.
A chord diagram is a graphical method of displaying the inter-relationships between data in a matrix. The data are arranged radially around a circle with the relationships between the data points typically drawn as arcs connecting the data.
The IEEE Visualization Conference (VIS) is an annual conference on scientific visualization, information visualization, and visual analytics administrated by the IEEE Computer Society Technical Committee on Visualization and Graphics. As ranked by Google Scholar's h-index metric in 2016, VIS is the highest rated venue for visualization research and the second-highest rated conference for computer graphics over all. It has an 'A' rating from the Australian Ranking of ICT Conferences, an 'A' rating from the Brazilian ministry of education, and an 'A' rating from the China Computer Federation (CCF). The conference is highly selective with generally < 25% acceptance rates for all papers.
Jessica Hullman is a computer scientist and the Ginni Rometty professor of Computer Science at Northwestern University. She is known for her research in Information visualization and Uncertainty quantification.
Steven Mark Drucker is an American computer scientist who studies how to help people understand data, and communicate their insights to others. He is a partner at Microsoft Research, where he also serves as the research manager of the VIDA group. Drucker is an affiliate professor at the University of Washington Computer Science and Engineering Department.