VisTrails

Developer(s): University of Utah, NYU-Poly
Final release: 2.2.4 / May 3, 2016
Repository: https://github.com/VisTrails/VisTrails
Written in: Python
Operating system: Cross-platform
Type: Scientific workflow management; Scientific visualization
License: 3-clause BSD License [1]
Website: www.vistrails.org

VisTrails is a scientific workflow management system developed at the Scientific Computing and Imaging Institute at the University of Utah that provides support for data exploration and visualization. It is written in Python and uses Qt through the PyQt bindings. The system is open source, released under the 3-clause BSD license. [1] The pre-compiled versions for Windows, Mac OS X, and Linux come with an installer and several packages, including VTK, matplotlib, and ImageMagick. VisTrails also supports user-defined packages.

Overview

VisTrails is a system that provides provenance management support for exploratory computational tasks. It combines features of workflow and visualization systems. Like workflow systems, it allows the combination of loosely coupled resources, specialized libraries, and grid and Web services. Like some visualization systems, it provides a mechanism for parameter exploration and comparison of different results. Unlike these other systems, however, VisTrails was designed to manage exploratory processes in which computational tasks evolve over time as a user iteratively formulates and tests hypotheses. A key distinguishing feature of VisTrails is its comprehensive provenance infrastructure, which maintains detailed history information about the steps followed in the course of an exploratory task. VisTrails uses this information to provide operations and user interfaces that streamline the exploratory process.
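The idea of keeping a history of the steps in an exploratory task can be illustrated with a small sketch. It assumes, for illustration only, that the history is stored as a tree in which each version records one edit to its parent, and that any version of a workflow is rebuilt by replaying the edits from the root. All names here (`History`, `Version`, `add_module`, `set_param`) are hypothetical and are not the VisTrails API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical model: a workflow is a mapping of module name -> parameters.
Workflow = Dict[str, dict]


@dataclass
class Version:
    """One node in the history tree: a single edit applied to its parent."""
    id: int
    parent: Optional[int]
    note: str
    apply: Callable[[Workflow], None]


@dataclass
class History:
    versions: Dict[int, Version] = field(default_factory=dict)

    def commit(self, parent: Optional[int], note: str,
               apply: Callable[[Workflow], None]) -> int:
        """Record an edit as a new child of `parent` and return its id."""
        vid = len(self.versions) + 1
        self.versions[vid] = Version(vid, parent, note, apply)
        return vid

    def path_to(self, vid: Optional[int]) -> List[Version]:
        """Return the chain of versions from the root down to `vid`."""
        path = []
        while vid is not None:
            version = self.versions[vid]
            path.append(version)
            vid = version.parent
        return list(reversed(path))

    def materialize(self, vid: int) -> Workflow:
        """Rebuild a workflow by replaying every edit on the path to `vid`."""
        workflow: Workflow = {}
        for version in self.path_to(vid):
            version.apply(workflow)
        return workflow


def add_module(name: str, **params) -> Callable[[Workflow], None]:
    def edit(workflow: Workflow) -> None:
        workflow[name] = dict(params)
    return edit


def set_param(name: str, key: str, value) -> Callable[[Workflow], None]:
    def edit(workflow: Workflow) -> None:
        workflow[name][key] = value
    return edit


history = History()
v1 = history.commit(None, "add reader", add_module("reader", path="data.vtk"))
v2 = history.commit(v1, "add isosurface", add_module("isosurface", level=0.5))
# Two alternative hypotheses branch from the same parent version.
v3 = history.commit(v2, "try higher level", set_param("isosurface", "level", 0.8))
v4 = history.commit(v2, "try lower level", set_param("isosurface", "level", 0.2))

print(history.materialize(v3)["isosurface"])
print(history.materialize(v4)["isosurface"])
print([v.note for v in history.path_to(v4)])
```

Because no edit is overwritten, both branches remain available for later comparison, and the path to any version doubles as a record of how that result was produced.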

VisTrails was developed for exploratory visualization, [2] but the system is general and also provides functionality in other areas.

History

VisTrails is the result of a collaborative effort between computer scientists Cláudio Silva and Juliana Freire. Graduate students at the University of Utah began initial development in 2004. Although the first prototypes were implemented in C++, the current version of VisTrails is written in Python. The first public release was in September 2007.

Functionality

A common use for VisTrails is scientific visualization. Visualizations generated as part of a workflow are rendered in a spreadsheet-style interface, allowing multiple visualizations from different versions of a workflow to be viewed and compared simultaneously. The VisTrails spreadsheet currently supports VTK and HTML rendering.

VisTrails supports four basic modes, or views. Each view interacts with the underlying workflow in a different way.

Commercial variants

In 2007, the University of Utah formed VisTrails, Inc., a spinoff company intended to commercialize VisTrails technology. Development for the free version of VisTrails is currently funded by the University of Utah and VisTrails, Inc. The company's first product is a plugin for the 3D modeling software Maya. [8] While the main VisTrails distribution is free software, the VisTrails plugin for Maya is distributed under a closed-source/proprietary license.

References

  1. "LICENSE file in code repository". github.com.
  2. Cláudio T. Silva, Juliana Freire, and Steven Callahan. "Provenance for Visualizations: Reproducibility and Beyond" (PDF). Computing in Science & Engineering, 9(5), pp. 82-90, 2007.
  3. Juliana Freire, David Koop, Emanuele Santos, and Cláudio T. Silva. "Provenance for Computational Tasks: A Survey" (PDF). Computing in Science & Engineering, 10(3), pp. 11-21, 2008.
  4. Carlos E. Scheidegger, David Koop, Emanuele Santos, Huy T. Vo, Steven P. Callahan, Juliana Freire, and Cláudio T. Silva. "Tackling the Provenance Challenge one layer at a time" (PDF). Concurrency and Computation: Practice and Experience, 20(5), pp. 473-483, 2008.
  5. Carlos E. Scheidegger, Huy T. Vo, David Koop, Juliana Freire, and Cláudio T. Silva. "Querying and Creating Visualizations by Analogy" (PDF). IEEE Transactions on Visualization and Computer Graphics, 13(6), pp. 1560-1567, 2007.
  6. Tommy Ellkvist, David Koop, Erik Anderson, Juliana Freire, and Cláudio T. Silva. "Using Provenance to Support Real-Time Collaborative Design of Workflows" (PDF). Proceedings of International Provenance and Annotation Workshop (IPAW), 2008.
  7. Louis Bavoil, Steven P. Callahan, Patricia J. Crossno, Juliana Freire, Carlos E. Scheidegger, Cláudio T. Silva, and Huy T. Vo. "VisTrails: Enabling Interactive Multiple-View Visualizations" (PDF). Proceedings of IEEE Visualization, pp. 135-142, 2005.
  8. "Announcement on VisTrails, Inc. website". www.vistrails.com.