List of open-source bioinformatics software

This is a list of computer software that is made for bioinformatics, is released under an open-source software license, and has an article on Wikipedia.


| Software | Description | Platform | License | Developer |
|---|---|---|---|---|
| .NET Bio | Language-neutral toolkit built using the Microsoft 4.0 .NET Framework to help developers, researchers, and scientists | .NET Framework | Apache | Collaborative project |
| AMPHORA | Metagenomics analysis software | Linux | GPL | ? |
| Anduril | Component-based workflow framework for data analysis | Linux, macOS, Windows | GPL | University of Helsinki |
| Ascalaph Designer | General-purpose molecular modelling program for molecular design and simulations | ? | GPLv2 | Agile Molecule |
| AutoDock | Suite of automated docking tools | ? | GPL | ? |
| Avogadro | C++ (Qt) based molecule editor and visualizer for use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas | ? | GPL | ? |
| BEDtools | "Genome arithmetic": manipulation of coordinate sets and extraction of sequences from a BED file | Linux | MIT | Quinlan Lab, University of Utah |
| Bioclipse | Visual platform for chemo- and bioinformatics based on the Eclipse Rich Client Platform (RCP) | ? | Eclipse Public | The Bioclipse Project |
| Bioconductor | R language toolkit | Linux, macOS, Windows | Artistic 2.0 | Fred Hutchinson Cancer Research Center |
| BioJava | Java library functions for manipulating sequences, protein structures, file parsers, CORBA interoperability, Distributed Annotation System (DAS), access to AceDB, dynamic programming, and simple statistical routines | Linux, macOS, Windows | LGPL v2.1 | Open Bioinformatics Foundation |
| BioJS | JavaScript library of components to visualize biological data | Web browser | Apache | ? |
| BioMOBY | Registry of web services | Web browser | Artistic | Open Bioinformatics Foundation |
| BioPerl | Perl language toolkit | Cross-platform | Artistic, GPL | Open Bioinformatics Foundation |
| BioPHP | PHP language toolkit with classes for DNA and protein sequence analysis, alignment, database parsing, and other bioinformatics tools | Cross-platform | GPL v2 | Open Bioinformatics Foundation |
| Biopython | Python language toolkit | Cross-platform | Biopython [1] | Open Bioinformatics Foundation |
| BioRuby | Ruby language toolkit | ? | GPL v2 or Ruby | Open Bioinformatics Foundation |
| BLAST | Algorithm and program for comparing primary biological sequence information, including DNA and protein sequences | Cross-platform | Public domain | National Center for Biotechnology Information |
| CP2K | Atomistic simulations of solid state, liquid, molecular and biological systems; written in Fortran 2003 | ? | GPL and LGPL (GNU GPLv2 or later) | ? |
| EMBOSS | Suite of packages for sequencing, searching, etc., written in C | ? | GPL and LGPL | Collaborative project |
| Galaxy | Scientific workflow and data integration system | Unix-like | Academic Free | Collaborative project |
| GenePattern | Scientific workflow system that provides access to hundreds of genomic analysis tools | Unix-like (public server); Linux, macOS, Windows | MIT | Broad Institute, UC San Diego |
| geWorkbench | Genomic data integration platform | Linux, macOS, Windows | GeWorkbench License [2] | Columbia University |
| GMOD | Toolkit to address many common challenges at biological databases | Unix-like (server), Web browser (client) | Varies depending on tool | Collaborative project |
| GenGIS | Application that allows combining digital map data with information about biological sequences collected from the environment | Windows, macOS | GPL | Collaborative project |
| GenomeSpace | Centralized web application that provides data format transformations and facilitates connections with other bioinformatics tools | Web browser | LGPL | Broad Institute, collaborative project |
| GENtle | Equivalent to the proprietary Vector NTI, a tool to analyze and edit DNA sequence files | ? | GPL | Magnus Manske |
| GROMACS | Molecular dynamics package mainly designed for simulations of proteins, lipids and nucleic acids | Linux, macOS, Windows | LGPL | ? |
| Integrated Genome Browser | Java-based desktop genome browser | Linux, macOS, Windows | Common Public 1.0 | GenoViz |
| InterMine | Extensive data warehouse system for the analysis and integration of biological datasets, written in Java and JavaScript | Cross-platform | LGPL | University of Cambridge |
| LabKey Server | Software platform that allows organizations to integrate, analyze, and share complex biomedical data | Linux, macOS, Windows | Apache | LabKey Software Foundation |
| LAMMPS | Molecular dynamics program written in C++ | Linux, macOS, Windows | GPLv2 | Sandia National Laboratories |
| mothur | Software for analysis of the 16S rRNA gene | Linux, macOS, Windows | ? | University of Michigan |
| PathVisio | Desktop software for drawing, analyzing, and visualizing biological pathways | Linux, macOS, Windows | Apache 2.0 | Maastricht University |
| Orange | Component-based data mining and machine learning software suite written in C++, featuring a visual programming front-end for exploratory data analysis and interactive visualization, and Python bindings and libraries for scripting | Linux, macOS, Windows | GPL | University of Ljubljana |
| SAMtools | Utilities for interacting with high-throughput sequencing data and alignments in SAM/BAM format | Unix/Linux | MIT | Collaborative project |
| SOAP Suite | Suite of tools for assembly, alignment, and analysis of short-read next-generation sequencing data | Unix/Linux, macOS | GPL | BGI |
| Staden Package | Sequence assembly, editing, and analysis, mainly consisting of gap4, gap5, and spin; written in C, C++, Fortran and Tcl | Linux, macOS, Windows | BSD | Wellcome Trust Sanger Institute, Medical Research Council |
| Taverna workbench | Tool to design and execute workflows | Linux, macOS, Windows | LGPL | myGrid |
| T-REX (web server) | Inference, validation and visualization of phylogenetic trees and phylogenetic networks | Web browser, Windows | ? | Université du Québec à Montréal |
| UGENE | Integrated bioinformatics tools, written in C++ (Qt) | Linux, macOS, Windows | GPL 2 | Unipro |
| Unipept | Metaproteomics biodiversity analysis, written in Ruby and JavaScript | Web browser | MIT | Ghent University |
| VOTCA | Coarse-grained modeling package for molecular dynamics, written in C++, Perl, and Bash | Linux, macOS, Windows, any other Unix variety | Apache License 2.0 | Max Planck Institute for Polymer Research |
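
The "genome arithmetic" mentioned in the BEDtools entry can be illustrated with a small sketch. The code below is not BEDtools and does not use its interface; it is a plain-Python illustration of one such operation, intersecting two sets of coordinates. The function names `parse_bed` and `intersect` are hypothetical, and the sketch assumes that only the first three BED columns (chromosome, start, end) matter and that coordinates are 0-based and half-open, as in the BED format.

```python
from collections import defaultdict
from typing import List, Tuple

# (chromosome, start, end); BED-style, 0-based and half-open
Interval = Tuple[str, int, int]


def parse_bed(text: str) -> List[Interval]:
    """Parse the first three columns of BED-formatted text (hypothetical helper)."""
    intervals = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        intervals.append((chrom, int(start), int(end)))
    return intervals


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping part of every pair of intervals from a and b."""
    # Group the second set by chromosome so only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))

    result = []
    for chrom, a_start, a_end in a:
        for b_start, b_end in by_chrom[chrom]:
            start, end = max(a_start, b_start), min(a_end, b_end)
            # Half-open coordinates: intervals that merely touch do not overlap.
            if start < end:
                result.append((chrom, start, end))
    return sorted(result)
```

For example, intersecting the interval chr1:100-200 with chr1:150-350 yields chr1:150-200, while chr2:50-80 and chr2:80-120 share no bases and yield nothing. The nested loop is quadratic per chromosome, which keeps the sketch short; tools built for real genome-scale data use sorted sweeps or interval indexes instead.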

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. It combines biology, chemistry, physics, computer science, information engineering, mathematics and statistics to analyze and interpret biological data. Bioinformatics has been used for in silico analyses of biological queries using computational and statistical techniques.

Computational biology: Branch of biology

Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the field also has foundations in applied mathematics, chemistry, and genetics. It differs from biological computing, a subfield of computer engineering which uses bioengineering to build computers.

The Open Bioinformatics Foundation is a non-profit, volunteer-run organization focused on supporting open source programming in bioinformatics. The mission of the foundation is to support the development of open source toolkits for bioinformatics, organise developer-centric hackathon events and generally assist in the development and promotion of open source software development in the life sciences. The foundation also organises and runs the annual Bioinformatics Open Source Conference, a satellite meeting of the Intelligent Systems for Molecular Biology conference. The foundation participates in the Google Summer of Code, acting as an umbrella organisation for individual bioinformatics-related projects.

Visual Molecular Dynamics: Visualization and modelling software

Visual Molecular Dynamics (VMD) is a molecular modelling and visualization computer program. VMD is developed mainly as a tool to view and analyze the results of molecular dynamics simulations. It also includes tools for working with volumetric data, sequence data, and arbitrary graphics objects. Molecular scenes can be exported to external rendering tools such as POV-Ray, RenderMan, Tachyon, Virtual Reality Modeling Language (VRML), and many others. Users can run their own Tcl and Python scripts within VMD, as it includes embedded Tcl and Python interpreters. VMD runs on Unix, macOS, and Microsoft Windows. VMD is available to non-commercial users under a distribution-specific license which permits both use of the program and modification of its source code, at no charge.

RasMol: Software for the visualisation of macromolecules

RasMol is a computer program for molecular graphics visualization, intended and used mainly to depict and explore biological macromolecule structures, such as those found in the Protein Data Bank. It was originally developed by Roger Sayle in the early 1990s.

PyMOL: Proprietary, partly open-source Python-based tool for visualising biological structures

PyMOL is an open-source but proprietary molecular visualization system created by Warren Lyford DeLano. It was commercialized initially by DeLano Scientific LLC, a private software company dedicated to creating useful tools that become universally accessible to scientific and educational communities. It is currently commercialized by Schrödinger, Inc. Because the original software license was permissive, the company was able to drop it: new versions are no longer released under the Python license but under a custom license, and some of the source code is no longer released. PyMOL can produce high-quality 3D images of small molecules and biological macromolecules, such as proteins. According to the original author, by 2009, almost a quarter of all published images of 3D protein structures in the scientific literature were made using PyMOL.

Bioconductor is a free, open source and open development software project for the analysis and comprehension of genomic data generated by wet lab experiments in molecular biology.

Jmol: Open-source Java viewer for 3D chemical structures

Jmol is computer software for molecular modelling of chemical structures in three dimensions. Jmol returns a 3D representation of a molecule that may be used as a teaching tool or for research, e.g., in chemistry and biochemistry. It is written in the programming language Java, so it can run on Windows, macOS, Linux, and Unix if Java is installed. It is free and open-source software released under the GNU Lesser General Public License (LGPL) version 2.0. A standalone application and a software development kit (SDK) exist; the SDK can be integrated into other Java applications, such as Bioclipse and Taverna.

Chemistry Development Kit: Computer software

The Chemistry Development Kit (CDK) is computer software, a library in the programming language Java, for chemoinformatics and bioinformatics. It is available for Windows, Linux, Unix, and macOS. It is free and open-source software distributed under the GNU Lesser General Public License (LGPL) 2.0.

BALL

BALL is a C++ class framework and set of algorithms and data structures for molecular modelling and computational structural bioinformatics, a Python interface to this library, and a graphical user interface to BALL, the molecule viewer BALLView.

Physiomics is the systematic study of the physiome in biology. Physiomics employs bioinformatics to construct networks of physiological features that are associated with genes, proteins and their networks. A few of the methods for determining individual relationships between the DNA sequence and physiological function include metabolic pathway engineering and RNAi analysis. The relationships derived from methods such as these are organized and processed computationally to form distinct networks. Computer models use these experimentally determined networks to develop further predictions of gene function.

Avogadro (software)

Avogadro is a molecule editor and visualizer designed for cross-platform use in computational chemistry, molecular modeling, bioinformatics, materials science, and related areas. It is extensible via a plugin architecture.

Biology data visualization is a branch of bioinformatics concerned with the application of computer graphics, scientific visualization, and information visualization to different areas of the life sciences. This includes visualization of sequences, genomes, alignments, phylogenies, macromolecular structures, systems biology, microscopy, and magnetic resonance imaging data. Software tools used for visualizing biological data range from simple, standalone programs to complex, integrated systems.

geWorkbench is an open-source software platform for integrated genomic data analysis. It is a desktop application written in the programming language Java. geWorkbench uses a component architecture. As of 2016, there are more than 70 plug-ins available, providing for the visualization and analysis of gene expression, sequence, and structure data.

Pharmaceutical bioinformatics is a research field related to bioinformatics but focused on studying biological and chemical processes in the pharmaceutical area, with the aim of understanding how xenobiotics interact with the human body and the drug discovery process.


  1. Biopython License
  2. "GeWorkbench License". geWorkbench. Columbia University. 15 June 2014.