List of alignment visualization software

This page is a subsection of the list of sequence alignment software.

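Multiple alignment visualization tools typically serve four purposes:

1. Aid general understanding of large-scale DNA or protein alignments
2. Visualize alignments for figures and publication
3. Manually edit and curate automatically generated alignments
4. Analysis in depth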

The rest of this article is focused on only multiple global alignments of homologous proteins. The first two purposes follow naturally from the fact that most representations of alignments and their annotation are not human-readable and are best portrayed in the familiar sequence row and alignment column format, examples of which are widespread in the literature. The third is necessary because algorithms for both multiple sequence alignment and structural alignment use heuristics that do not always perform perfectly. The fourth reflects how interactive graphical tools let a worker involved in sequence analysis conveniently run a variety of different computational tools to explore an alignment's phylogenetic implications, or to predict the structure and functional properties of a specific sequence, e.g., by comparative modelling.
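The sequence row and alignment column format can be illustrated with a short Python sketch. It is not taken from any of the tools listed here; the function name `format_alignment`, the 60-column default width, and the `*` marker for fully identical columns are assumptions made for illustration.

```python
def format_alignment(names, rows, width=60):
    """Render aligned sequences as wrapped blocks, one row per sequence.

    names and rows are parallel lists; every row must already be aligned
    (same length, with '-' marking gaps).
    """
    if len(names) != len(rows):
        raise ValueError("need exactly one name per sequence")
    if len({len(row) for row in rows}) > 1:
        raise ValueError("aligned rows must all have the same length")

    label_width = max(len(name) for name in names) + 2
    length = len(rows[0])
    blocks = []
    for start in range(0, length, width):
        chunks = [row[start:start + width] for row in rows]
        lines = [name.ljust(label_width) + chunk
                 for name, chunk in zip(names, chunks)]
        # Mark columns where every sequence carries the same non-gap residue.
        marks = "".join(
            "*" if len(set(column)) == 1 and column[0] != "-" else " "
            for column in zip(*chunks)
        )
        lines.append(" " * label_width + marks)
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Toy data, invented for the example.
    print(format_alignment(["seqA", "seqB"], ["MKT-AY", "MKTLAF"], width=4))
```

Wrapping into fixed-width blocks keeps each alignment column vertically aligned even when the alignment is longer than the display, which is the property that makes this layout readable.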

Alignment viewers, editors

| Name | Structure prediction tools integrated | Can align sequences | Can calculate phylogenetic trees | Other features | Format support | License | Can run on browser | Operating platforms | Link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Alan | No | No | No | Allows sequence alignments to be viewed quickly and directly in a Linux terminal without X-forwarding | FASTA, Clustal | Free, GPL 3 | No | Linux terminal | Official website |
| Ale (emacs plugin) | No | Yes | No | No | GenBank, EMBL, FASTA, PHYLIP | Free, GPL | No | GNU Emacs | Official website |
| AliView 2021 | No | MUSCLE integrated; other programs such as MAFFT can be defined | External programs such as FastTree can be called from within | Fast, easy navigation through unlimited mouse wheel zoom in-out feature. Handles unlimited file size alignments. Degenerate primer design. | FASTA, FASTQ, PHYLIP, Nexus, MSF, Clustal | Free, GPL 3 | ? | Cross-platform - Mac OS, Linux, Windows | Official website |
| alv | No | No | No | Console-based (no GUI), yet with colors. Coding DNA is coloured by codon. | FASTA, PHYLIP, Nexus, Clustal, Stockholm | Free, GPL 3 | No | Cross-platform | Official website, see also alv on GitHub |
| arb | structure editable, show bond in helix sequence regions, 2D molecule viewer | MUSCLE, MAFFT, ClustalW, ProbCons, FastAligner (region-align+auto-reference) | arb-parsimony & -NJ, RAxML, PHYML, Phylip, FastTree2, MrBayes | Edits huge alignments and trees. Supports NUCs + AA. Displays codons below DNA. Custom column highlighting (e.g. by conservation profiles). Designs, matches and visualizes probes. | FASTA, GenBank, EMBL, Newick | Proprietary, freeware, arb license, open modifiable source | No | Linux, Mac OS (homebrew) | Official website |
| Base-By-Base | No | MUSCLE | UPGMA, NJ, complete and single linkages, WPGMA | Visual summary, percent identity tables, some integrated advanced analysis tools | GenBank, FASTA, EMBL, Clustal, base-by-base files | Proprietary, freeware, must register | ? | ? | Official website |
| BioEdit | No | ClustalW | Rudimentary, can read PHYLIP | Plasmid drawing, ABI chromatograms | GenBank, FASTA, PHYLIP 3.2 and 4, NBRF-PIR | Proprietary, freeware | No | Windows (95/98/NT/2000/XP) | Official website |
| BioNumerics | No | Yes | Yes | ? | GenBank, FASTA | Proprietary, commercial | ? | ? | Official website |
| bioSyntax | No | No | No | Native syntax highlighting support for Vim, less, gedit and Sublime | FASTA, FASTQ, Clustal, SAM, VCF and more | Free, GPL 3 | No | Vim, Less, GEdit, & Sublime | Official website |
| BoxShade | No | No | No | Specifically for multiple alignments | MSF format as written by PILEUP, READSEQ, or SEQIO (fmtseq); ALN format as written by ClustalW | Free, public domain | No | MSDOS, VMS | Official website |
| CINEMA | No, but can read-show 2D structure annotations | ClustalW | No | Dotplot, 6 frame translation, Blast | Nexus, MSF, Clustal, FASTA, PHYLIP, PIR, PRINTS | Proprietary, freeware | No | Cross-platform - Mac OS, Linux, Windows | Official website |
| CLC viewer (free version) | Commercial version only | Clustal, MUSCLE, T-Coffee, MAFFT, Kalign, various | UPGMA, NJ | Workflows, blast-genbank search | many | Proprietary, freeware. More options available in commercial versions. | No | ? | Official website |
| CIAlign | No | No | No | Alignment visualisation as publication-ready images, alignment cleaning. | FASTA | Free, MIT | No | Linux, Windows, MacOS | Official website, publication |
| ClustalX viewer | No | ClustalW | NJ | Alignment quality analysis | Nexus, MSF, Clustal, FASTA, PHYLIP | Proprietary, freeware for academic use | No | Command line | Official website |
| Cylindrical Alignment App | No | No | No | 3D, animation, drilldown, legend selection | BLAST XML, proprietary XML, GFF3, ClustalW, INSDSet, user expandable with XSLT | Free, CDDL 1. Available for dual licensing. | ? | Cross-platform - Mac OS, Linux, Windows | Official website |
| Cylindrical BLAST Viewer | No | No | No | 3D, animation, drilldown, legend selection | BLAST XML, proprietary XML, GFF3, ClustalW, INSDSet, user expandable with XSLT | Free, GPL | ? | ? | Official website |
| DECIPHER | Yes | Yes | UPGMA, NJ, ML | Primer-Probe design, Chimera finding | FASTA, FASTQ, GenBank | Free, GPL | No | Mac OS, Windows | Official website |
| Discovery Studio | Yes | Align123, ClustalW, S-ALIGN | UPGMA, NJ, with bootstrap and best tree | Visualizer supports 2D and 3D structure and sequence; full version has comprehensive functionality for protein, nucleotides, more | BSML, EMBL, GB, HELM, Clustal, FASTA, GDE, PDB, SEQ, SPT, ... | Proprietary, commercial, Viewer is Freeware | ? | Linux, Windows | Official website |
| DnaSP | ? | ? | ? | Can compute several population genetics statistics, reconstruct haplotypes with PHASE | FASTA, Nexus, MEGA, PHYLIP | Proprietary, commercial, freeware for noncommercial use | ? | Cross-platform - Mac OS, Linux, Windows | Official website |
| DNASTAR Lasergene Molecular Biology Suite | Yes | Yes | Yes | Align DNA, RNA, protein, or DNA + protein sequences via a variety of pairwise and multiple sequence alignment algorithms, generate phylogenetic trees to predict evolutionary relationships, explore sequence tracks to view GC content, gap fraction, sequence logos, translation | ABI, DNA Multi-Seq, FASTA, GCG Pileup, GenBank, Phred | Proprietary, commercial, academic licenses available | ? | Mac OS, Windows | Official website |
| emacs - biomode | ? | ? | ? | ? | ? | Free, GPL | ? | ? | Official website |
| FLAK | No | Can perform fuzzy whole genome alignment | No | Very fast, highly customisable, visualisation is WYSIWYG with filtering and fuzzy options | FASTA | Proprietary, commercial, freeware for noncommercial use | ? | ? | Official website |
| Genedoc | No, but can read-show annotations | Pairwise | No, but can read-show annotations | gel simulation, stats, multiple views, simple | many | Proprietary, freeware | ? | ? | Official website, table of features |
| Geneious | Yes - powered by EMBOSS tools | Clustal, MUSCLE, MAUVE, profile, translation | UPGMA, NJ, PhyML, MrBayes plugin, PAUP* plugin | Whole genome assembly, restriction analysis, cloning, primer design, dotplot, much more | >40 file formats imported and exported | Proprietary, commercial; personal, floating | ? | Cross-platform - Mac, Windows, Linux | Official website |
| Integrated Genome Browser (IGB) | No | No | No | Sequences and features from files, URLs, and arbitrary DAS and QuickLoad servers | BAM, FASTA, PSL | Free, CPL | ? | Cross-platform - Mac, Windows, Linux | Official website |
| interactive Tree Of Life (iTOL) | No | No | ? | Phylogenetic tree viewer-annotation tool which can visualise alignments directly on the tree. Various other dataset types can be displayed in addition to alignments. | FASTA | Proprietary, free use | Yes | Browser | Official website |
| IVisTMSA | No | Clustal Omega, ClustalW2, MAFFT, MUSCLE, BioJava are integrated to construct alignment | Tree calculation tool calculates phylogenetic tree using BioJava API and lets user draw trees using Archaeopteryx | Software is package of 7 interactive visual tools for multiple sequence alignments. Major focus is manipulating large alignments. Includes MSApad, MSA comparator, MSA reconstruction tool, FASTA generator and MSA ID matrix calculator | ClustalW, MSF, PHYLIP, PIR, GDE, Nexus | Proprietary, freeware | ? | ? | www.ivistmsa.com |
| Jalview | Secondary structure prediction via JPred 4 | Clustal O, Clustal, GLprobs, MSAprobs, MUSCLE, MAFFT, Probcons, TCoffee, via web services | UPGMA, NJ | Sequences and features retrieved from user-configurable and publicly registered servers, e.g. EMBL, EBI, PDB, Pfam, Rfam, UniProt Accession retrieval. Structure/model data retrieval from PDB and 3D-Beacons including PDBe, AlphaFold DB, SWISS-MODEL. | FASTA, Pfam, MSF, Clustal, BLC, PIR, Stockholm, VCF, AMSA, BioJSON, ENA, GenBank, GFF2, GFF3, JnetFile, PHYLIP, PileUp, RNAML, CIF, mmCIF, PDB | Free, GPL | JalviewJS (JavaScript) | Cross-platform - macOS, Linux, Windows, other with Java Virtual Machine | Official website |
| Jevtrace | Integrated with structure viewer WebMol | No | No | A multivalent browser for sequence alignment, phylogeny, and structure. Performs an interactive Evolutionary Trace and other phylogeny inspired analysis. | FASTA, MSF, Clustal, PHYLIP, Newick, PDB | Proprietary, commercial, freeware for academic use | ? | Cross-platform - Mac OS, Linux, Windows | Official website, manual |
| JSAV | No | No | No | A JavaScript component allowing integrating an alignment viewer into web pages | An array of JavaScript objects | Free, GPL 2 | Yes | Browser | Official website |
| Lucid Align | No | No | No | Native desktop alignment viewer, uses trackpad/mouse gestures. Allows streaming remote data | BAM, FASTQ, FASTA | Proprietary, commercial, freeware for academic use | No | Mac OS | Official website |
| Maestro | Yes | ClustalX | Yes | Mapping from sequence to 3D structure, structure-sequence editing-modeling | Clustal, FASTA, PDB | Proprietary, freeware for academic use | ? | ? | Official website |
| MEGA | No | Native ClustalW | UPGMA, NJ, ME, MP, with bootstrap and confidence test | Extended support to phylogenetics analysis | FASTA, Clustal, Nexus, MEGA, etc. | Proprietary, freeware, must register | ? | ? | Official website |
| Molecular Operating Environment (MOE) | Yes | Yes | Yes | Part of an extensive collection of applications for sequence to structure, including homology modelling; 3D visualisation, etc. | Clustal, FASTA, PDB, EMBL, GCG, GCG_MSF, GenBank, PHYLIP, PIR, raw_seq | Proprietary | ? | ? | Official website |
| MSAReveal.org | No | No | No | Optional coloring. Touching AA shows 3-letter code and sequence number. Touching consensus shows AA frequencies in that column. Counts and percentages of aromatics, charged, gaps. | FASTA | Free, Creative Commons Attribution NonCommercial Share-alike | ? | ? | Official website |
| Multiseq (VMD plugin) | No, but can display and align 3D structures | ClustalW, MAFFT, Stamp (Structural) | Percent identity, Clustal, MAFFT, Structural | Scripting via Tcl, mapping from sequence to 3D structure | FASTA, PDB, ALN, PHYLIP, NEXUS | Proprietary, freeware, but VMD is free for noncommercial use only | ? | ? | Official website |
| MView | No | No | No | Stacked alignments from blast and fasta suites, various MSA format conversions, HTML markup, consensus patterns | BLAST search, FASTA search, Clustal, HSSP, FASTA, PIR, MSF | Free, GPL | No | Cross-platform - Mac OS, Linux, Windows | Official website |
| PFAAT | No, but can display 3D structures | ClustalW | NJ | Manual annotation, conservation scores | Nexus, MSF, Clustal, FASTA, PFAAT | Proprietary, freeware | ? | ? | Official website |
| Ralee (emacs plugin for RNA al. editing) | ? | RNA structure | ? | ? | Stockholm | Free, GPL | ? | ? | Official website |
| S2S RNA editor | 2D structure | Rnalign | No | Base-base interactions, 2D-3D viewer | FASTA, RnaML | Proprietary, freeware | ? | ? | Official website |
| Seaview | No | local MUSCLE-ClustalW | Parsimony, distance methods, PhyML | Dot-plot, vim-like editing keys | Nexus, MSF, Clustal, FASTA, PHYLIP, MASE | Proprietary, freeware | ? | ? | Official website |
| Seqotron | No | MUSCLE, MAFFT | UPGMA, NJ, ML (Physher) | Manual alignment, tree visualisation | Nexus, Clustal, FASTA, PHYLIP, MEGA, Stockholm, NBRF/PIR, GDE flat | Free, GPL | ? | Mac OS X | Official website, publication |
| Sequilab | Yes | Yes | No | Link alignment results to analysis tools (Primer design, Gel mobility and Maps, Plasmapper, siRNA design, Epitope prediction), Save research logs, Create custom toolbars | Accession number, GI number, PDB ID, FASTA, drag-drop from external URL from within the user interface | Proprietary, freeware | ? | ? | Official website |
| SeqPup | No | ? | ? | ? | ? | Proprietary, freeware | ? | ? | Official website |
| Sequlator | No | Pairwise alignment | No | easy alignment editing | MSF | Proprietary, freeware | ? | ? | Official website |
| SnipViz | No | No | No (but can display them) | Pure JavaScript and HTML; suitable to integrate in websites | FASTA, Newick | Free, Apache 2.0 | Yes | Browsers | Official website, publication |
| Strap | Jnet, NNPREDICT, Coiled coil, 16 different TM-helix | 15 different methods | NJ | Dot-plot, structure-neighbors, 3D-superposition, Blast-search, Mutation-SNP analysis, Sequence features, BioJava-interface | MSF, Stockholm, ClustalW, Nexus, FASTA, PDB, EMBL, GenBank, HSSP, Pfam | Free, GPL | ? | ? | Official website |
| Tablet | No | No | No | High-performance graphical viewer for viewing next generation sequence assemblies and alignments. | ACE, AFG, MAQ, SOAP2, SAM, BAM, FASTA, FASTQ, GFF3 | Free, BSD 2-clause | ? | ? | Official website |
| UGENE | Yes | MUSCLE, Kalign, ClustalW, ClustalO, ClustalX, MAFFT, T-Coffee, Smith–Waterman algorithm | Yes | Many | FASTA, FASTQ, GenBank, EMBL, ABIF, SCF, ClustalW, Stockholm, Newick, PDB, MSF, GFF | Free, GPL | ? | ? | Official website |
| VISSA sequence-structure viewer | DSSP secondary structure | ClustalX | No | Mapping from sequence to 3D structure | Clustal, FASTA | Proprietary, freeware | ? | ? | Official website |
| DNApy | No | MUSCLE | No | Editing of GenBank files, plasmid drawing, ABI chromatograms | FASTA, FASTQ, GenBank | Free, GPL 3 | ? | ? | Official website |
| Alignment Annotator | Yes | By sequence or mixed sequence and structure | Includes Archaeopteryx | DAS and user defined annotations. Scriptable. Export to HTML, Word, Jalview. | Many | Free, GPL | Yes | iOS, Android, MS-Mobile, Browsers | Official website |

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field that develops methods and software tools for understanding biological data, in particular when the data sets are large and complex. As an interdisciplinary field of science, bioinformatics combines biology, chemistry, physics, computer science, information engineering, mathematics and statistics to analyze and interpret the biological data. Bioinformatics has been used for in silico analyses of biological queries using computational and statistical techniques.

Sequence alignment: Process in bioinformatics that identifies equivalent sites within molecular sequences

In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences, such as calculating the distance cost between strings in a natural language or in financial data.

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of a wide range of analytical methods to understand its features, function, structure, or evolution. Methodologies used include sequence alignment, searches against biological databases, and others.

In bioinformatics, BLAST is an algorithm and program for comparing primary biological sequence information, such as the amino-acid sequences of proteins or the nucleotides of DNA or RNA sequences. A BLAST search enables a researcher to compare a subject protein or nucleotide sequence with a library or database of sequences, and identify database sequences that resemble the query sequence above a certain threshold. For example, following the discovery of a previously unknown gene in the mouse, a scientist will typically perform a BLAST search of the human genome to see if humans carry a similar gene; BLAST will identify sequences in the human genome that resemble the mouse gene based on similarity of sequence.

Structural bioinformatics: Bioinformatics subfield

Structural bioinformatics is the branch of bioinformatics that is related to the analysis and prediction of the three-dimensional structure of biological macromolecules such as proteins, RNA, and DNA. It deals with generalizations about macromolecular 3D structures such as comparisons of overall folds and local motifs, principles of molecular folding, evolution, binding interactions, and structure/function relationships, working both from experimentally solved structures and from computational models. The term structural has the same meaning as in structural biology, and structural bioinformatics can be seen as a part of computational structural biology. The main objective of structural bioinformatics is the creation of new methods of analysing and manipulating biological macromolecular data in order to solve problems in biology and generate new knowledge.

BioJava is an open-source software project dedicated to providing Java tools to process biological data. BioJava is a set of library functions written in the programming language Java for manipulating sequences, protein structures, file parsers, Common Object Request Broker Architecture (CORBA) interoperability, Distributed Annotation System (DAS), access to AceDB, dynamic programming, and simple statistical routines. BioJava supports a wide range of data, from DNA and protein sequences to 3D protein structures. The BioJava libraries are useful for automating many routine bioinformatics tasks, such as parsing a Protein Data Bank (PDB) file or interacting with Jmol. This application programming interface (API) provides various file parsers, data models and algorithms to facilitate working with the standard data formats and enables rapid application development and analysis.

In molecular biology and bioinformatics, the consensus sequence is the calculated sequence of most frequent residues, either nucleotide or amino acid, found at each position in a sequence alignment. It represents the results of multiple sequence alignments in which related sequences are compared to each other and similar sequence motifs are calculated. Such information is important when considering sequence-dependent enzymes such as RNA polymerase.
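As a rough illustration of "most frequent residue at each position", the following Python sketch computes a simple majority consensus from aligned rows. The function name `consensus`, the choice to ignore gap characters when counting, and the alphabetical tie-break are assumptions for the example; real tools often use thresholds or ambiguity codes instead.

```python
from collections import Counter


def consensus(rows, gap="-"):
    """Return the majority residue for each column of an alignment.

    Assumptions of this sketch: gaps are not counted as residues, a column
    made only of gaps yields a gap, and ties go to the alphabetically first
    residue so the result is deterministic.
    """
    if not rows or len({len(row) for row in rows}) != 1:
        raise ValueError("rows must be non-empty and of equal length")

    result = []
    for column in zip(*rows):
        counts = Counter(c for c in column if c != gap)
        if not counts:
            result.append(gap)
            continue
        best = max(counts.values())
        result.append(min(c for c, n in counts.items() if n == best))
    return "".join(result)


if __name__ == "__main__":
    # Toy promoter-like sequences, invented for the example.
    print(consensus(["TATAAT", "TATGAT", "TACAAT", "-ATAAT"]))
```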

RasMol: Software for the visualisation of macromolecules

RasMol is a computer program written for molecular graphics visualization intended and used mainly to depict and explore biological macromolecule structures, such as those found in the Protein Data Bank. It was originally developed by Roger Sayle in the early 1990s.

Multiple sequence alignment: Alignment of more than two molecular sequences

Multiple sequence alignment (MSA) may refer to the process or the result of sequence alignment of three or more biological sequences, generally protein, DNA, or RNA. In many cases, the input set of query sequences is assumed to have an evolutionary relationship by which they share a linkage and are descended from a common ancestor. From the resulting MSA, sequence homology can be inferred and phylogenetic analysis can be conducted to assess the sequences' shared evolutionary origins. Visual depictions of the alignment illustrate mutation events such as point mutations that appear as differing characters in a single alignment column, and insertion or deletion mutations that appear as hyphens in one or more of the sequences in the alignment. Multiple sequence alignment is often used to assess sequence conservation of protein domains, tertiary and secondary structures, and even individual amino acids or nucleotides.
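The way point mutations and insertions or deletions show up in alignment columns can be sketched in a few lines of Python. The function name `classify_columns` and the three labels are hypothetical, chosen only for this illustration; a column containing any hyphen is labelled an indel even if its remaining residues also differ.

```python
def classify_columns(rows, gap="-"):
    """Label each alignment column as 'conserved', 'substitution' or 'indel'."""
    if not rows or len({len(row) for row in rows}) != 1:
        raise ValueError("rows must be non-empty and of equal length")

    labels = []
    for column in zip(*rows):
        residues = {c for c in column if c != gap}
        if gap in column:
            # At least one sequence has a hyphen here: insertion or deletion.
            labels.append("indel")
        elif len(residues) == 1:
            labels.append("conserved")
        else:
            # Differing characters in one column: candidate point mutation.
            labels.append("substitution")
    return labels


if __name__ == "__main__":
    # Toy alignment, invented for the example.
    print(classify_columns(["ACG-T", "ACGAT", "ATGAT"]))
```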

InterPro is a database of protein families, protein domains and functional sites in which identifiable features found in known proteins can be applied to new protein sequences in order to functionally characterise them.

UCSF Chimera

UCSF Chimera is an extensible program for interactive visualization and analysis of molecular structures and related data, including density maps, supramolecular assemblies, sequence alignments, docking results, trajectories, and conformational ensembles. High-quality images and movies can be created. Chimera includes complete documentation and can be downloaded free of charge for noncommercial use.

Dot plot (bioinformatics)

In bioinformatics, a dot plot is a graphical method for comparing two biological sequences and identifying regions of close similarity after sequence alignment. It is a type of recurrence plot.
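A minimal text-mode dot plot can be sketched in Python as follows. This is an illustration under simple assumptions rather than the method of any particular tool: a dot is drawn wherever a window of `window` characters matches exactly in both sequences, and the names `dot_plot` and `render` are hypothetical.

```python
def dot_plot(seq_x, seq_y, window=1):
    """Build a boolean matrix; cell [i][j] is True when the window starting
    at seq_y[i] exactly equals the window starting at seq_x[j]."""
    if window < 1:
        raise ValueError("window must be at least 1")
    n_rows = len(seq_y) - window + 1
    n_cols = len(seq_x) - window + 1
    return [
        [seq_y[i:i + window] == seq_x[j:j + window] for j in range(n_cols)]
        for i in range(n_rows)
    ]


def render(matrix):
    """Draw the matrix as text, '*' for a match and '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in matrix
    )


if __name__ == "__main__":
    # Comparing a sequence with itself gives an unbroken main diagonal.
    print(render(dot_plot("ACGT", "ACGT")))
```

Regions of similarity appear as diagonal runs of dots; a larger window suppresses isolated chance matches at the cost of missing short or imperfect matches.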

UTOPIA (bioinformatics tools)

UTOPIA is a suite of free tools for visualising and analysing bioinformatics data. Based on an ontology-driven data model, it contains applications for viewing and aligning protein sequences, rendering complex molecular structures in 3D, and for finding and using resources such as web services and data objects. There are two major components, the protein analysis suite and UTOPIA documents.

UGENE

UGENE is computer software for bioinformatics. It works on personal computer operating systems such as Windows, macOS, or Linux. It is released as free and open-source software, under a GNU General Public License (GPL) version 2.

HMMER: Software package for sequence analysis

HMMER is a free and commonly used software package for sequence analysis written by Sean Eddy. Its general usage is to identify homologous protein or nucleotide sequences, and to perform sequence alignments. It detects homology by comparing a profile-HMM to either a single sequence or a database of sequences. Sequences that score significantly better to the profile-HMM compared to a null model are considered to be homologous to the sequences that were used to construct the profile-HMM. Profile-HMMs are constructed from a multiple sequence alignment in the HMMER package using the hmmbuild program. The profile-HMM implementation used in the HMMER software was based on the work of Krogh and colleagues. HMMER is a console utility ported to every major operating system, including different versions of Linux, Windows, and Mac OS.

Biology data visualization is a branch of bioinformatics concerned with the application of computer graphics, scientific visualization, and information visualization to different areas of the life sciences. This includes visualization of sequences, genomes, alignments, phylogenies, macromolecular structures, systems biology, microscopy, and magnetic resonance imaging data. Software tools used for visualizing biological data range from simple, standalone programs to complex, integrated systems.

The Virus Pathogen Database and Analysis Resource (ViPR) is an integrative and comprehensive publicly available database and analysis resource to search, analyze, visualize, save and share data for viral pathogens in the U.S. National Institute of Allergy and Infectious Diseases (NIAID) Category A-C Priority Pathogen lists for biodefense research, and other viral pathogens causing emerging/reemerging infectious diseases. ViPR is one of the five Bioinformatics Resource Centers (BRC) funded by NIAID, a component of the National Institutes of Health (NIH), which is an agency of the United States Department of Health and Human Services.

The Influenza Research Database (IRD) is an integrative and comprehensive publicly available database and analysis resource to search, analyze, visualize, save and share data for influenza virus research. IRD is one of the five Bioinformatics Resource Centers (BRC) funded by the National Institute of Allergy and Infectious Diseases (NIAID), a component of the National Institutes of Health (NIH), which is an agency of the United States Department of Health and Human Services.