EMBOSS

Written in: C
Available in: English
Type: Bioinformatics tool
License: GNU General Public Licence
Website: emboss.open-bio.org

EMBOSS is a free, open-source software analysis package developed for the needs of the molecular biology and bioinformatics user community. [1] The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Because extensive libraries are provided with the package, it also serves as a platform on which other scientists can develop and release software in true open-source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.
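As a rough illustration of what automatic format handling involves, the sketch below guesses a sequence file's format from its first non-blank line. It is not EMBOSS code: the function name `guess_format` is hypothetical, and only three common flat-file conventions are checked (a FASTA header starting with `>`, an EMBL `ID` line, and a GenBank `LOCUS` line).

```python
def guess_format(text):
    """Guess a sequence format from the first non-blank line (illustrative only)."""
    for line in text.splitlines():
        if not line.strip():
            continue  # skip leading blank lines
        if line.startswith(">"):
            return "fasta"      # FASTA records begin with a '>' header line
        if line.startswith("ID   "):
            return "embl"       # EMBL flat files begin with an 'ID' line
        if line.startswith("LOCUS"):
            return "genbank"    # GenBank flat files begin with a 'LOCUS' line
        return "unknown"        # first real line matched none of the above
    return "unknown"            # empty or all-blank input


print(guess_format(">seq1 example\nACGTACGT\n"))
```

A real implementation would have to recognise many more formats and cope with ambiguous input; this sketch only shows the general idea of inspecting the data rather than asking the user to declare a format.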


EMBOSS is an acronym for European Molecular Biology Open Software Suite. The European part of the name hints at the wider scope: the core EMBOSS groups collaborate with many other groups to develop the new applications that users need. This collaboration began at the outset with EMBnet, the European Molecular Biology Network. EMBnet has many nodes worldwide, most of which are national bioinformatics services, and it has the programming expertise. In September 1998 the first workshop was held, at which 30 people from EMBnet went to Hinxton to learn about EMBOSS and to discuss the way forward. [2]

The EMBOSS package contains a variety of applications for sequence alignment, rapid database searching with sequence patterns, protein motif identification (including domain analysis), and much more.
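To make the idea of global sequence alignment concrete, here is a minimal sketch of Needleman-Wunsch scoring by dynamic programming. It is an illustration only, not EMBOSS's implementation: the function name and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for simplicity, whereas production tools typically use substitution matrices and separate gap-opening and gap-extension penalties.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of a and b (linear gap penalty)."""
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    # Initially that prefix is empty, so b[:j] is aligned entirely against gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            diag = prev[j - 1] + pair   # align a[i-1] with b[j-1]
            up = prev[j] + gap          # a[i-1] aligned against a gap
            left = curr[j - 1] + gap    # b[j-1] aligned against a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))  # three matches and one gap
```

Only two rows of the scoring matrix are kept, which is enough to compute the score; recovering the alignment itself would require storing the full matrix and tracing back from the final cell.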

The AJAX and NUCLEUS libraries are released under the GNU Library General Public Licence. EMBOSS applications are released under the GNU General Public Licence. [3]

EMBOSS application groups

| Group | Description |
|---|---|
| Acd | Acd file utilities |
| Alignment consensus | Merging sequences to make a consensus |
| Alignment differences | Finding differences between sequences |
| Alignment dot plots | Dot plot sequence comparisons |
| Alignment global | Global sequence alignment |
| Alignment local | Local sequence alignment |
| Alignment multiple | Multiple sequence alignment |
| Display | Publication-quality display |
| Edit | Sequence editing |
| Enzyme kinetics | Enzyme kinetics calculations |
| Feature tables | Manipulation and display of sequence annotation |
| HMM | Hidden Markov model analysis |
| Information | Information and general help for users |
| Menus | Menu interface(s) |
| Nucleic 2d structure | Nucleic acid secondary structure |
| Nucleic codon usage | Codon usage analysis |
| Nucleic composition | Composition of nucleotide sequences |
| Nucleic CpG islands | CpG island detection and analysis |
| Nucleic gene finding | Predictions of genes and other genomic features |
| Nucleic motifs | Nucleic acid motif searches |
| Nucleic mutation | Nucleic acid sequence mutation |
| Nucleic primers | Primer prediction |
| Nucleic profiles | Nucleic acid profile generation and searching |
| Nucleic repeats | Nucleic acid repeat detection |
| Nucleic restriction | Restriction enzyme sites in nucleotide sequences |
| Nucleic RNA folding | RNA folding methods and analysis |
| Nucleic transcription | Transcription factors, promoters and terminator prediction |
| Nucleic translation | Translation of nucleotide sequence to protein sequence |
| Phylogeny consensus | Phylogenetic consensus methods |
| Phylogeny continuous characters | Phylogenetic continuous character methods |
| Phylogeny discrete characters | Phylogenetic discrete character methods |
| Phylogeny distance matrix | Phylogenetic distance matrix methods |
| Phylogeny gene frequencies | Phylogenetic gene frequency methods |
| Phylogeny molecular sequence | Phylogenetic molecular sequence methods |
| Phylogeny tree drawing | Phylogenetic tree drawing methods |
| Protein 2d structure | Protein secondary structure |
| Protein 3d structure | Protein tertiary structure |
| Protein composition | Composition of protein sequences |
| Protein motifs | Protein motif searches |
| Protein mutation | Protein sequence mutation |
| Protein profiles | Protein profile generation and searching |
| Test | Testing tools, not for general use |
| Utils database creation | Database installation |
| Utils database indexing | Database indexing |
| Utils misc | Utility tools |

References

  1. Rice P, Longden I, Bleasby A (2000). "EMBOSS: The European Molecular Biology Open Software Suite". Trends in Genetics. 16 (6): 276–277. doi:10.1016/S0168-9525(00)02024-2. PMID 10827456.
  2. Rice P, Bleasby A (1999). "EMBOSS: The European Molecular Biology Open Software Suite". Biochemist E-volution. 16 (6): 276–277. doi:10.1016/S0168-9525(00)02024-2. PMID 10827456.
  3. "1.1. Licence Information".