Written in | C |
---|---|
Available in | English |
Type | Bioinformatics tool |
License | GNU General Public Licence |
Website | emboss |
EMBOSS is a free, open-source software analysis package developed for the needs of the molecular biology and bioinformatics user community. [1] The software automatically copes with data in a variety of formats and even allows transparent retrieval of sequence data from the web. Because extensive libraries are provided with the package, it also serves as a platform on which other scientists can develop and release software in a true open-source spirit. EMBOSS also integrates a range of currently available packages and tools for sequence analysis into a seamless whole.
EMBOSS is an acronym for European Molecular Biology Open Software Suite. The European part of the name hints at the wider scope: the core EMBOSS groups collaborate with many other groups to develop the new applications that users need. This collaboration has involved EMBnet, the European Molecular Biology Network, from the beginning. EMBnet has many nodes worldwide, most of which are national bioinformatics services, and it has the programming expertise. The first workshop was held in September 1998, when 30 people from EMBnet went to Hinxton to learn about EMBOSS and to discuss the way forward. [2]
The EMBOSS package contains a variety of applications for sequence alignment, rapid database searching with sequence patterns, protein motif identification (including domain analysis), and much more.
The AJAX and NUCLEUS libraries are released under the GNU Library General Public Licence. EMBOSS applications are released under the GNU General Public Licence. [3]
Group | Description |
---|---|
Acd | ACD file utilities |
Alignment consensus | Merging sequences to make a consensus |
Alignment differences | Finding differences between sequences |
Alignment dot plots | Dot plot sequence comparisons |
Alignment global | Global sequence alignment |
Alignment local | Local sequence alignment |
Alignment multiple | Multiple sequence alignment |
Display | Publication-quality display |
Edit | Sequence editing |
Enzyme kinetics | Enzyme kinetics calculations |
Feature tables | Manipulation and display of sequence annotation |
HMM | Hidden Markov model analysis |
Information | Information and general help for users |
Menus | Menu interface(s) |
Nucleic 2d structure | Nucleic acid secondary structure |
Nucleic codon usage | Codon usage analysis |
Nucleic composition | Composition of nucleotide sequences |
Nucleic CpG islands | CpG island detection and analysis |
Nucleic gene finding | Predictions of genes and other genomic features |
Nucleic motifs | Nucleic acid motif searches |
Nucleic mutation | Nucleic acid sequence mutation |
Nucleic primers | Primer prediction |
Nucleic profiles | Nucleic acid profile generation and searching |
Nucleic repeats | Nucleic acid repeat detection |
Nucleic restriction | Restriction enzyme sites in nucleotide sequences |
Nucleic RNA folding | RNA folding methods and analysis |
Nucleic transcription | Transcription factors, promoters and terminator prediction |
Nucleic translation | Translation of nucleotide sequence to protein sequence |
Phylogeny consensus | Phylogenetic consensus methods |
Phylogeny continuous characters | Phylogenetic continuous character methods |
Phylogeny discrete characters | Phylogenetic discrete character methods |
Phylogeny distance matrix | Phylogenetic distance matrix methods |
Phylogeny gene frequencies | Phylogenetic gene frequency methods |
Phylogeny molecular sequence | Phylogenetic molecular sequence methods |
Phylogeny tree drawing | Phylogenetic tree drawing methods |
Protein 2d structure | Protein secondary structure |
Protein 3d structure | Protein tertiary structure |
Protein composition | Composition of protein sequences |
Protein motifs | Protein motif searches |
Protein mutation | Protein sequence mutation |
Protein profiles | Protein profile generation and searching |
Test | Testing tools, not for general use |
Utils database creation | Database installation |
Utils database indexing | Database indexing |
Utils misc | Utility tools |
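
To make one of the groups above concrete, the following is a minimal sketch of the kind of computation behind the "Alignment global" group: a Needleman-Wunsch global alignment score. It is not EMBOSS code and does not reproduce any EMBOSS application. The function name `global_alignment_score`, the linear gap penalty, and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration only.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the Needleman-Wunsch global alignment score of a and b.

    Uses a linear gap penalty; all scoring values are illustrative
    assumptions, not defaults taken from any EMBOSS program.
    """
    # prev[j] is the best score for aligning the processed prefix of a
    # against b[:j]; the first row aligns an empty prefix with gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # First column: a[:i] aligned against an empty prefix of b.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Or place a gap in b (up) or a gap in a (left).
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(global_alignment_score("ACGT", "AGT"))
```

Only two rows of the dynamic-programming table are kept, which is enough for the score; recovering the alignment itself would require keeping the full table or traceback pointers.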