AMPHORA

Developer(s) Martin Wu, Jonathan Eisen et al.
Stable release 2.0 / 2013
Written in Perl
Operating system Linux
Available in English
Type Bioinformatics
License GNU General Public License
Website http://wolbachia.biology.virginia.edu/WuLab/Software.html

AMPHORA ("AutoMated Phylogenomic infeRence Application") is an open-source bioinformatics workflow. [1] [2] Its second version, AMPHORA2, uses 31 bacterial and 104 archaeal phylogenetic marker genes to infer phylogenetic information from metagenomic datasets. Because most of the marker genes are single-copy genes, AMPHORA2 is suitable for accurately inferring the taxonomic composition of bacterial and archaeal communities from metagenomic shotgun sequencing data.
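The role of single-copy marker genes in estimating community composition can be illustrated with a minimal sketch. It is not AMPHORA2's own code or output format: the function name, the input layout of (marker, taxon) pairs, and the marker and taxon labels are all assumptions made for illustration. The idea is that if each genome carries roughly one copy of a marker, then the share of reads assigned to a taxon for that marker approximates the share of that taxon's genomes in the sample, and averaging over markers reduces per-marker noise.

```python
from collections import Counter, defaultdict


def estimate_composition(assignments):
    """Average per-marker taxon proportions over all markers.

    `assignments` is an iterable of (marker_gene, taxon) pairs, one per read
    that was assigned to a taxon through a marker gene. This input layout is
    hypothetical and does not mirror any real AMPHORA2 output file.
    """
    # Count reads per taxon separately for each marker gene.
    per_marker = defaultdict(Counter)
    for marker, taxon in assignments:
        per_marker[marker][taxon] += 1

    if not per_marker:
        return {}

    # For a single-copy marker, the fraction of reads per taxon approximates
    # the fraction of genomes belonging to that taxon.
    totals = defaultdict(float)
    for counts in per_marker.values():
        n_reads = sum(counts.values())
        for taxon, count in counts.items():
            totals[taxon] += count / n_reads

    # Average the per-marker fractions across all observed markers.
    n_markers = len(per_marker)
    return {taxon: total / n_markers for taxon, total in totals.items()}


# Placeholder marker and taxon names, chosen only for the example.
example = [
    ("marker_A", "taxon_X"), ("marker_A", "taxon_X"),
    ("marker_A", "taxon_X"), ("marker_A", "taxon_Y"),
    ("marker_B", "taxon_X"), ("marker_B", "taxon_Y"),
]
composition = estimate_composition(example)
```

In this toy input, taxon_X receives 3 of 4 reads for marker_A and 1 of 2 reads for marker_B, so its estimated share is the mean of 0.75 and 0.5. A gene present in several copies per genome would inflate the counts of the taxa that carry more copies, which is why single-copy markers are preferred for this kind of estimate.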

AMPHORA was first used in 2008 for a re-analysis of the Sargasso Sea metagenomic data. [3] Since then, a growing number of metagenomic datasets have accumulated in the Sequence Read Archive that can be analyzed with AMPHORA2.

AmphoraNet

AmphoraNet [4] is a web server implementation of the AMPHORA2 workflow, developed by the PIT Bioinformatics Group. AmphoraNet runs AMPHORA2 with its default options.

AmphoraVizu

AmphoraVizu [5] is a web server developed by the PIT Bioinformatics Group that visualizes the outputs generated by AMPHORA2 or by its web server implementation, AmphoraNet.

References

  1. Wu, Martin; Eisen, J.A. (2008). "A simple, fast, and accurate method of phylogenomic inference". Genome Biol. 9 (10): R151. doi:10.1186/gb-2008-9-10-r151. PMC 2760878. PMID 18851752.
  2. Wu, Martin; Scott, A.J. (2012). "Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2". Bioinformatics. 28 (7): 1033–1034. doi:10.1093/bioinformatics/bts079. PMID 22332237.
  3. Venter, J. Craig; et al. (2004). "Environmental Genome Shotgun Sequencing of the Sargasso Sea". Science. 304 (5667): 66–74. Bibcode:2004Sci...304...66V. CiteSeerX 10.1.1.124.1840. doi:10.1126/science.1093857. PMID 15001713. S2CID 1454587.
  4. Kerepesi, Csaba; et al. (2014). "The webserver implementation of the AMPHORA2 metagenomic workflow suite". Gene. 533 (2): 538–540. doi:10.1016/j.gene.2013.10.015. PMID 24144838.
  5. Kerepesi, Csaba; et al. (2014). "Visual Analysis of the Quantitative Composition of Metagenomic Communities: the AmphoraVizu Webserver". Microbial Ecology. 69 (3): 695–697. doi:10.1007/s00248-014-0502-6. PMID 25296554. S2CID 14207754.
