AMPHORA

Developer(s) Martin Wu, Jonathan Eisen et al.
Stable release 2.0 / 2013
Written in Perl
Operating system Linux
Available in English
Type Bioinformatics
License GNU General Public License
Website http://wolbachia.biology.virginia.edu/WuLab/Software.html

AMPHORA ("AutoMated Phylogenomic infeRence Application") is an open-source bioinformatics workflow. [1] [2] AMPHORA2 uses 31 bacterial and 104 archaeal phylogenetic marker genes to infer phylogenetic information from metagenomic datasets. Because most of the marker genes are single-copy genes, AMPHORA2 is suitable for accurately inferring the taxonomic composition of bacterial and archaeal communities from metagenomic shotgun sequencing data.
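The reasoning behind single-copy markers can be illustrated with a short sketch: if each genome carries one copy of a marker gene, the share of that marker's hits assigned to a taxon approximates the share of genomes from that taxon. The Python below is a hypothetical illustration only, not AMPHORA2's actual code or output format. The function name `estimate_composition`, the `(marker, taxon)` input pairs, and the example hits are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def estimate_composition(assignments):
    """Estimate relative taxon abundance from marker-gene hits.

    `assignments` is an iterable of (marker_gene, taxon) pairs, one per
    classified sequence. This input format is hypothetical and is not
    the format AMPHORA2 writes.
    """
    # Count taxon hits separately for every marker gene.
    per_marker = defaultdict(Counter)
    for marker, taxon in assignments:
        per_marker[marker][taxon] += 1

    # For a single-copy marker, the fraction of hits per taxon
    # approximates the fraction of genomes from that taxon.
    summed = defaultdict(float)
    for counts in per_marker.values():
        total = sum(counts.values())
        for taxon, count in counts.items():
            summed[taxon] += count / total

    # Average the per-marker fractions so each marker gets equal weight.
    n_markers = len(per_marker)
    return {taxon: value / n_markers for taxon, value in summed.items()}


# Made-up hits for two marker genes, for illustration only.
example_hits = [
    ("rpoB", "Proteobacteria"),
    ("rpoB", "Proteobacteria"),
    ("rpoB", "Firmicutes"),
    ("rplA", "Proteobacteria"),
    ("rplA", "Firmicutes"),
]
composition = estimate_composition(example_hits)
```

Averaging per-marker fractions, rather than pooling all hits, is one possible design choice: it keeps a marker that happens to recruit many sequences from dominating the estimate.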

AMPHORA was first used in 2008 to re-analyze the Sargasso Sea metagenomic data. [3] Since then, a growing number of metagenomic datasets have been deposited in the Sequence Read Archive and are available for analysis with AMPHORA2.

AmphoraNet

AmphoraNet [4] is a web server implementation of the AMPHORA2 workflow, developed by the PIT Bioinformatics Group. AmphoraNet runs AMPHORA2 with its default options.

AmphoraVizu

AmphoraVizu [5] is a web server developed by the PIT Bioinformatics Group that can visualize the outputs generated by AMPHORA2 or by its web server implementation, AmphoraNet.

References

  1. Wu, Martin; Eisen, J.A. (2008). "A simple, fast, and accurate method of phylogenomic inference". Genome Biol. 9 (10): R151. doi:10.1186/gb-2008-9-10-r151. PMC 2760878. PMID 18851752.
  2. Wu, Martin; Scott, A.J. (2012). "Phylogenomic analysis of bacterial and archaeal sequences with AMPHORA2". Bioinformatics. 28 (7): 1033–1034. doi:10.1093/bioinformatics/bts079. PMID 22332237.
  3. Venter, J. Craig; et al. (2004). "Environmental Genome Shotgun Sequencing of the Sargasso Sea". Science. 304 (5667): 66–74. Bibcode:2004Sci...304...66V. CiteSeerX 10.1.1.124.1840. doi:10.1126/science.1093857. PMID 15001713. S2CID 1454587.
  4. Kerepesi, Csaba; et al. (2014). "The webserver implementation of the AMPHORA2 metagenomic workflow suite". Gene. 533 (2): 538–540. doi:10.1016/j.gene.2013.10.015. PMID 24144838.
  5. Kerepesi, Csaba; et al. (2014). "Visual Analysis of the Quantitative Composition of Metagenomic Communities: the AmphoraVizu Webserver". Microbial Ecology. 69 (3): 695–697. doi:10.1007/s00248-014-0502-6. PMID 25296554. S2CID 14207754.
