Content | |
---|---|
Description | MEGARes is an antimicrobial resistance database designed for high-throughput sequencing data and maintained by collaborating research centers |
Data types captured | Antimicrobial resistance genes and phenotypes |
Organisms | Bacteria |
Contact | |
Research center | Texas A&M University, University of Minnesota, University of Florida, Colorado State University |
Primary citation | PMID 27899569 |
Access | |
Website | http://meglab.org |
Download URL | Download |
Miscellaneous | |
Bookmarkable entities | yes |
MEGARes is a hand-curated antibiotic resistance database that incorporates previously published resistance sequences for antimicrobial drugs and also includes published sequences for metal and biocide resistance determinants. In MEGARes 3.0, the nodes of the acyclic hierarchical ontology include four antimicrobial compound types, 59 classes, 223 mechanisms of resistance, and 1,448 gene groups that classify the 8,733 gene accessions. [1] [2] [3] The database works in conjunction with the AMR++ bioinformatics pipeline (version 3.0) to classify resistome sequences directly from FASTA files.[citation needed]
The database focuses on the analysis of large-scale, ecological sequence datasets, with an annotation structure that allows for the development of high-throughput acyclical classifiers and hierarchical statistical analysis of big data. For AMR genes, MEGARes annotation consists of three hierarchical levels: drug class, mechanism, and group. The MEGARes content was compiled from all published sequences included in various other databases: ResFinder, ARG-ANNOT, the Comprehensive Antibiotic Resistance Database (CARD), and the National Center for Biotechnology Information (NCBI) Lahey Clinic beta-lactamase archive.[citation needed]
MEGARes allows users to analyze antimicrobial resistance at the population level, similar to a microbiome analysis, from a FASTA sequence. Users can also access AMR++, a bioinformatics pipeline for resistome analysis of metagenomic datasets that can be integrated with the MEGARes database.[citation needed]
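The hierarchical annotation described above can be illustrated with a short Python sketch that sums read counts at a chosen level of the hierarchy. The pipe-delimited header layout (`accession|type|class|mechanism|group`), the function names, and every accession and label in the example are assumptions made for illustration; they are not taken from the MEGARes files or from AMR++.

```python
from collections import Counter

# Assumed field order of a pipe-delimited annotation header (hypothetical layout).
LEVELS = ("accession", "type", "class", "mechanism", "group")


def parse_header(header):
    """Split an annotation header into a dict keyed by hierarchy level."""
    fields = header.lstrip(">").strip().split("|")
    if len(fields) < len(LEVELS):
        raise ValueError(f"expected at least {len(LEVELS)} fields: {header!r}")
    return dict(zip(LEVELS, fields))


def aggregate(hits, level):
    """Sum per-accession read counts at one level of the hierarchy."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    counts = Counter()
    for header, n_reads in hits:
        counts[parse_header(header)[level]] += n_reads
    return counts


# Hypothetical alignment results: (reference header, number of reads assigned).
example_hits = [
    (">ACC_1|Drugs|Class_X|Mechanism_A|GROUP1", 10),
    (">ACC_2|Drugs|Class_X|Mechanism_B|GROUP2", 5),
    (">ACC_3|Metals|Class_Y|Mechanism_C|GROUP3", 2),
]

by_class = aggregate(example_hits, "class")
by_type = aggregate(example_hits, "type")
```

Because each accession has exactly one parent at every level, counts can be rolled up from accession to group, mechanism, class, or type without double counting, which is what makes hierarchical statistical comparison straightforward.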
In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences, such as calculating the distance cost between strings in a natural language or in financial data.
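As a minimal sketch of the idea, the following Python function computes the score of an optimal global alignment between two sequences using the classic Needleman-Wunsch dynamic programming recurrence. The scoring values (match +1, mismatch -1, gap -1) and the function name are arbitrary choices for illustration, not parameters used by any particular tool.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty and
    keeps only two rows of the score matrix in memory.
    """
    # Row 0: aligning an empty prefix of a against prefixes of b costs only gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a aligned against nothing
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]


score_identical = global_alignment_score("GATTACA", "GATTACA")
score_one_gap = global_alignment_score("ACGT", "AGT")
```

Recovering the alignment itself, rather than only its score, would require keeping the full matrix and tracing back from the final cell.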
In computational biology, gene prediction or gene finding refers to the process of identifying the regions of genomic DNA that encode genes. This includes protein-coding genes as well as RNA genes, but may also include prediction of other functional elements such as regulatory regions. Gene finding is one of the first and most important steps in understanding the genome of a species once it has been sequenced.
Metagenomics is the study of genetic material recovered directly from environmental or clinical samples by a method called sequencing. The broad field may also be referred to as environmental genomics, ecogenomics, community genomics or microbiomics.
KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances. KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.
The Integrated Microbial Genomes system is a genome browsing and annotation platform developed by the U.S. Department of Energy (DOE)-Joint Genome Institute. IMG contains all the draft and complete microbial genomes sequenced by the DOE-JGI integrated with other publicly available genomes. IMG provides users a set of tools for comparative analysis of microbial genomes along three dimensions: genes, genomes and functions. Users can select and transfer them in the comparative analysis carts based upon a variety of criteria. IMG also includes a genome annotation pipeline that integrates information from several tools, including KEGG, Pfam, InterPro, and the Gene Ontology, among others. Users can also type or upload their own gene annotations and the IMG system will allow them to generate Genbank or EMBL format files containing these annotations.
Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. It is an annotated, open access database originally developed at the Wellcome Trust Sanger Institute in collaboration with Janelia Farm, and currently hosted at the European Bioinformatics Institute. Rfam is designed to be similar to the Pfam database for annotating protein families.
16S ribosomal RNA is the RNA component of the 30S subunit of a prokaryotic ribosome. It binds to the Shine-Dalgarno sequence and provides most of the SSU structure.
MicrobesOnline is a publicly and freely accessible website that hosts multiple comparative genomic tools for comparing microbial species at the genomic, transcriptomic and functional levels. MicrobesOnline was developed by the Virtual Institute for Microbial Stress and Survival, which is based at the Lawrence Berkeley National Laboratory in Berkeley, California. The site was launched in 2005, with regular updates until 2011.
In metagenomics, binning is the process of grouping reads or contigs and assigning them to individual genomes. Binning methods can be based on compositional features, on alignment (similarity), or on both.
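One common compositional feature is the k-mer frequency profile of a contig. The Python sketch below computes such a profile and a simple distance between two profiles; the choice of k = 4, the Euclidean distance, and the function names are assumptions made for illustration rather than the method of any specific binning tool.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Windows containing characters other than A, C, G, T are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def profile_distance(p, q):
    """Euclidean distance between two equally long frequency profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


# Toy contigs: the first two share an AT-rich composition, the third is GC-rich.
contig_a = "ATATTATAATATTAATATAT"
contig_b = "TATAATATTATATAATTATA"
contig_c = "GCGCCGGCGCGGCCGCGCGC"

d_ab = profile_distance(kmer_profile(contig_a), kmer_profile(contig_b))
d_ac = profile_distance(kmer_profile(contig_a), kmer_profile(contig_c))
```

A composition-based binner would then cluster contigs whose profiles lie close together, here grouping `contig_a` with `contig_b` rather than with `contig_c`.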
Viral metagenomics uses metagenomic technologies to detect viral genomic material from diverse environmental and clinical samples. Viruses are the most abundant biological entity and are extremely diverse; however, only a small fraction of viruses have been sequenced and only an even smaller fraction have been isolated and cultured. Sequencing viruses can be challenging because viruses lack a universally conserved marker gene so gene-based approaches are limited. Metagenomics can be used to study and analyze unculturable viruses and has been an important tool in understanding viral diversity and abundance and in the discovery of novel viruses. For example, metagenomics methods have been used to describe viruses associated with cancerous tumors and in terrestrial ecosystems.
Machine learning in bioinformatics is the application of machine learning algorithms to bioinformatics, including genomics, proteomics, microarrays, systems biology, evolution, and text mining.
The Comprehensive Antibiotic Resistance Database (CARD) is a biological database that collects and organizes reference information on antimicrobial resistance genes, proteins and phenotypes. The database covers all types of drug classes and resistance mechanisms and structures its data based on an ontology. The CARD database was one of the first resources that covered antimicrobial resistance genes. The resource is updated monthly and provides tools to allow users to find potential antibiotic resistance genes in newly-sequenced genomes.
FARME (Functional Antibiotic Resistance Metagenomic Element) is a database that compiles publicly available DNA elements and predicted proteins that confer antibiotic resistance, together with regulatory elements and mobile genetic elements. It is the first database to focus on functional metagenomics. This focus allows the database to cover the 99% of bacteria that cannot be cultured and to examine the relationship between environmental antibiotic resistance sequences and antibiotic genes derived from cultured isolates. The data were derived from 20 metagenomics projects in GenBank, which is also the source of the protein sequence predictions and annotations.
VFDB (Virulence Factor Database) is a database that gives scientists quick access to virulence factors in bacterial pathogens. It can be navigated and browsed by genus or by keyword, and a BLAST tool is provided for searching against known virulence factors. VFDB contains a collection of 16 important bacterial pathogens. Perl scripts were used to extract the positions and sequences of virulence factors from GenBank, and Clusters of Orthologous Groups (COG) was used to update incomplete annotations. Additional information was obtained from NCBI. VFDB was built on Linux operating systems running on DELL PowerEdge 1600SC servers.
In molecular biology, MvirDB is a publicly available database that stores information on toxins, virulence factors and antibiotic resistance genes. Its sources of DNA and protein information include Tox-Prot, SCORPION, the PRINTS Virulence Factors, VFDB, TVFac, Islander, ARGO and VIDA. The database provides a BLAST tool that allows users to query their sequence against all DNA and protein sequences in MvirDB. Information on virulence factors can be obtained with the provided browser tool, which returns results as a readable table ordered by ascending E-value; each result is hyperlinked to its related page. MvirDB is implemented in an Oracle 10g relational database.
BacMet is an antimicrobial resistance database. It tracks bacterial genes that give resistance to antibacterial biocides and metals.
The SARG database (Structured Antibiotic Resistance Gene database) is a collection of antimicrobial resistance genes. The database has a three-level hierarchical structure: 1) Type: antibiotic type; 2) Subtype: genotype; 3) Sequence: reference sequence. The SARG database supports quick surveys of antimicrobial resistance genes in environmental samples. It was initially integrated from ARDB and the Comprehensive Antibiotic Resistance Database, followed by hand curation that included removing non-ARG sequences, redundant sequences and SNP sequences. Other sources include the NCBI nr database and published papers.
Clinical metagenomic next-generation sequencing (mNGS) is the comprehensive analysis of microbial and host genetic material in clinical samples from patients by next-generation sequencing. It uses the techniques of metagenomics to identify and characterize the genomes of bacteria, fungi, parasites, and viruses directly from clinical specimens, without the need for prior knowledge of a specific pathogen. The capacity to detect all the potential pathogens in a sample makes metagenomic next-generation sequencing a potent tool in the diagnosis of infectious disease, especially when other, more directed assays such as PCR fail. Its limitations include clinical utility, laboratory validity, sense and sensitivity, cost and regulatory considerations.
The AMRFinderPlus tool from the National Center for Biotechnology Information (NCBI) is a bioinformatic tool that allows users to identify antimicrobial resistance determinants, stress response, and virulence genes in bacterial genomes. This tool's development began in 2018 and is still underway. The National Institutes of Health funds the development of the software and the databases it uses.