The virome is the assemblage of viruses [1] [2] associated with a particular ecosystem, organism or holobiont; it is often investigated and described by metagenomic sequencing of viral nucleic acids. [3] The word is frequently used to describe environmental viral shotgun metagenomes. Viruses, including bacteriophages, are found in all environments, and studies of the virome have provided insights into nutrient cycling, [4] [5] the development of immunity, [6] and the role of viruses as a major source of genes through lysogenic conversion. [7] The human virome has also been characterized in nine organs (colon, liver, lung, heart, brain, kidney, skin, blood, hair) of 31 Finnish individuals using qPCR and NGS methodologies. [8]
The first comprehensive studies of viromes used shotgun community sequencing, [9] which is frequently referred to as metagenomics. In the 2000s, the Rohwer lab sequenced viromes from seawater, [9] [10] marine sediments, [11] adult human stool, [12] infant human stool, [13] soil, [14] and blood. [15] This group also sequenced the first RNA virome with collaborators from the Genome Institute of Singapore. [16] From these early works, it was concluded that most genomic diversity is contained in the global virome and that most of this diversity remains uncharacterized. [17] This view was supported by individual genome sequencing projects, particularly those on mycobacterium phages. [18]
By the late 2010s, advances in sequencing technologies had allowed deep probing of viromes. [19] The virome of the human gut in particular has gained increased attention as a result of these advances. [20] [21]
To study the virome, virus-like particles are separated from cellular components, usually using a combination of filtration, density centrifugation, and enzymatic treatments that remove free nucleic acids. [22] The nucleic acids are then sequenced and analyzed using metagenomic methods. Alternatively, recent computational methods discover viruses directly from metagenomic assembled sequences. [23]
The Global Ocean Viromes (GOV) is a dataset consisting of deep sequencing from over 150 samples collected across the world's oceans in two survey periods by an international team. [24]
Viruses are the most abundant biological entities on Earth, but challenges in detecting, isolating, and classifying unknown viruses have prevented exhaustive surveys of the global virome. [25] To assess the global distribution, phylogenetic diversity, and host specificity of viruses, over 5 Tb of metagenomic sequence data from 3,042 geographically diverse samples were analyzed. [25]
In August 2016, the identification of over 125,000 partial DNA viral genomes, including the largest phage yet identified, increased the number of known viral genes 16-fold. [25] A suite of computational methods was used to identify putative host–virus connections. [25] First, the host information available for isolate viruses was projected onto viral groups, resulting in host assignments for 2.4% of viral groups. [25]
A second approach used the CRISPR–Cas prokaryotic immune system, which holds a "library" of genome fragments from phages (proto-spacers) that have previously infected the host. [25] Spacers from isolate microbial genomes with matches to metagenomic viral contigs (mVCs) were identified for 4.4% of the viral groups and 1.7% of singletons. [25] The hypothesis that viral transfer RNA (tRNA) genes originate from their host was also explored. [25]
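The spacer-matching idea described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method used in the cited study: the function names (`find_spacer_hits`, `link_hosts`), the one-mismatch tolerance, and the toy sequences are all assumptions made for the example, and a real analysis would use an alignment tool rather than a sliding window.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_spacer_hits(spacer: str, contig: str, max_mismatches: int = 1):
    """Yield (position, strand) for each contig window that matches the
    spacer, or its reverse complement, within max_mismatches."""
    n = len(spacer)
    for strand, query in (("+", spacer), ("-", reverse_complement(spacer))):
        for i in range(len(contig) - n + 1):
            if mismatches(query, contig[i:i + n]) <= max_mismatches:
                yield i, strand


def link_hosts(spacers_by_host: dict, contigs: dict, max_mismatches: int = 1) -> set:
    """Return the set of (host, contig_id) pairs supported by at least one hit."""
    links = set()
    for host, spacers in spacers_by_host.items():
        for contig_id, contig in contigs.items():
            for spacer in spacers:
                # next(..., None) stops at the first hit instead of scanning on.
                hit = next(find_spacer_hits(spacer, contig, max_mismatches), None)
                if hit is not None:
                    links.add((host, contig_id))
                    break
    return links


# Toy data, invented for illustration (not real sequences).
spacers_by_host = {
    "host_A": ["ACGTAGGC"],
    "host_B": ["GGGGCCCC"],
}
contigs = {
    "mVC_1": "TTTACGTAGGCTTT",  # contains host_A's spacer exactly
    "mVC_2": "AAAAAAAAAAAAAA",  # matches neither spacer
}
print(sorted(link_hosts(spacers_by_host, contigs)))
```

Both strands are searched because a spacer may correspond to either strand of the viral genome; the mismatch tolerance is a tunable assumption that trades sensitivity against false links.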
Viral tRNAs identified in 7.6% of the mVCs were matched to isolate genomes from a single species or genus. [25] The specificity of tRNA-based host–virus assignment was confirmed by CRISPR–Cas spacer matches, which showed 94% agreement at the genus level. These approaches identified 9,992 putative host–virus associations, enabling host assignment to 7.7% of mVCs. [25] The majority of these connections were previously unknown, and they include hosts from 16 prokaryotic phyla for which no viruses had previously been identified. [25]
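The genus-level agreement figure quoted above is a simple ratio: among contigs that receive a host assignment from both methods, the fraction for which the two assigned genera coincide. A minimal sketch follows, assuming each method's output is a mapping from contig identifier to a single genus name; the function name `genus_agreement`, the data layout, and the toy assignments are hypothetical and are not taken from the cited study.

```python
def genus_agreement(trna_hosts: dict, crispr_hosts: dict) -> float:
    """Fraction of contigs assigned by both methods whose genera agree."""
    shared = trna_hosts.keys() & crispr_hosts.keys()
    if not shared:
        raise ValueError("no contig has both a tRNA and a CRISPR assignment")
    agree = sum(trna_hosts[c] == crispr_hosts[c] for c in shared)
    return agree / len(shared)


# Toy assignments, invented for illustration.
trna_hosts = {
    "mVC_1": "genus_P",
    "mVC_2": "genus_Q",
    "mVC_3": "genus_R",
    "mVC_4": "genus_S",  # tRNA assignment only
}
crispr_hosts = {
    "mVC_1": "genus_P",
    "mVC_2": "genus_Q",
    "mVC_3": "genus_T",  # disagrees with the tRNA-based call
    "mVC_5": "genus_U",  # CRISPR assignment only
}

# Three contigs are shared; two of them agree.
print(f"{genus_agreement(trna_hosts, crispr_hosts):.1%}")
```

Contigs assigned by only one method are excluded from the denominator, since there is nothing to compare them against.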
Many viruses specialize in infecting related hosts. [25] Viral generalists that infect hosts across taxonomic orders may also exist. [25] Most CRISPR spacer matches were from viral sequences to hosts within one species or genus. [25] Some mVCs, however, were linked to multiple hosts from higher taxa. A viral group composed of mVCs from human oral samples contained three distinct proto-spacers with nearly exact matches to spacers in Actinobacteria and Bacillota. [25]
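The distinction between specialists and generalists can be made concrete by asking at which taxonomic rank all hosts linked to a viral group still agree. The sketch below is illustrative only: it assumes each host lineage is given as a six-element tuple ordered from phylum to species, and the function names and placeholder taxon names are hypothetical.

```python
RANKS = ("phylum", "class", "order", "family", "genus", "species")


def lowest_shared_rank(lineages):
    """Return the most specific rank at which all host lineages agree,
    or None if they already differ at the phylum level."""
    shared = None
    for i, rank in enumerate(RANKS):
        if len({lineage[i] for lineage in lineages}) == 1:
            shared = rank
        else:
            break
    return shared


def spans_orders(lineages) -> bool:
    """True if the hosts differ at order level or above, i.e. the viral
    group links hosts across taxonomic orders."""
    shared = lowest_shared_rank(lineages)
    return shared is None or RANKS.index(shared) < RANKS.index("order")


# Placeholder lineages, invented for illustration (phylum ... species).
specialist_hosts = [
    ("phylum_1", "class_1", "order_1", "family_1", "genus_1", "species_1a"),
    ("phylum_1", "class_1", "order_1", "family_1", "genus_1", "species_1b"),
]
generalist_hosts = [
    ("phylum_1", "class_1", "order_1", "family_1", "genus_1", "species_1a"),
    ("phylum_2", "class_2", "order_2", "family_2", "genus_2", "species_2a"),
]

print(lowest_shared_rank(specialist_hosts), spans_orders(specialist_hosts))
print(lowest_shared_rank(generalist_hosts), spans_orders(generalist_hosts))
```

In the first case the hosts share everything down to the genus, which fits the specialist pattern; in the second the hosts belong to different phyla, as in the oral viral group linked to two phyla described above.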
In January 2017, the IMG/VR system, [26] the largest interactive public virus database, contained 265,000 metagenomic viral sequences and isolate viruses. This number grew to over 760,000 by November 2018 (IMG/VR v.2.0). [27] The IMG/VR system serves as a starting point for the sequence analysis of viral fragments derived from metagenomic samples.
The human virome encompasses the diverse viral communities residing in the body. Prior advances in high-throughput sequencing (HTS) revealed insights into their diversity, evolutionary dynamics, and genome integrations. However, due to shallow sequencing in the past, the genetic composition and diversity of tissue-resident viruses remained poorly characterized, hindering understanding of their roles in pathogenesis and viral evolution. In 2024, a study of the virome examined persistent viruses in multiple organs from individuals who died of non-viral causes, revealing that viral sequences were highly conserved within each person, indicating persistence from single dominant strains. Increased viral diversity in two cases suggested that reactivation may influence variability. The study also identified selective pressures from the host and unexpected viral genome integrations, including MCPyV truncations and novel links between herpesvirus 6B and mitochondrial DNA, even in non-cancerous individuals, offering new insights into tissue-resident viruses and their potential health impacts. [28]
Viroids are small single-stranded, circular RNAs that are infectious pathogens. Unlike viruses, they have no protein coating. All known viroids are inhabitants of angiosperms, and most cause diseases whose economic importance to humans varies widely. A recent metatranscriptomics study suggests that the host diversity of viroids and other viroid-like elements is broader than previously thought and is not limited to plants, extending even to prokaryotes.
Genomics is an interdisciplinary field of molecular biology focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structural configuration. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of all of an organism's genes, their interrelations and influence on the organism. Genes may direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues as well as control chemical reactions and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through uses of high throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems such as the brain.
Metagenomics is the study of genetic material recovered directly from environmental or clinical samples by a method called sequencing. The broad field may also be referred to as environmental genomics, ecogenomics, community genomics or microbiomics.
The Integrated Microbial Genomes (IMG) system is a genome browsing and annotation platform developed by the U.S. Department of Energy (DOE) Joint Genome Institute (JGI). IMG contains all the draft and complete microbial genomes sequenced by the DOE-JGI, integrated with other publicly available genomes. IMG provides users with a set of tools for comparative analysis of microbial genomes along three dimensions: genes, genomes and functions. Users can select items along these dimensions and transfer them to the comparative analysis carts based upon a variety of criteria. IMG also includes a genome annotation pipeline that integrates information from several tools, including KEGG, Pfam, InterPro, and the Gene Ontology, among others. Users can also type or upload their own gene annotations, and the IMG system will allow them to generate GenBank or EMBL format files containing these annotations.
Marnaviridae is a family of positive-stranded RNA viruses in the order Picornavirales that infect various photosynthetic marine protists. Members of the family have non-enveloped, icosahedral capsids. Replication occurs in the cytoplasm and causes lysis of the host cell. The first species of this family that was isolated is Heterosigma akashiwo RNA virus (HaRNAV) in the genus Marnavirus, which infects the toxic bloom-forming Raphidophyte alga, Heterosigma akashiwo. As of 2021, there are twenty species across seven genera in this family, as well as many other related virus sequences discovered through metagenomic sequencing that are currently unclassified.
The Human Microbiome Project (HMP) was a United States National Institutes of Health (NIH) research initiative to improve understanding of the microbiota involved in human health and disease. Launched in 2007, the first phase (HMP1) focused on identifying and characterizing human microbiota. The second phase, known as the Integrative Human Microbiome Project (iHMP), launched in 2014 with the aim of generating resources to characterize the microbiome and elucidating the roles of microbes in health and disease states. The program received $170 million in funding from the NIH Common Fund from 2007 to 2016.
A virus is a submicroscopic infectious agent that replicates only inside the living cells of an organism. Viruses infect all life forms, from animals and plants to microorganisms, including bacteria and archaea. Viruses are found in almost every ecosystem on Earth and are the most numerous type of biological entity. Since Dmitri Ivanovsky's 1892 article describing a non-bacterial pathogen infecting tobacco plants and the discovery of the tobacco mosaic virus by Martinus Beijerinck in 1898, more than 11,000 of the millions of virus species have been described in detail. The study of viruses is known as virology, a subspeciality of microbiology.
Forest Rohwer is an American microbial ecologist and Professor of Biology at San Diego State University. His particular interests include coral reef microbial ecology and viruses as both evolutionary agents and opportunistic pathogens in various environments.
Biological dark matter is an informal term for unclassified or poorly understood genetic material. This genetic material may refer to genetic material produced by unclassified microorganisms. By extension, biological dark matter may also refer to the un-isolated microorganisms whose existence can only be inferred from the genetic material that they produce. Some of the genetic material may not fall under the three existing domains of life: Bacteria, Archaea and Eukaryota; thus, it has been suggested that a possible fourth domain of life may yet be discovered, although other explanations are also probable. Alternatively, the genetic material may refer to non-coding DNA and non-coding RNA produced by known organisms.
The human virome is the total collection of viruses in and on the human body. Viruses in the human body may infect both human cells and other microbes such as bacteria. Some viruses cause disease, while others may be asymptomatic. Certain viruses are also integrated into the human genome as proviruses or endogenous viral elements.
Viral metagenomics uses metagenomic technologies to detect viral genomic material from diverse environmental and clinical samples. Viruses are the most abundant biological entity and are extremely diverse; however, only a small fraction of viruses have been sequenced and an even smaller fraction have been isolated and cultured. Sequencing viruses can be challenging because viruses lack a universally conserved marker gene, so gene-based approaches are limited. Metagenomics can be used to study and analyze unculturable viruses and has been an important tool in understanding viral diversity and abundance and in the discovery of novel viruses. For example, metagenomic methods have been used to describe viruses associated with cancerous tumors and viruses in terrestrial ecosystems.
Auxiliary metabolic genes (AMGs) are found in many bacteriophages but originated in bacterial cells. AMGs modulate host cell metabolism during infection so that the phage can replicate more efficiently. For instance, bacteriophages that infect the abundant marine cyanobacteria Synechococcus and Prochlorococcus (cyanophages) carry AMGs that have been acquired from their immediate host as well as more distantly-related bacteria. Cyanophage AMGs support a variety of functions including photosynthesis, carbon metabolism, nucleic acid synthesis and metabolism. AMGs also have broader ecological impacts beyond their host including their influence on biogeochemical cycling.
Nikos Kyrpides is a Greek-American bioscientist who has worked on the origins of life, information processing, bioinformatics, microbiology, metagenomics and microbiome data science. He is a senior staff scientist at the Berkeley National Laboratory, head of the Prokaryote Super Program and leads the Microbiome Data Science program at the US Department of Energy Joint Genome Institute.
Mya Breitbart is an American biologist and professor of biological oceanography at the University of South Florida's College of Marine Science. She is best known for her contributions to the field of viral metagenomics. Popular Science recognized her because of her approach of not trying to sequence individual viruses or organisms but to sequence everything in a given ecosystem.
Riboviria is a realm of viruses that includes all viruses that use a homologous RNA-dependent polymerase for replication. It includes RNA viruses that encode an RNA-dependent RNA polymerase, as well as reverse-transcribing viruses that encode an RNA-dependent DNA polymerase. RNA-dependent RNA polymerase (RdRp), also called RNA replicase, produces RNA from RNA. RNA-dependent DNA polymerase (RdDp), also called reverse transcriptase (RT), produces DNA from RNA. These enzymes are essential for replicating the viral genome and transcribing viral genes into messenger RNA (mRNA) for translation of viral proteins.
Clinical metagenomic next-generation sequencing (mNGS) is the comprehensive analysis of microbial and host genetic material in clinical samples from patients by next-generation sequencing. It uses the techniques of metagenomics to identify and characterize the genomes of bacteria, fungi, parasites, and viruses directly from clinical specimens, without the need for prior knowledge of a specific pathogen. The capacity to detect all the potential pathogens in a sample makes metagenomic next-generation sequencing a potent tool in the diagnosis of infectious disease, especially when other, more directed assays, such as PCR, fail. Its limitations include clinical utility, laboratory validity, sensitivity and specificity, cost, and regulatory considerations.
Redondoviruses are a family of human-associated DNA viruses. Their name derives from the inferred circular structure of the viral genome. Redondoviruses have been identified in DNA sequence-based surveys of samples from humans, primarily samples from the oral cavity and upper airway.
Marine viruses are defined by their habitat as viruses that are found in marine environments, that is, in the saltwater of seas or oceans or the brackish water of coastal estuaries. Viruses are small infectious agents that can only replicate inside the living cells of a host organism, because they need the replication machinery of the host to do so. They can infect all types of life forms, from animals and plants to microorganisms, including bacteria and archaea.
Orthornavirae is a kingdom of viruses that have genomes made of ribonucleic acid (RNA), including genes which encode an RNA-dependent RNA polymerase (RdRp). The RdRp is used to transcribe the viral RNA genome into messenger RNA (mRNA) and to replicate the genome. Viruses in this kingdom share a number of characteristics which promote rapid evolution, including high rates of genetic mutation, recombination, and reassortment.
The term virosphere was coined to refer to all those places in which viruses are found or which are affected by viruses. More recently, however, virosphere has also been used to refer to the pool of viruses that occurs in all hosts and all environments, as well as to viruses associated with specific types of host, genome or ecological niche.