Auxiliary metabolic genes (AMGs) are found in many bacteriophages but originated in bacterial cells. [1] AMGs modulate host cell metabolism during infection so that the phage can replicate more efficiently. For instance, bacteriophages that infect the abundant marine cyanobacteria Synechococcus and Prochlorococcus (cyanophages) carry AMGs that have been acquired from their immediate hosts as well as from more distantly related bacteria. [2] Cyanophage AMGs support a variety of functions, including photosynthesis, [3] carbon metabolism, [4] and nucleic acid synthesis and metabolism. [5] AMGs also have broader ecological impacts beyond their hosts, including an influence on biogeochemical cycling. [6]
Despite what the name suggests, AMGs serve diverse functions, including some in pathways not involved in metabolism. They are categorized into two classes based on their presence in the Kyoto Encyclopedia of Genes and Genomes (KEGG). [7] AMGs do not encompass metabolic genes involved in typical viral functions, such as nucleotide and protein metabolism, since those genes directly achieve viral reproduction rather than augmenting host function to enhance it indirectly. [8]
Class I AMGs encode components of cellular metabolic pathways and are found in KEGG. In particular, these genes are involved in photosynthesis and carbon metabolism. psbA, which encodes the photosystem II reaction center protein D1, is a nearly ubiquitous photosynthetic AMG in Synechococcus and Prochlorococcus cyanophages. [9] Genes for other reaction center and electron transport components are also found in many viruses infecting phototrophs. Phages encode nearly all genes involved in carbon metabolism. [7] In particular, viruses redirect host metabolism to increase dNTP biosynthesis for viral genome replication. [10] glgA can induce starvation by converting glucose-6-phosphate to glycogen, forcing the host to compensate by deriving ribulose-5-phosphate from glyceraldehyde-3-phosphate and fructose-6-phosphate. [7]
Class II AMGs encode peripheral functions absent from the KEGG metabolic pathways. These include genes typically involved in transport and assembly. [8] Major representatives of this class are involved in balancing TCA cycle intermediates. [7] Additionally, genes for acquiring biogenic elements other than carbon, such as phosphate uptake governed by pstS, are prevalent in this class. [11] Confidence in the identification of Class II AMGs is reduced because there is no database for reference. [12]
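The two-class scheme described above can be sketched as a small decision rule. This is only an illustration: the `GeneAnnotation` record, the `classify_amg` function, and the example pathway labels are hypothetical and are not taken from KEGG or from any annotation tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GeneAnnotation:
    """Hypothetical record for one annotated viral gene."""
    gene: str
    # Label of a KEGG metabolic pathway, or None when none is assigned.
    kegg_metabolic_pathway: Optional[str]
    # True for genes that directly serve viral reproduction
    # (for example nucleotide or protein metabolism); these are not AMGs.
    core_viral_function: bool = False


def classify_amg(annotation: GeneAnnotation) -> str:
    """Apply the Class I / Class II rule described in the text."""
    if annotation.core_viral_function:
        return "not an AMG"
    if annotation.kegg_metabolic_pathway is not None:
        return "Class I"
    return "Class II"


# Illustrative inputs; the pathway strings are placeholders.
examples = [
    GeneAnnotation("psbA", "photosynthesis"),
    GeneAnnotation("pstS", None),
    GeneAnnotation("hypothetical_replication_gene", None, core_viral_function=True),
]
results = {a.gene: classify_amg(a) for a in examples}
```

In practice the decision depends on how a gene is annotated against KEGG, which is why the lack of a reference database makes Class II calls less certain.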
The retention of AMGs in viral genomes is governed by natural selection and has been made highly selective through the co-evolution of viruses with their hosts. [13] As such, AMGs that improve a virus's ability to infect a host and reproduce will be more abundant. AMG abundance is largely dictated by the lifestyle of the virus, the environmental conditions surrounding it, and host characteristics. [6]
Lytic and lysogenic viruses have different lifestyles, which affects which AMGs they acquire. Lytic viruses tend to use AMGs to repurpose host cell metabolism and steal nutrients at high cell density. Therefore, AMGs related to metabolism and transport are found more abundantly in lytic viruses. [14] Lytic viruses also encompass a more diverse set of AMGs than lysogenic viruses, in part due to their larger host range and higher infection frequency. Temperate viruses, on the other hand, may employ AMGs to improve host fitness and virulence because they often persist longer in the cell as a prophage. [15] Gene density in these viruses is higher than in their lytic counterparts. Higher rates of horizontal gene transfer (HGT) in lysogenic viruses allow for more AMG transfer but also lower overall gene diversity. [6]
Photosynthetic capacity has also been correlated with AMG diversity. Aphotic viral communities possess greater AMG diversity than those in the photic zone. [16]
AMGs for pathways that utilize nutrients scarce in the local environment are generally more abundant in viruses. In marine environments, AMGs can confer fitness advantages on both hosts and viruses under conditions that are relatively nutrient-limited compared to sediment and under the strong ultraviolet stress of the water. [6] In sunlit versus dark ocean waters, AMGs in distinct pathways are unequally distributed to reprogram host energy production and viral replication based on available nutrients. [17] In sedimentary environments, carbon and sulfur metabolism AMGs are typically more prevalent, helping to outcompete other organisms for the abundant resources. [18]
A virus's host range determines which hosts it can acquire AMGs from. Additionally, the abundance of a host around a virus affects how likely the virus is to acquire genes from that host. Virus populations increasingly adopt lytic lifestyles as bacterial production increases. [14] The strong evolutionary connection between viruses and their hosts makes AMG acquisition mirror the host's own adaptation to its environment over time. [6]
Synechococcus and Prochlorococcus are the most abundant picocyanobacteria, accounting for up to 50% of primary production in the marine environment. [19] As such, many AMGs characterized have been discovered in phages of these host systems.
DRAM-v [20] is the standard for AMG annotation of metagenome-assembled genomes (MAGs) identified as viruses. [21] DRAM-v searches the following databases for AMGs that match the input MAGs: Pfam, KEGG, UniProt, CAZy, MEROPS, VOGDB, and NCBI Viral RefSeq. [20] KEGG can then be referenced to classify annotated AMGs through VIBRANT. [22]
Since AMGs originate in hosts, distinguishing host genes from viral genes is critical for their study. This is not easily achieved, as cultivating virus-host systems in a laboratory setting is challenging, if possible at all. [8] Additionally, cellular sequences cannot be completely filtered out before entry into bioinformatic pipelines, because cellular gene transfer agents and membrane vesicles cannot be distinguished from viruses at this step of analysis owing to their many shared properties. [23] [24] The extent to which they have contaminated existing viral databases is unknown. [8] Some genes, such as cyanophage photosynthesis genes, differ between their host and viral versions, which eases computational distinction. The most definitive method developed to determine gene origin has been the identification of taxonomically informative genes colocalized on assembled contigs. ViromeQC [25] can report contamination for the dataset overall, and DRAM-v assigns a confidence score for the AMG being on a viral MAG. [20] Viral identification is most commonly performed with VIBRANT, [22] VirSorter2, [26] DeepVirFinder, [27] and CheckV. [28]
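The colocalization idea can be illustrated with a toy heuristic. This is a simplified sketch, not the scoring used by DRAM-v or any other tool: the label names, the `flanked_by_viral_genes` function, and the default window size are all assumptions made for the example. A contig is represented as an ordered list of per-gene labels, and a candidate AMG is treated as viral-encoded only if viral hallmark genes lie on both sides and no host marker gene is nearby.

```python
from typing import List


def flanked_by_viral_genes(labels: List[str], index: int, window: int = 3) -> bool:
    """Return True if the gene at `index` has a "viral" gene within
    `window` genes on each side and no "host" gene in either window.

    `labels` holds one label per gene in contig order; the labels
    "viral", "host", "candidate" and "other" are hypothetical.
    """
    left = labels[max(0, index - window):index]
    right = labels[index + 1:index + 1 + window]
    if "host" in left or "host" in right:
        return False
    return "viral" in left and "viral" in right


# Candidate flanked by viral hallmark genes on both sides.
viral_context = ["viral", "other", "candidate", "viral", "other"]
# Candidate next to a host marker gene: possible cellular contamination.
host_context = ["host", "other", "candidate", "viral"]
# Candidate at a contig edge: only one flank is available.
edge_context = ["candidate", "viral", "viral"]

checks = [
    flanked_by_viral_genes(viral_context, 2),
    flanked_by_viral_genes(host_context, 2),
    flanked_by_viral_genes(edge_context, 0),
]
```

Requiring both flanks is deliberately conservative: a candidate at the end of a contig is rejected because there is no evidence for what lies beyond the contig edge.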
AMGs are not randomly distributed throughout genomes. Research is ongoing to determine which genes most commonly surround specific AMGs. [29] Hyperplastic regions, including the region between genes g15-g18, have been classified as locales where multiple AMGs have been inserted. [30] Possible AMG contexts can be divided into locally collinear blocks (LCBs), which are homologous regions shared by multiple viruses without rearrangements. [31] Individual AMGs have been found in as few as one and as many as 14 LCBs. Those found in more diverse contexts have also appeared at variable locations within the LCB. [29]
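Counting how many distinct LCBs each AMG occurs in is a simple tally once every AMG occurrence has been assigned to a block. The sketch below assumes that assignment has already been made by a genome alignment step; the `lcb_context_counts` function, the AMG names, and the LCB identifiers are hypothetical placeholders, not data from the cited studies.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def lcb_context_counts(observations: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count the distinct LCBs in which each AMG was observed.

    `observations` yields (amg_name, lcb_id) pairs, one per occurrence
    of an AMG in a viral genome.
    """
    contexts: Dict[str, Set[str]] = defaultdict(set)
    for amg, lcb in observations:
        contexts[amg].add(lcb)
    return {amg: len(lcbs) for amg, lcbs in contexts.items()}


# Placeholder observations; a repeated pair counts only once.
observations = [
    ("amg_a", "LCB1"),
    ("amg_a", "LCB1"),
    ("amg_a", "LCB2"),
    ("amg_b", "LCB3"),
]
counts = lcb_context_counts(observations)
```

An AMG with a count of one always appears in the same genomic context, while a higher count indicates insertion into several distinct contexts.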
Horizontal gene transfer (HGT) from host to virus allows AMGs to be acquired. Gene transfer from host eukaryotes to viruses occurs about twice as frequently as virus-to-host gene transfer due to a higher number of viral recipients than donors. The vast majority of gene transfer occurs in double-stranded DNA viruses since they have large and flexible genomes, a history of co-evolution with eukaryotes, and wide host breadth. Additionally, unicellular hosts more commonly transfer genes. [13]
AMGs may influence gene expression by modulating the activity of transcription factors, which control the rate at which specific genes are transcribed into mRNA, thereby impacting the levels of corresponding proteins involved in metabolic pathways.
Certain AMGs encode proteins that directly interact with enzymes involved in metabolic reactions. This interaction can either enhance or inhibit enzyme activity, leading to changes in the rate of metabolic flux through specific pathways.
AMGs may be integrated into cellular signaling pathways, influencing the transmission of signals related to energy status, nutrient availability, or stress. By modulating these signaling pathways, AMGs can indirectly regulate metabolic processes.
AMGs have a large impact on biogeochemical cycles in multiple environments through nutrient degradation, mineralization, transportation, assimilation, and transformation. [6] By enhancing the metabolic capabilities of their hosts, bacteriophages contribute to the recycling of organic matter, influencing the availability of nutrients for other organisms in the ecosystem. Lytic viruses in particular have been shown to increase ammonium oxidation, nitric oxide reduction, nitrification, and denitrification to balance nutrient levels in nitrogen-polluted environments. [6] Nutrient-enriched wetlands contain AMGs related to sulfur transport and metabolism. [32] AMG modification of host processes is a means, in addition to the viral shunt, by which viruses can directly impact biogeochemical cycles. [33]
The ability of AMGs to modulate the metabolic capacities of their hosts can influence the abundance and distribution of specific microbial taxa. [6] In turn, this shapes the overall composition of microbial communities, with potential cascading effects on higher trophic levels. [citation needed]
AMGs play a crucial role in microbial adaptation to environmental changes. In extreme environments, AMGs can encode components of alternate energy pathways, such as subunits of dissimilatory sulfite reductase. [34] The ability of viruses to confer new metabolic traits on their hosts enhances the resilience of microbial communities facing shifts in temperature, nutrient availability, or other environmental stressors. [6] AMGs can also serve as a genetic pool that shapes the evolution of their hosts. [35]
A bacteriophage, also known informally as a phage, is a virus that infects and replicates within bacteria and archaea. The term was derived from bacteria and the Ancient Greek word φαγεῖν, meaning 'to devour'. Bacteriophages are composed of proteins that encapsulate a DNA or RNA genome, and may have structures that are either simple or elaborate. Their genomes may encode as few as four genes and as many as hundreds of genes. Phages replicate within the bacterium following the injection of their genome into its cytoplasm.
Enterobacteria phage λ is a bacterial virus, or bacteriophage, that infects the bacterial species Escherichia coli. It was discovered by Esther Lederberg in 1950. The wild type of this virus has a temperate life cycle that allows it to either reside within the genome of its host through lysogeny or enter into a lytic phase, during which it kills and lyses the cell to produce offspring. Lambda strains, mutated at specific sites, are unable to lysogenize cells; instead, they grow and enter the lytic cycle after superinfecting an already lysogenized cell.
A prophage is a bacteriophage genome that is integrated into the circular bacterial chromosome or exists as an extrachromosomal plasmid within the bacterial cell. Integration of prophages into the bacterial host is the characteristic step of the lysogenic cycle of temperate phages. Prophages remain latent in the genome through multiple cell divisions until activation by an external factor, such as UV light, leading to production of new phage particles that will lyse the cell and spread. As ubiquitous mobile genetic elements, prophages play important roles in bacterial genetics and evolution, such as in the acquisition of virulence factors.
Viral replication is the formation of biological viruses during the infection process in the target host cells. Viruses must first get into the cell before viral replication can occur. Through the generation of abundant copies of its genome and packaging these copies, the virus continues infecting new hosts. Replication between viruses is greatly varied and depends on the type of genes involved in them. Most DNA viruses assemble in the nucleus while most RNA viruses develop solely in cytoplasm.
In virology, temperate refers to the ability of some bacteriophages to display a lysogenic life cycle. Many temperate phages can integrate their genomes into their host bacterium's chromosome, together becoming a lysogen as the phage genome becomes a prophage. A temperate phage is also able to undergo a productive, typically lytic life cycle, where the prophage is expressed, replicates the phage genome, and produces phage progeny, which then leave the bacterium. With phage the term virulent is often used as an antonym to temperate, but more strictly a virulent phage is one that has lost its ability to display lysogeny through mutation rather than a phage lineage with no genetic potential to ever display lysogeny.
Lysogeny, or the lysogenic cycle, is one of two cycles of viral reproduction. Lysogeny is characterized by integration of the bacteriophage nucleic acid into the host bacterium's genome or formation of a circular replicon in the bacterial cytoplasm. In this condition the bacterium continues to live and reproduce normally, while the bacteriophage lies in a dormant state in the host cell. The genetic material of the bacteriophage, called a prophage, can be transmitted to daughter cells at each subsequent cell division, and later events can release it, causing proliferation of new phages via the lytic cycle.
Cyanophages are viruses that infect cyanobacteria, also known as Cyanophyta or blue-green algae. Cyanobacteria are a phylum of bacteria that obtain their energy through the process of photosynthesis. Although cyanobacteria metabolize photoautotrophically like eukaryotic plants, they have prokaryotic cell structure. Cyanophages can be found in both freshwater and marine environments. Marine and freshwater cyanophages have icosahedral heads, which contain double-stranded DNA, attached to a tail by connector proteins. The sizes of the head and tail vary among species of cyanophages. Cyanophages infect a wide range of cyanobacteria and are key regulators of the cyanobacterial populations in aquatic environments, and may aid in the prevention of cyanobacterial blooms in freshwater and marine ecosystems. These blooms can pose a danger to humans and other animals, particularly in eutrophic freshwater lakes. Infection by these viruses is highly prevalent in cells belonging to Synechococcus spp. in marine environments, where up to 5% of marine cyanobacterial cells have been reported to contain mature phage particles.
The mobilome is the entire set of mobile genetic elements in a genome. Mobilomes are found in eukaryotes, prokaryotes, and viruses. The compositions of mobilomes differ among lineages of life, with transposable elements being the major mobile elements in eukaryotes, and plasmids and prophages being the major types in prokaryotes. Virophages contribute to the viral mobilome.
Photosynthetic reaction centre proteins are main protein components of photosynthetic reaction centres (RCs) of bacteria and plants. They are transmembrane proteins embedded in the chloroplast thylakoid or bacterial cell membrane.
The manA RNA motif refers to a conserved RNA structure that was identified by bioinformatics. Instances of the manA RNA motif were detected in bacteria in the genus Photobacterium and phages that infect certain kinds of cyanobacteria. However, most predicted manA RNA sequences are derived from DNA collected from uncultivated marine bacteria. Almost all manA RNAs are positioned such that they might be in the 5' untranslated regions of protein-coding genes, and therefore it was hypothesized that manA RNAs function as cis-regulatory elements. Given the relative complexity of their secondary structure, and their hypothesized cis-regulatory role, they might be riboswitches.
The wcaG RNA motif is an RNA structure conserved in some bacteria that was detected by bioinformatics. wcaG RNAs are found in certain phages that infect cyanobacteria. Most known wcaG RNAs were found in sequences of DNA extracted from uncultivated marine bacteria. wcaG RNAs might function as cis-regulatory elements, in view of their consistent location in the possible 5' untranslated regions of genes. It was suggested the wcaG RNAs might further function as riboswitches.
Poribacteria are a candidate phylum of bacteria originally discovered in the microbiome of marine sponges (Porifera). Poribacteria are Gram-negative, primarily aerobic mixotrophs with the ability for oxidative phosphorylation, glycolysis, and autotrophic carbon fixation via the Wood–Ljungdahl pathway. Poribacterial heterotrophy is characterised by an enriched set of glycoside hydrolases, uronic acid degradation, as well as several specific sulfatases. This heterotrophic repertoire of poribacteria was suggested to be involved in the degradation of the extracellular sponge host matrix.
Virome refers to the assemblage of viruses that is often investigated and described by metagenomic sequencing of viral nucleic acids that are found associated with a particular ecosystem, organism or holobiont. The word is frequently used to describe environmental viral shotgun metagenomes. Viruses, including bacteriophages, are found in all environments, and studies of the virome have provided insights into nutrient cycling, development of immunity, and a major source of genes through lysogenic conversion. Also, the human virome has been characterized in nine organs of 31 Finnish individuals using qPCR and NGS methodologies.
The "Kill the Winner" hypothesis (KtW) is an ecological model of population growth involving prokaryotes, viruses and protozoans that links trophic interactions to biogeochemistry. The model is related to the Lotka–Volterra equations. It assumes that prokaryotes adopt one of two strategies when competing for limited resources: priority is either given to population growth ("winners") or survival ("defenders"). As "winners" become more abundant and active in their environment, their contact with host-specific viruses increases, making them more susceptible to viral infection and lysis. Thus, viruses moderate the population size of "winners" and allow multiple species to coexist. Current understanding of KtW primarily stems from studies of lytic viruses and their host populations.
Genomic streamlining is a theory in evolutionary biology and microbial ecology that suggests that there is a reproductive benefit to prokaryotes having a smaller genome size with less non-coding DNA and fewer non-essential genes. There is a lot of variation in prokaryotic genome size, with the smallest free-living cell's genome being roughly ten times smaller than the largest prokaryote. Two of the free-living bacterial taxa with the smallest genomes are Prochlorococcus and Pelagibacter ubique, both highly abundant marine bacteria commonly found in oligotrophic regions. Similar reduced genomes have been found in uncultured marine bacteria, suggesting that genomic streamlining is a common feature of bacterioplankton. This theory is typically used with reference to free-living organisms in oligotrophic environments.
The viral shunt is a mechanism that prevents marine microbial particulate organic matter (POM) from migrating up trophic levels by recycling them into dissolved organic matter (DOM), which can be readily taken up by microorganisms. The DOM recycled by the viral shunt pathway is comparable to the amount generated by the other main sources of marine DOM.
The hydrothermal vent microbial community includes all unicellular organisms that live and reproduce in a chemically distinct area around hydrothermal vents. These include organisms in the microbial mat, free-floating cells, or bacteria in an endosymbiotic relationship with animals. Chemolithoautotrophic bacteria derive nutrients and energy from the geological activity at hydrothermal vents to fix carbon into organic forms. Viruses are also a part of the hydrothermal vent microbial community and their influence on the microbial ecology in these ecosystems is a burgeoning field of research.
Arbitrium is a viral peptide produced by bacteriophages to communicate with each other and decide host cell fate. It is six amino acids (aa) long, and so is also referred to as a hexapeptide. It is produced when a phage infects a bacterial host and signals to other phages that the host has been infected.
Marine viruses are defined by their habitat as viruses that are found in marine environments, that is, in the saltwater of seas or oceans or the brackish water of coastal estuaries. Viruses are small infectious agents that can only replicate inside the living cells of a host organism, because they need the replication machinery of the host to do so. They can infect all types of life forms, from animals and plants to microorganisms, including bacteria and archaea.
Algal viruses are the viruses infecting algae, which are photosynthetic single-celled eukaryotes. As of 2020, there were 61 viruses known to infect algae. Algae are integral components of aquatic food webs and drive nutrient cycling, so the viruses infecting algal populations also impact the organisms and nutrient cycling systems that depend on them. Thus, these viruses can have significant, worldwide economic and ecological effects. Their genomes varied between 4.4 and 560 kilobase pairs (kbp) in length and consisted of double-stranded deoxyribonucleic acid (dsDNA), double-stranded ribonucleic acid (dsRNA), single-stranded deoxyribonucleic acid (ssDNA), or single-stranded ribonucleic acid (ssRNA). The viruses ranged between 20 and 210 nm in diameter. Since the discovery of the first algae-infecting virus in 1979, several different techniques have been used to find new viruses infecting algae, and it seems that there are many algae-infecting viruses left to be discovered.