Microbial phylogenetics

Microbial phylogenetics is the study of how various groups of microorganisms are genetically related, which helps to trace their evolution. [1] [2] To study these relationships, biologists rely on comparative genomics, because physiology and comparative anatomy are not practical methods for microorganisms. [3]

History

1960s–1970s

Microbial phylogenetics emerged as a field of study in the 1960s, when scientists started to create genealogical trees based on differences in the amino acid sequences of proteins and the nucleotide sequences of genes, instead of relying on comparative anatomy and physiology. [4] [5]

One of the most important figures in the early stage of this field was Carl Woese, whose research focused on Bacteria and examined RNA instead of proteins. More specifically, he compared oligonucleotides from the small subunit ribosomal RNA (16S rRNA). Matching oligonucleotides in different bacteria could be compared to one another to determine how closely the organisms were related. In 1977, after collecting and comparing 16S rRNA fragments from almost 200 species of bacteria, Woese and his team concluded that Archaebacteria were not part of Bacteria but a completely independent group of organisms. [3] [6]
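The idea of scoring relatedness by shared oligonucleotides can be illustrated with a minimal Python sketch. It assumes a Dice-style similarity (twice the number of shared oligonucleotides divided by the total number in both catalogs) and a minimum fragment length of six; both choices, along with the function names and the fragment lists, are illustrative assumptions rather than details taken from the cited work.

```python
from collections import Counter


def oligo_catalog(fragments, min_len=6):
    """Return a multiset of the oligonucleotides that are at least min_len long."""
    return Counter(f.upper() for f in fragments if len(f) >= min_len)


def catalog_similarity(catalog_a, catalog_b):
    """Dice-style similarity: twice the shared oligos over the total oligo count."""
    shared = sum((catalog_a & catalog_b).values())  # multiset intersection
    total = sum(catalog_a.values()) + sum(catalog_b.values())
    return 2 * shared / total if total else 0.0


# Hypothetical fragment lists, invented for illustration only.
org_a = oligo_catalog(["AUCCUG", "CCUAAG", "UUACCG", "ACG"])  # "ACG" is too short
org_b = oligo_catalog(["AUCCUG", "CCUAAG", "GGAUCA"])
org_c = oligo_catalog(["GGAUCA", "UACUAG", "CAUUAG"])

print(catalog_similarity(org_a, org_b))  # two of three oligos shared
print(catalog_similarity(org_a, org_c))  # nothing shared
```

Under this scoring, organisms whose catalogs overlap heavily receive values near 1, and organisms with no oligonucleotides in common receive 0.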

1980s–1990s

In the 1980s microbial phylogenetics entered its golden age, as the techniques for sequencing RNA and DNA improved greatly. [7] [8] For example, comparing the nucleotide sequences of whole genes was facilitated by the development of DNA cloning, which made it possible to create many copies of a sequence from minute samples. The invention of the polymerase chain reaction (PCR) also had an enormous impact on microbial phylogenetics. [9] [10] These new techniques led to the formal proposal of the three domains of life: Bacteria, Archaea (Woese himself proposed this name to replace the older designation Archaebacteria), and Eukarya, arguably one of the key events in the history of taxonomy. [11]

One of the intrinsic problems of studying microorganisms was the dependence of such studies on pure cultures grown in a laboratory. Biologists tried to overcome this limitation by sequencing rRNA genes obtained from DNA isolated directly from the environment. [12] [13] This technique made it possible to fully appreciate that bacteria not only have the greatest diversity but also constitute the greatest biomass on Earth. [14]

In the late 1990s, the sequencing of genomes from various microorganisms began, and by 2005, 260 complete genomes had been sequenced: 33 from eukaryotes, 206 from eubacteria, and 21 from archaea. [15]

2000s

In the early 2000s, scientists started creating phylogenetic trees based not on rRNA but on other genes with different functions (for example, the gene for the enzyme RNA polymerase [16] ). The resulting genealogies differed greatly from the ones based on rRNA. These gene histories differed so much from one another that the only hypothesis that could explain the divergences was a major influence of horizontal gene transfer (HGT), a mechanism that allows a bacterium to acquire one or more genes from a completely unrelated organism. [17] HGT explains why similarities and differences in some genes have to be studied carefully before they are used as a measure of genealogical relationship among microorganisms. [18]

Studies aimed at understanding the prevalence of HGT suggested that the ease with which genes are transferred among bacteria makes it impossible to apply the "biological species concept" to them. [19] [20]

Phylogenetic representation

Since Darwin, every phylogeny for every organism has been represented in the form of a tree. Nonetheless, because of the great role that HGT plays for microbes, some evolutionary microbiologists have suggested abandoning this classical view in favor of a representation of genealogies that more closely resembles a web, also known as a network. However, the network representation has some issues of its own, such as the inability to establish precisely which organism was the donor in an HGT event and the difficulty of determining the correct path across organisms when multiple HGT events have occurred. Therefore, there is still no consensus among biologists on which representation is a better fit for the microbial world. [21]

Methods for microbial phylogenetic analysis

Most microbial taxa have never been cultivated or experimentally characterized. Taxonomy and phylogeny are therefore essential tools for organizing the diversity of life. Collecting gene sequences, aligning them on the basis of homology, and then using models of mutation to infer evolutionary history are common steps in estimating microbial phylogenies. [22] The small subunit (SSU) rRNA gene has revolutionized microbial classification since the 1970s and has since become the most sequenced gene. [23] Phylogenetic inferences depend on the genes chosen: for example, the 16S rRNA gene is commonly selected for Bacteria and Archaea, whereas the 18S rRNA gene is most commonly used for microbial eukaryotes. [24]
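As a rough illustration of the step that applies a model of mutation to aligned sequences, the following Python sketch computes the Jukes-Cantor distance, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites. The Jukes-Cantor model is chosen here only because it is the simplest substitution model; the source does not say which models are used in practice, and the function names and example sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, skipping gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance: corrects p for multiple substitutions at one site."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at or beyond saturation
    return -0.75 * math.log(1 - 4 * p / 3)


# Hypothetical aligned fragments, invented for illustration only.
print(jukes_cantor("ACGTACGTAC", "ACGTACGTAA"))
```

A matrix of such pairwise distances is the usual input to distance-based tree-building methods; the corrected distance is always at least as large as the raw proportion of differences.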

Phylogenetic comparative methods

Phylogenetic comparative methods (PCMs) are commonly used to compare multiple traits across organisms. PCMs are not yet common in microbiome studies; however, recent studies have successfully used them to identify genes associated with colonization of the human gut. [22] This was done by measuring the statistical association between whether a species harbors a gene and the probability that the species is present in the gut microbiome. These analyses showcase the combination of shotgun metagenomics with phylogenetically aware models. [25]

Ancestral state reconstruction

This method is commonly used to estimate the genetic and metabolic profiles of extant communities from a set of reference genomes; in microbiome studies it is commonly performed with PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States). [22] PICRUSt is a computational approach that predicts the functional composition of a metagenome from marker gene data and a database of reference genomes. To predict which gene families are present, PICRUSt uses an extended ancestral-state reconstruction algorithm and then combines the gene families to estimate the composite metagenome. [26]
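The final combination step described above can be sketched in Python as an abundance-weighted sum of per-taxon gene family counts. This is a deliberately simplified illustration and not PICRUSt's actual interface: the function name, the taxon and gene family labels, and all numbers are hypothetical, and the predicted gene contents are simply assumed to be available from an earlier ancestral-state reconstruction step.

```python
def composite_metagenome(otu_abundance, predicted_gene_content):
    """Sum each taxon's predicted gene family counts, weighted by its abundance."""
    totals = {}
    for otu, abundance in otu_abundance.items():
        # Taxa without a prediction contribute nothing in this sketch.
        for family, copies in predicted_gene_content.get(otu, {}).items():
            totals[family] = totals.get(family, 0.0) + abundance * copies
    return totals


# Hypothetical marker-gene counts for one sample.
sample = {"otu_1": 10, "otu_2": 5}

# Hypothetical gene family copy numbers, assumed to come from a prior
# ancestral-state reconstruction over reference genomes.
gene_content = {
    "otu_1": {"family_A": 1, "family_B": 2},
    "otu_2": {"family_B": 1, "family_C": 3},
}

profile = composite_metagenome(sample, gene_content)
print(profile)
```

The result is one estimated count per gene family for the sample, which is the kind of table a predicted functional profile consists of.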

Analysis of phylogenetic variables and distances

Phylogenetic variables are variables constructed from features of the phylogeny to summarize and contrast data on the species in a phylogenetic tree. Microbiome datasets can be simplified using phylogenetic variables, which reduce the dimensions of the data to a few variables carrying biological information. [22] Recent methods such as PhILR and phylofactorization address the challenges of analyzing phylogenetic variables. The PhILR transform combines statistical and phylogenetic models to overcome the challenges of compositional data; it is created by incorporating microbial evolutionary models into the isometric log-ratio transform. [27] Phylofactorization is a dimensionality-reducing tool used to identify edges in the phylogeny from which putative functional ecological traits may have arisen. [28]
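To make the isometric log-ratio idea more concrete, the Python sketch below computes a single unweighted balance between the two groups of taxa that descend from one internal node of a tree: sqrt(rs/(r+s)) times the log of the ratio of the groups' geometric means, where r and s are the group sizes. This is the standard balance formula from compositional data analysis, shown under the assumption that all abundances are strictly positive; it is not the PhILR package's interface, and any weighting the published transform applies is omitted. The function names and example values are hypothetical.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(left_abundances, right_abundances):
    """Unweighted balance between the two clades below one internal node."""
    r, s = len(left_abundances), len(right_abundances)
    scale = math.sqrt(r * s / (r + s))
    ratio = geometric_mean(left_abundances) / geometric_mean(right_abundances)
    return scale * math.log(ratio)


# Hypothetical abundances of the taxa in the two child clades of one node.
left_clade = [12.0, 8.0]
right_clade = [3.0, 2.0, 1.0]

print(ilr_balance(left_clade, right_clade))
```

Because the balance depends only on ratios, multiplying every abundance by the same constant leaves it unchanged, which is what makes this kind of variable suitable for compositional data.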

Challenges

Phylogenetic inference requires the assumption of common ancestry, or homology; when this assumption is violated, the signal can be obscured by noise. [23] Because of horizontal gene transfer, microbial traits may be unrelated to phylogeny, so that the taxonomic composition reveals little about the function of a system. [29]

References

  1. Oren, A (2010). Papke, RT (ed.). Molecular Phylogeny of Microorganisms. Caister Academic Press. ISBN 978-1-904455-67-7.
  2. Blum, P, ed. (2010). Archaea: New Models for Prokaryotic Biology. Caister Academic Press. ISBN 978-1-904455-27-1.
  3. Sapp, J. (2007). "The structure of microbial evolutionary theory". Stud. Hist. Phil. Biol. & Biomed. Sci. 38 (4): 780–795. doi:10.1016/j.shpsc.2007.09.011. PMID 18053933.
  4. Dietrich, M. (1998). "Paradox and persuasion: Negotiating the place of molecular evolution within evolutionary biology". Journal of the History of Biology. 31 (1): 85–111. doi:10.1023/A:1004257523100. PMID 11619919. S2CID 29935487.
  5. Dietrich, M. (1994). "The origins of the neutral theory of molecular evolution". Journal of the History of Biology. 27 (1): 21–59. doi:10.1007/BF01058626. PMID 11639258. S2CID 367102.
  6. Woese, C.R.; Fox, G.E. (1977). "Phylogenetic structure of the procaryote domain: The primary kingdoms". Proceedings of the National Academy of Sciences. 74 (11): 5088–5090. Bibcode:1977PNAS...74.5088W. doi:10.1073/pnas.74.11.5088. PMC 432104. PMID 270744.
  7. Sanger, F.; Nicklen, S.; Coulson, A.R. (1977). "DNA sequencing with chain-terminating inhibitors". Proceedings of the National Academy of Sciences. 74 (12): 5463–5467. Bibcode:1977PNAS...74.5463S. doi:10.1073/pnas.74.12.5463. PMC 431765. PMID 271968.
  8. Maxam, A.M. (1977). "A new method for sequencing DNA". Proceedings of the National Academy of Sciences. 74 (2): 560–564. Bibcode:1977PNAS...74..560M. doi:10.1073/pnas.74.2.560. PMC 392330. PMID 265521.
  9. Mullis, K.F.; et al. (1986). "Specific enzymatic amplification of DNA in vitro: The polymerase chain reaction". Cold Spring Harbor Symposia on Quantitative Biology. 51: 263–273. doi:10.1101/SQB.1986.051.01.032. PMID 3472723. S2CID 26180176.
  10. Mullis, K.B.; Faloona, F.A. (1989). Recombinant DNA Methodology. Academic Press. pp. 189–204. ISBN 978-0-12-765560-4.
  11. Woese, C.R.; et al. (1990). "Towards a natural system of organisms: Proposal for the domains Archaea, Bacteria, and Eucarya". Proceedings of the National Academy of Sciences. 87 (12): 4576–4579. Bibcode:1990PNAS...87.4576W. doi:10.1073/pnas.87.12.4576. PMC 54159. PMID 2112744.
  12. Pace, N (1997). "A molecular view of microbial diversity and the biosphere". Science. 276 (5313): 734–740. doi:10.1126/science.276.5313.734. PMID 9115194.
  13. Pace, N.R.; et al. (1985). "Analyzing natural microbial populations by rRNA sequences". American Society of Microbiology News. 51: 4–12.
  14. Whitman, W. B; et al. (1998). "Procaryotes: The unseen majority". Proceedings of the National Academy of Sciences. 95 (12): 6578–6583. Bibcode:1998PNAS...95.6578W. doi:10.1073/pnas.95.12.6578. PMC 33863. PMID 9618454.
  15. Delsuc, F.; Brinkmann, H.; Philippe, H. (2005). "Phylogenomics and the reconstruction of the tree of life" (PDF). Nature Reviews Genetics. 6 (5): 361–375. doi:10.1038/nrg1603. PMID 15861208. S2CID 16379422.
  16. Doolittle, W.F. (1999). "Phylogenetic classification and the universal tree". Science. 284 (5423): 2124–2128. doi:10.1126/science.284.5423.2124. PMID 10381871.
  17. Bushman, F. (2002). Lateral DNA transfer: mechanisms and consequences. New York: Cold Spring Harbor Laboratory Press. ISBN 0879696036.
  18. Andam, Cheryl P.; Williams, David; Gogarten, J. Peter (2010-06-08). "Biased gene transfer mimics patterns created through shared ancestry". Proceedings of the National Academy of Sciences. 107 (23): 10679–10684. doi:10.1073/pnas.1001418107. ISSN 0027-8424. PMC 2890805. PMID 20495090.
  19. Ochman, H.; Lawrence, J.G.; Groisman, E.A. (2000). "Lateral gene transfer and the nature of bacterial innovation". Nature. 405 (6784): 299–304. Bibcode:2000Natur.405..299O. doi:10.1038/35012500. PMID 10830951. S2CID 85739173.
  20. Eisen, J. (2000). "Horizontal gene transfer among microbial genomes: new insights from complete genome analysis". Current Opinion in Genetics & Development. 10 (6): 606–611. doi:10.1016/S0959-437X(00)00143-X. PMID 11088009.
  21. Kunin, V.; Goldovsky, L.; Darzentas, N.; Ouzounis, C. A. (2005). "The net of life: Reconstructing the microbial phylogenetic network". Genome Research. 15 (7): 954–959. doi:10.1101/gr.3666505. PMC 1172039. PMID 15965028.
  22. Washburne, Alex D.; Morton, James T.; Sanders, Jon; McDonald, Daniel; Zhu, Qiyun; Oliverio, Angela M.; Knight, Rob (2018-05-24). "Methods for phylogenetic analysis of microbiome data". Nature Microbiology. 3 (6): 652–661. doi:10.1038/s41564-018-0156-0. ISSN 2058-5276. PMID 29795540. S2CID 43962376.
  23. Wu, Martin; Eisen, Jonathan A (2008). "A simple, fast, and accurate method of phylogenomic inference". Genome Biology. 9 (10): R151. doi:10.1186/gb-2008-9-10-r151. ISSN 1465-6906. PMC 2760878. PMID 18851752.
  24. Hillis, David M.; Dixon, Michael T. (1991). "Ribosomal DNA: Molecular Evolution and Phylogenetic Inference". The Quarterly Review of Biology. 66 (4): 411–453. doi:10.1086/417338. ISSN 0033-5770. PMID 1784710. S2CID 32027097.
  25. Bradley, Patrick H.; Nayfach, Stephen; Pollard, Katherine S. (2018-08-09). "Phylogeny-corrected identification of microbial gene families relevant to human gut colonization". PLOS Computational Biology. 14 (8): e1006242. doi:10.1371/journal.pcbi.1006242. ISSN 1553-7358. PMC 6084841. PMID 30091981.
  26. Langille, Morgan G I; Zaneveld, Jesse; Caporaso, J Gregory; McDonald, Daniel; Knights, Dan; Reyes, Joshua A; Clemente, Jose C; Burkepile, Deron E; Vega Thurber, Rebecca L; Knight, Rob; Beiko, Robert G; Huttenhower, Curtis (2013). "Predictive functional profiling of microbial communities using 16S rRNA marker gene sequences". Nature Biotechnology. 31 (9): 814–821. doi:10.1038/nbt.2676. ISSN 1087-0156. PMC 3819121. PMID 23975157.
  27. Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A (2017-02-15). "A phylogenetic transform enhances analysis of compositional microbiota data". eLife. 6. doi:10.7554/eLife.21887. ISSN 2050-084X. PMC 5328592. PMID 28198697.
  28. Washburne, Alex D.; Silverman, Justin D.; Leff, Jonathan W.; Bennett, Dominic J.; Darcy, John L.; Mukherjee, Sayan; Fierer, Noah; David, Lawrence A. (2017-02-09). "Phylogenetic factorization of compositional data yields lineage-level associations in microbiome datasets". PeerJ. 5: e2969. doi:10.7717/peerj.2969. ISSN 2167-8359. PMC 5345826. PMID 28289558.
  29. Martiny, Jennifer B. H.; Jones, Stuart E.; Lennon, Jay T.; Martiny, Adam C. (2015-11-06). "Microbiomes in light of traits: A phylogenetic perspective". Science. 350 (6261). doi:10.1126/science.aac9323. ISSN 0036-8075.