Microbial phylogenetics is the study of how various groups of microorganisms are genetically related, which helps to trace their evolution. [1] [2] To study these relationships, biologists rely on comparative genomics, because physiology and comparative anatomy are not feasible methods for microorganisms. [3]
Microbial phylogenetics emerged as a field of study in the 1960s, when scientists started to create genealogical trees based on differences in the order of amino acids in proteins and of nucleotides in genes, instead of using comparative anatomy and physiology. [4] [5]
One of the most important figures in the early stage of this field is Carl Woese, whose research focused on Bacteria and looked at RNA instead of proteins. More specifically, he decided to compare oligonucleotides of the small subunit ribosomal RNA (16S rRNA). Matching oligonucleotides in different bacteria could be compared to one another to determine how closely the organisms were related. In 1977, after collecting and comparing 16S rRNA fragments for almost 200 species of bacteria, Woese and his team concluded that Archaebacteria were not part of Bacteria but completely independent organisms. [3] [6]
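The idea of scoring relatedness from matching oligonucleotides can be sketched with a simple Dice-style coefficient over two oligonucleotide catalogs. This is only an illustration: the function name `catalog_similarity`, the `min_length` cutoff, the weighting by oligonucleotide length, and the example sequences are assumptions made for this sketch, not details taken from the cited work.

```python
def catalog_similarity(catalog_a, catalog_b, min_length=6):
    """Dice-style similarity between two oligonucleotide catalogs.

    Each catalog is an iterable of oligonucleotide strings. Oligonucleotides
    shorter than `min_length` are ignored (an assumed cutoff), and each
    remaining oligonucleotide is weighted by its length in residues.
    """
    a = {oligo for oligo in catalog_a if len(oligo) >= min_length}
    b = {oligo for oligo in catalog_b if len(oligo) >= min_length}
    total_residues = sum(len(o) for o in a) + sum(len(o) for o in b)
    if total_residues == 0:
        return 0.0
    shared_residues = sum(len(o) for o in a & b)
    return 2 * shared_residues / total_residues


# Hypothetical catalogs for two organisms (made-up sequences).
organism_1 = ["AAUCUG", "CCUAAG", "UUACCG"]
organism_2 = ["AAUCUG", "CCUAAG", "GGAUUC"]
score = catalog_similarity(organism_1, organism_2)
```

A score of 1 means the two catalogs are identical and a score of 0 means they share no oligonucleotides above the length cutoff.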
In the 1980s, microbial phylogenetics entered its golden age as the techniques for sequencing RNA and DNA improved greatly. [7] [8] For example, comparison of the nucleotide sequences of whole genes was facilitated by the development of DNA cloning, which made it possible to create many copies of sequences from minute samples. The invention of the polymerase chain reaction (PCR) had an enormous impact on microbial phylogenetics. [9] [10] All these new techniques led to the formal proposal of the three domains of life: Bacteria, Archaea (Woese himself proposed this name to replace the older name Archaebacteria), and Eukarya, arguably one of the key moments in the history of taxonomy. [11]
One of the intrinsic problems of studying microbial organisms was the dependence of such studies on pure culture in a laboratory. Biologists tried to overcome this limitation by sequencing rRNA genes obtained from DNA isolated directly from the environment. [12] [13] This technique made it possible to fully appreciate that bacteria not only have the greatest diversity but also constitute the greatest biomass on Earth. [14]
In the late 1990s, sequencing of genomes from various microbial organisms started, and by 2005, 260 complete genomes had been sequenced: 33 from eukaryotes, 206 from eubacteria, and 21 from archaea. [15]
In the early 2000s, scientists started creating phylogenetic trees based not on rRNA but on other genes with different functions (for example, the gene for the enzyme RNA polymerase [16] ). The resulting genealogies differed greatly from the ones based on rRNA. These gene histories were so different from one another that the only hypothesis that could explain the divergences was a major influence of horizontal gene transfer (HGT), a mechanism that permits a bacterium to acquire one or more genes from a completely unrelated organism. [17] HGT explains why similarities and differences in some genes have to be carefully studied before being used as a measure of genealogical relationship for microbial organisms. [18]
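How strongly two gene genealogies disagree can be quantified by counting the clades that appear in one tree but not in the other, in the manner of a Robinson-Foulds distance for rooted trees. The sketch below assumes trees are written as nested tuples with taxon names as strings; the taxa `A` to `D` and both example topologies are hypothetical.

```python
def clades(tree):
    """Collect every clade (set of descendant tips) of a rooted tree.

    A tree is either a taxon name (str) or a tuple of subtrees.
    """
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        tips = frozenset().union(*(visit(child) for child in node))
        found.add(tips)
        return tips

    visit(tree)
    return found


def rooted_rf_distance(tree_a, tree_b):
    """Number of clades present in exactly one of the two trees."""
    return len(clades(tree_a) ^ clades(tree_b))


# Hypothetical example: an rRNA-based tree and a conflicting gene tree.
rrna_tree = (("A", "B"), ("C", "D"))
gene_tree = (("A", "C"), ("B", "D"))
conflict = rooted_rf_distance(rrna_tree, gene_tree)
```

A distance of zero means the two trees imply the same groupings. Larger values indicate conflicting histories, which may reflect HGT but can also arise from other causes such as reconstruction error.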
Studies aimed at understanding how widespread HGT is suggested that the ease with which genes are transferred among bacteria makes it impossible to apply ‘the biological species concept’ to them. [19] [20]
Since Darwin, phylogenies have been represented in the form of a tree. Nonetheless, because of the great role that HGT plays in microbes, some evolutionary microbiologists have suggested abandoning this classical view in favor of a representation of genealogies that more closely resembles a web, also known as a network. However, there are some issues with the network representation, such as the inability to precisely establish the donor organism for an HGT event and the difficulty of determining the correct path across organisms when multiple HGT events have happened. Therefore, there is still no consensus among biologists on which representation is a better fit for the microbial world. [21]
Most microbial taxa have never been cultivated or experimentally characterized. Taxonomy and phylogeny are essential tools for organizing the diversity of life. Collecting gene sequences, aligning them based on homology, and then using models of mutation to infer evolutionary history are common steps in estimating microbial phylogenies. [22] Small subunit (SSU) rRNA has revolutionized microbial classification since the 1970s and has since become the most sequenced gene. [23] Phylogenetic inferences depend on the genes chosen: for example, the 16S rRNA gene is commonly selected for Bacteria and Archaea, while the 18S rRNA gene is most commonly used for microbial eukaryotes. [24]
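As a minimal illustration of using a model of mutation on aligned sequences, the sketch below computes the Jukes-Cantor distance, one of the simplest substitution models, from two pre-aligned sequences. The choice of Jukes-Cantor, the function names, and the treatment of gap columns (skipped) are assumptions made for this example. Real analyses typically use richer models and dedicated software.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor_distance(seq_a, seq_b):
    """Expected substitutions per site under the Jukes-Cantor model."""
    p = p_distance(seq_a, seq_b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        # The correction is undefined once sequences look saturated.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Hypothetical aligned fragments differing at one of ten sites.
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT")
```

The corrected distance is slightly larger than the raw fraction of differing sites because the model accounts for multiple substitutions at the same position. A matrix of such pairwise distances can then be passed to a tree-building method.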
Phylogenetic comparative methods (PCMs) are commonly used to compare multiple traits across organisms. PCMs are not yet common in microbiome studies; however, recent studies have been successful in identifying genes associated with colonization of the human gut. [22] This was done by measuring the statistical association between whether a species harbors the gene and the probability that the species is present in the gut microbiome. The analyses showcase the combination of shotgun metagenomics with phylogenetically aware models. [25]
Ancestral state reconstruction is commonly used to estimate the genetic and metabolic profiles of extant communities from a set of reference genomes; in microbiome studies this is commonly performed with PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States). [22] PICRUSt is a computational approach capable of predicting the functional composition of a metagenome from marker gene data and a database of reference genomes. To predict which gene families are present, PICRUSt uses an extended ancestral-state reconstruction algorithm and then combines the gene families to estimate the composite metagenome. [26]
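The general idea of ancestral state reconstruction can be illustrated with the bottom-up pass of Fitch parsimony on the presence (1) or absence (0) of a gene family. This is a deliberately simple stand-in and not the extended algorithm that PICRUSt uses. The nested-tuple tree format, the taxon names, and the states below are hypothetical.

```python
def fitch(tree, tip_states):
    """Bottom-up Fitch parsimony pass on a rooted binary tree.

    `tree` is a taxon name (str) or a pair (left, right) of subtrees.
    `tip_states` maps each taxon name to its observed state.
    Returns (candidate states at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, tip_states)
    right_set, right_changes = fitch(right, tip_states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical reference genomes: the gene family is present in A and B only.
example_tree = ((("A", "B"), "C"), ("D", "E"))
example_states = {"A": 1, "B": 1, "C": 0, "D": 0, "E": 0}
root_states, n_changes = fitch(example_tree, example_states)
```

In this example the root is inferred to lack the gene family, with a single gain on the branch leading to A and B. Methods for predicting unobserved genomes build on reconstructed states of this kind to estimate gene content for tips that have no sequenced genome.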
Phylogenetic variables are variables constructed from features of the phylogeny to summarize and contrast data on the species in a phylogenetic tree. Microbiome datasets can be simplified using phylogenetic variables by reducing the dimensions of the data to a few variables carrying biological information. [22] Recent methods such as PhILR and phylofactorization address the challenges of phylogenetic variable analysis. The PhILR transform combines statistical and phylogenetic models to overcome the challenges of compositional data; it is created by incorporating microbial evolutionary models into the isometric log-ratio transform. [27] Phylofactorization is a dimensionality-reducing tool used to identify edges in the phylogeny from which putative functional ecological traits may have arisen. [28]
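The isometric log-ratio idea can be sketched for a single internal node of a phylogeny: the node's "balance" contrasts the geometric mean abundance of the taxa on one side with that on the other. The code below uses the standard unweighted balance formula; the function names and the example abundances are assumptions for illustration, any weighting options a specific package may offer are omitted, and all abundances must be strictly positive (for instance after adding a pseudocount).

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ilr_balance(left_abundances, right_abundances):
    """Unweighted isometric log-ratio balance for one tree node.

    Each argument lists the relative abundances of the tips descending
    from one of the node's two child clades.
    """
    r = len(left_abundances)
    s = len(right_abundances)
    scale = math.sqrt(r * s / (r + s))
    ratio = geometric_mean(left_abundances) / geometric_mean(right_abundances)
    return scale * math.log(ratio)


# Hypothetical node with two tips in the left clade and one in the right.
balance = ilr_balance([0.4, 0.1], [0.5])
```

A positive balance means the left clade is relatively more abundant, a negative one favors the right clade, and zero means the two geometric means are equal. Computing one balance per internal node yields one coordinate per node of a rooted binary tree.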
Inference in phylogenetics requires the assumption of common ancestry, or homology; when this assumption is violated, the signal can be obscured by noise. [23] Because of horizontal gene transfer, microbial traits may be decoupled from phylogeny, so the taxonomic composition may reveal little about the function of a system. [29]
Carl Richard Woese was an American microbiologist and biophysicist. Woese is famous for defining the Archaea in 1977 through a pioneering phylogenetic taxonomy of 16S ribosomal RNA, a technique that has revolutionized microbiology. He also originated the RNA world hypothesis in 1967, although not by that name. Woese held the Stanley O. Ikenberry Chair and was professor of microbiology at the University of Illinois Urbana–Champaign.
Molecular phylogenetics is the branch of phylogeny that analyzes genetic, hereditary molecular differences, predominantly in DNA sequences, to gain information on an organism's evolutionary relationships. From these analyses, it is possible to determine the processes by which diversity among species has been achieved. The result of a molecular phylogenetic analysis is expressed in a phylogenetic tree. Molecular phylogenetics is one aspect of molecular systematics, a broader term that also includes the use of molecular data in taxonomy and biogeography.
Molecular evolution describes how inherited DNA and/or RNA change over evolutionary time, and the consequences of this for proteins and other components of cells and organisms. Molecular evolution is the basis of phylogenetic approaches to describing the tree of life. Molecular evolution overlaps with population genetics, especially on shorter timescales. Topics in molecular evolution include the origins of new genes, the genetic nature of complex traits, the genetic basis of adaptation and speciation, the evolution of development, and patterns and processes underlying genomic changes during evolution.
Horizontal gene transfer (HGT) or lateral gene transfer (LGT) is the movement of genetic material between organisms other than by the ("vertical") transmission of DNA from parent to offspring (reproduction). HGT is an important factor in the evolution of many organisms. HGT is influencing scientific understanding of higher-order evolution while more significantly shifting perspectives on bacterial evolution.
The Thermoproteota are prokaryotes that have been classified as a phylum of the domain Archaea. Initially, the Thermoproteota were thought to be sulfur-dependent extremophiles, but recent studies have identified characteristic Thermoproteota environmental rRNA, indicating the organisms may be the most abundant archaea in the marine environment. Originally, they were separated from the other archaea based on rRNA sequences; other physiological features, such as lack of histones, have supported this division, although some crenarchaea were found to have histones. Until 2005 all cultured Thermoproteota had been thermophilic or hyperthermophilic organisms, some of which have the ability to grow at up to 113 °C. These organisms stain Gram negative and are morphologically diverse, with rod-shaped, coccoid, filamentous, and oddly shaped cells. Recent evidence shows that some members of the Thermoproteota are methanogens.
Metagenomics is the study of genetic material recovered directly from environmental or clinical samples by a method called sequencing. The broad field may also be referred to as environmental genomics, ecogenomics, community genomics or microbiomics.
16S ribosomal RNA is the RNA component of the 30S subunit of a prokaryotic ribosome. It binds to the Shine-Dalgarno sequence and provides most of the SSU structure.
Archaea is a domain of single-celled organisms. These microorganisms lack cell nuclei and are therefore prokaryotic. Archaea were initially classified as bacteria, receiving the name archaebacteria, but this term has fallen out of use.
Microbiota are the range of microorganisms that may be commensal, mutualistic, or pathogenic found in and on all multicellular organisms, including plants. Microbiota include bacteria, archaea, protists, fungi, and viruses, and have been found to be crucial for immunologic, hormonal, and metabolic homeostasis of their host.
Horizontal gene transfer (HGT) refers to the transfer of genes between distant branches on the tree of life. In evolution, it can scramble the information needed to reconstruct the phylogeny of organisms, how they are related to one another.
Evolution of cells refers to the evolutionary origin and subsequent evolutionary development of cells. Cells first emerged at least 3.8 billion years ago, approximately 750 million years after Earth was formed.
Nancy A. Moran is an American evolutionary biologist and entomologist, University of Texas Leslie Surginer Endowed Professor, and co-founder of the Yale Microbial Diversity Institute. Since 2005, she has been a member of the United States National Academy of Sciences. Her seminal research has focused on the pea aphid, Acyrthosiphon pisum, and its bacterial symbionts, including Buchnera. In 2013, she returned to the University of Texas at Austin, where she continues to conduct research on bacterial symbionts in aphids, bees, and other insect species. She has also expanded the scale of her research to bacterial evolution as a whole. She believes that a good understanding of genetic drift and random chance could prevent misunderstandings surrounding evolution. Her current research goal focuses on complexity in life-histories and symbiosis between hosts and microbes, including the microbiota of insects.
Woese's dogma is a principle of evolutionary biology first put forth by biophysicist Carl Woese in 1977. It states that the evolution of ribosomal RNA was a necessary precursor to the evolution of modern life forms. This led to the advancement of the phylogenetic tree of life consisting of three domains rather than the previously accepted two. While the existence of Eukarya and Prokarya was already accepted, Woese was responsible for the distinction between Bacteria and Archaea. Despite initial criticism and controversy surrounding his claims, Woese's three domain system, based on his work regarding the role of rRNA in the evolution of modern life, has become widely accepted.
The eocyte hypothesis in evolutionary biology proposes that the eukaryotes originated from a group of prokaryotes called eocytes. After his team at the University of California, Los Angeles discovered eocytes in 1984, James A. Lake formulated the hypothesis as "eocyte tree" that proposed eukaryotes as part of archaea. Lake hypothesised the tree of life as having only two primary branches: prokaryotes, which include Bacteria and Archaea, and karyotes, that comprise Eukaryotes and eocytes. Parts of this early hypothesis were revived in a newer two-domain system of biological classification which named the primary domains as Archaea and Bacteria.
The Woeseian revolution was the progression of the phylogenetic tree of life concept from two main divisions, known as the Prokarya and Eukarya, into three domains now classified as Bacteria, Archaea, and Eukaryotes. The discovery of the new domain stemmed from the work of biophysicist Carl Woese in 1977 from a principle of evolutionary biology designated as Woese's dogma. It states that the evolution of ribosomal RNA (rRNA) was a necessary precursor to the evolution of modern life forms. Although the three-domain system has been widely accepted, the initial introduction of Woese’s discovery received criticism from the scientific community.
A microbiome is the community of microorganisms that can usually be found living together in any given habitat. It was defined more precisely in 1988 by Whipps et al. as "a characteristic microbial community occupying a reasonably well-defined habitat which has distinct physio-chemical properties. The term thus not only refers to the microorganisms involved but also encompasses their theatre of activity". In 2020, an international panel of experts published the outcome of their discussions on the definition of the microbiome. They proposed a definition of the microbiome based on a revival of the "compact, clear, and comprehensive description of the term" as originally provided by Whipps et al., but supplemented with two explanatory paragraphs, the first pronouncing the dynamic character of the microbiome, and the second clearly separating the term microbiota from the term microbiome.
Horizontal or lateral gene transfer is the transmission of portions of genomic DNA between organisms through a process decoupled from vertical inheritance. In the presence of HGT events, different fragments of the genome are the result of different evolutionary histories. This can therefore complicate investigations of the evolutionary relatedness of lineages and species. Also, as HGT can bring into genomes radically different genotypes from distant lineages, or even new genes bearing new functions, it is a major source of phenotypic innovation and a mechanism of niche adaptation. For example, of particular relevance to human health is the lateral transfer of antibiotic resistance and pathogenicity determinants, leading to the emergence of pathogenic lineages.
PICRUSt is a bioinformatics software package. The name is an abbreviation for Phylogenetic Investigation of Communities by Reconstruction of Unobserved States.
Darwinian threshold or Darwinian transition is a term introduced by Carl Woese to describe a transition period during the evolution of the first cells when genetic transmission moves from a predominantly horizontal mode to a vertical mode. The process starts when the ancestors of the Last Universal Common Ancestor are no longer primarily dependent on horizontal gene transfer (HGT) and become individual entities with vertical heredity upon which natural selection is effective. After this transition, life is characterized by genealogies that have a modern tree-like phylogeny.
Microbial DNA barcoding is the use of DNA metabarcoding to characterize a mixture of microorganisms. DNA metabarcoding is a method of DNA barcoding that uses universal genetic markers to identify DNA of a mixture of organisms.