Bacterial genome

Bacterial genomes are generally smaller and less variable in size among species than the genomes of eukaryotes. Bacterial genomes range in size from about 130 kbp [1] [2] to over 14 Mbp. [3] A study that included, but was not limited to, 478 bacterial genomes concluded that as genome size increases, the number of genes increases at a disproportionately slower rate in eukaryotes than in non-eukaryotes. Thus, the proportion of non-coding DNA goes up with genome size more quickly in non-bacteria than in bacteria. This is consistent with the fact that most eukaryotic nuclear DNA is non-coding, while most prokaryotic, viral, and organellar DNA is coding. [4]

Genome sequences are currently available from 50 different bacterial phyla and 11 different archaeal phyla. Second-generation sequencing has yielded many draft genomes (close to 90% of bacterial genomes in GenBank are currently not complete); third-generation sequencing might eventually yield a complete genome in a few hours. The genome sequences reveal much diversity in bacteria. Analysis of over 2000 Escherichia coli genomes reveals an E. coli core genome of about 3100 gene families and a total of about 89,000 different gene families. [5] Genome sequences show that parasitic bacteria have 500–1200 genes, free-living bacteria have 1500–7500 genes, and archaea have 1500–2700 genes. [6]

A striking discovery by Cole et al. described massive amounts of gene decay when comparing the leprosy bacillus to ancestral bacteria. [7] Studies have since shown that several bacteria have smaller genomes than their ancestors did. [8] Over the years, researchers have proposed several theories to explain the general trend of bacterial genome decay and the relatively small size of bacterial genomes. Compelling evidence indicates that the apparent degradation of bacterial genomes is due to a deletional bias.

Methods and techniques

As of 2014, there are over 30,000 sequenced bacterial genomes publicly available and thousands of metagenome projects. Projects such as the Genomic Encyclopedia of Bacteria and Archaea (GEBA) intend to add more genomes. [5]

The single gene comparison is now being supplanted by more general methods. These methods have resulted in novel perspectives on genetic relationships that previously have only been estimated. [5]

A significant achievement in the second decade of bacterial genome sequencing was the production of metagenomic data, which covers all DNA present in a sample. Previously, there were only two metagenomic projects published. [5]

Bacterial genomes

Figure: Log-log plot of the total number of annotated proteins in genomes submitted to GenBank as a function of genome size. Based on data from NCBI genome reports.

Bacteria possess a compact genome architecture distinct from eukaryotes in two important ways: bacteria show a strong correlation between genome size and number of functional genes in a genome, and those genes are structured into operons. [9] [10] The main reason bacterial genomes are dense relative to eukaryotic genomes (especially those of multicellular eukaryotes) is that eukaryotic genomes contain large amounts of noncoding DNA in the form of intergenic regions and introns. [10] Some notable exceptions include recently formed pathogenic bacteria. This was initially described in a study by Cole et al. in which Mycobacterium leprae was discovered to have a significantly higher ratio of pseudogenes to functional genes (~40%) than its free-living ancestors. [7]

Furthermore, amongst species of bacteria, there is relatively little variation in genome size when compared with the genome sizes of other major groups of life. [6] Genome size is of little relevance when considering the number of functional genes in eukaryotic species. In bacteria, however, the strong correlation between the number of genes and the genome size makes the size of bacterial genomes an interesting topic for research and discussion. [11]

The general trends of bacterial evolution indicate that bacteria started as free-living organisms. Evolutionary paths led some bacteria to become pathogens and symbionts. The lifestyles of bacteria play an integral role in their respective genome sizes. Free-living bacteria have the largest genomes out of the three types of bacteria; however, they have fewer pseudogenes than bacteria that have recently acquired pathogenicity.

Facultative and recently evolved pathogenic bacteria exhibit a smaller genome size than free-living bacteria, yet they have more pseudogenes than any other form of bacteria.

Obligate bacterial symbionts or pathogens have the smallest genomes and the fewest pseudogenes of the three groups. [12] The relationship between the lifestyles of bacteria and their genome sizes raises questions about the mechanisms of bacterial genome evolution. Researchers have developed several theories to explain the patterns of genome size evolution amongst bacteria.

Genome comparisons

As single-gene comparisons have largely given way to genome comparisons, the phylogeny of bacterial genomes has improved in accuracy. The Average Nucleotide Identity (ANI) method quantifies genetic distance between entire genomes by comparing regions of about 10,000 bp. With enough data from genomes of one genus, algorithms can be run to categorize species. This was done for the Pseudomonas avellanae species in 2013 [5] and has been done for all sequenced bacteria and archaea since 2020. [13] Observed ANI values among sequences appear to have an "ANI gap" at 85–95%, suggesting that a genetic boundary suitable for defining a species concept is present. [14]
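The averaging step behind ANI can be illustrated with a minimal Python sketch. It assumes the fragment pairs have already been aligned to equal length (real tools rely on an aligner for this step), and the function names and the 95% cutoff in `same_species` are illustrative assumptions, not part of any published pipeline.

```python
from typing import Iterable, Tuple


def fragment_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two aligned, equal-length fragments."""
    if len(a) != len(b) or not a:
        raise ValueError("fragments must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return matches / len(a)


def average_nucleotide_identity(pairs: Iterable[Tuple[str, str]]) -> float:
    """Mean identity over all aligned fragment pairs (a simplified ANI)."""
    identities = [fragment_identity(a, b) for a, b in pairs]
    if not identities:
        raise ValueError("at least one fragment pair is required")
    return sum(identities) / len(identities)


def same_species(ani: float, threshold: float = 0.95) -> bool:
    """Hypothetical species call: the threshold sits at the upper edge of the ANI gap."""
    return ani >= threshold


if __name__ == "__main__":
    toy_pairs = [("ACGT", "ACGT"), ("ACGT", "ACGA")]
    ani = average_nucleotide_identity(toy_pairs)
    print(f"ANI = {ani:.3f}, same species: {same_species(ani)}")
```

In practice the fragments would be on the order of the 10,000 bp regions mentioned above rather than the four-base toy strings used here.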

To extract information about bacterial genomes, core- and pan-genome sizes have been assessed for several strains of bacteria. In 2012, the number of core gene families in E. coli was about 3000. By 2015, with an over tenfold increase in available genomes, the pan-genome had grown as well. There is roughly a positive correlation between the number of genomes added and the growth of the pan-genome. The core genome, on the other hand, has remained static since 2012. Currently, the E. coli pan-genome is composed of about 90,000 gene families. About one-third of these exist only in a single genome. Many of these, however, are merely gene fragments and the result of calling errors. Still, there are probably over 60,000 unique gene families in E. coli. [5]
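The core/pan-genome distinction reduces to set operations once each genome is represented as a set of gene-family identifiers. The following Python sketch makes that assumption; the genome names and family labels are invented for illustration, and it does not attempt the gene-family clustering that real analyses need first.

```python
from collections import Counter
from typing import Dict, Set, Tuple


def core_and_pan_genome(
    genomes: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (core, pan, singletons) for a mapping of genome name -> gene families.

    core:       families present in every genome
    pan:        families present in at least one genome
    singletons: families present in exactly one genome
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    family_sets = list(genomes.values())
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    occurrences = Counter(f for families in family_sets for f in families)
    singletons = {f for f, n in occurrences.items() if n == 1}
    return core, pan, singletons


if __name__ == "__main__":
    toy = {
        "strain_A": {"fam1", "fam2", "fam3"},
        "strain_B": {"fam1", "fam2", "fam4"},
        "strain_C": {"fam1", "fam5"},
    }
    core, pan, singletons = core_and_pan_genome(toy)
    print(len(core), len(pan), len(singletons))
```

Adding a genome can only shrink or preserve the core and only grow or preserve the pan-genome, which matches the pattern described above.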

Theories of bacterial genome evolution

Bacteria lose a large number of genes as they transition from free-living or facultatively parasitic life cycles to permanent host-dependent life. Towards the lower end of the scale of bacterial genome size are the mycoplasmas and related bacteria. Early molecular phylogenetic studies revealed that mycoplasmas represent an evolutionarily derived state, contrary to prior hypotheses. Furthermore, it is now known that mycoplasmas are just one of many instances of genome shrinkage in obligately host-associated bacteria. Other examples are Rickettsia, Buchnera aphidicola, and Borrelia burgdorferi. [15]

Small genome size in such species is associated with certain particularities, such as rapid evolution of polypeptide sequences and low GC content in the genome. The convergent evolution of these qualities in unrelated bacteria suggests that an obligate association with a host promotes genome reduction. [15]

Given that intact ORFs make up over 80% of almost every fully sequenced bacterial genome, and that gene length is nearly constant at ~1 kb per gene, it is inferred that small genomes have few metabolic capabilities. While free-living bacteria, such as E. coli, Salmonella species, or Bacillus species, usually have 1500 to 6000 proteins encoded in their DNA, obligately pathogenic bacteria often have as few as 500 to 1000 such proteins. [15]
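The inference from genome size to gene count is simple arithmetic, sketched below in Python. The 80% coding fraction and the 1 kb mean gene length come from the paragraph above; treating them as fixed constants, and the function name itself, are simplifying assumptions for illustration.

```python
def estimate_gene_count(
    genome_size_bp: int,
    coding_fraction: float = 0.80,   # share of the genome in intact ORFs (from the text)
    mean_gene_length_bp: int = 1000  # roughly 1 kb per gene (from the text)
) -> int:
    """Rough gene-count estimate: coding base pairs divided by mean gene length."""
    if genome_size_bp < 0 or not 0.0 <= coding_fraction <= 1.0:
        raise ValueError("invalid genome size or coding fraction")
    if mean_gene_length_bp <= 0:
        raise ValueError("mean gene length must be positive")
    return round(genome_size_bp * coding_fraction / mean_gene_length_bp)


if __name__ == "__main__":
    for size in (1_000_000, 5_000_000):
        print(f"{size:>9} bp -> about {estimate_gene_count(size)} genes")
```

Because the text gives 80% as a lower bound on the coding fraction, the estimate is best read as an approximate floor rather than a prediction.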

One candidate explanation is that reduced genomes maintain genes that are necessary for vital processes pertaining to cellular growth and replication, in addition to those genes that are required to survive in the bacteria's ecological niche. However, sequence data contradict this hypothesis. The set of universal orthologs amongst eubacteria comprises only 15% of each genome. Thus, each lineage has taken a different evolutionary path to reduced size. Because universal cellular processes require over 80 genes, variation in genes implies that the same functions can be achieved by exploitation of nonhomologous genes. [15]

Host-dependent bacteria are able to secure many compounds required for metabolism from the host's cytoplasm or tissue. They can, in turn, discard their own biosynthetic pathways and associated genes. This removal explains many of the specific gene losses. For example, the Rickettsia species, which relies on specific energy substrate from its host, has lost many of its native energy metabolism genes. Similarly, most small genomes have lost their amino acid biosynthesizing genes, as these are found in the host instead. One exception is the Buchnera, an obligate maternally transmitted symbiont of aphids. It retains 54 genes for biosynthesis of crucial amino acids, but no longer has pathways for those amino acids that the host can synthesize. Pathways for nucleotide biosynthesis are gone from many reduced genomes. Those anabolic pathways that evolved through niche adaptation remain in particular genomes. [15]

The hypothesis that unused genes are eventually removed does not explain why many of the removed genes would in fact remain helpful in obligate pathogens. For example, many eliminated genes code for products that are involved in universal cellular processes, including replication, transcription, and translation. Even genes supporting DNA recombination and repair are deleted from every small genome. In addition, small genomes have fewer tRNAs, utilizing one for several amino acids. As a result, a single anticodon pairs with multiple codons, which likely yields less-than-optimal translation machinery. It is unknown why obligate intracellular pathogens would benefit by retaining fewer tRNAs and fewer DNA repair enzymes. [15]

Another factor to consider is the change in population that corresponds to an evolution towards an obligately pathogenic life. Such a shift in lifestyle often results in a reduction in the genetic population size of a lineage, since there is a finite number of hosts to occupy. The resulting genetic drift may lead to fixation of mutations that inactivate otherwise beneficial genes, or may decrease the efficiency of gene products. Hence, not only will useless genes be lost (as mutations disrupt them once the bacteria have settled into host dependency), but beneficial genes may also be lost if genetic drift renders purifying selection ineffective. [15]

The number of universally maintained genes is small and inadequate for independent cellular growth and replication, so that small genome species must achieve such feats by means of varying genes. This is done partly through nonorthologous gene displacement. That is, the role of one gene is replaced by another gene that achieves the same function. Redundancy within the ancestral, larger genome is eliminated. The descendant small genome content depends on the content of chromosomal deletions that occur in the early stages of genome reduction. [15]

The very small genome of M. genitalium possesses dispensable genes. In a study in which single genes of this organism were inactivated using transposon-mediated mutagenesis, at least 129 of its 484 ORFs were not required for growth. A much smaller genome than that of M. genitalium is therefore feasible. [15]

Doubling time

One theory predicts that bacteria have smaller genomes due to a selective pressure on genome size to ensure faster replication. The theory is based upon the logical premise that smaller bacterial genomes will take less time to replicate. Consequently, smaller genomes will be selected preferentially due to enhanced fitness. However, a study by Mira et al. found little to no correlation between genome size and doubling time. [16] The data indicate that selection for faster replication is not a suitable explanation for the small sizes of bacterial genomes. Still, many researchers believe there is some selective pressure on bacteria to maintain small genome size.

Deletional bias

Selection is but one process involved in evolution. Two other major processes (mutation and genetic drift) can account for the genome sizes of various types of bacteria. A study done by Mira et al. examined the size of insertions and deletions in bacterial pseudogenes. Results indicated that mutational deletions tend to be larger than insertions in bacteria in the absence of gene transfer or gene duplication. [16] Insertions caused by horizontal or lateral gene transfer and gene duplication tend to involve transfer of large amounts of genetic material. Assuming a lack of these processes, genomes will tend to reduce in size in the absence of selective constraint. Evidence of a deletional bias is present in the respective genome sizes of free-living bacteria, facultative and recently derived parasites and obligate parasites and symbionts.
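The effect of a deletional bias can be illustrated with a toy simulation in Python. It assumes insertions and deletions are equally frequent, that their sizes are exponentially distributed, and that no gene transfer, duplication, or selection occurs. The mean sizes and the function name are arbitrary illustrative choices, not values from Mira et al.

```python
import random


def simulate_genome_size(
    initial_bp: int,
    events: int,
    mean_deletion_bp: float = 10.0,   # assumed: deletions are larger on average
    mean_insertion_bp: float = 5.0,   # assumed: insertions are smaller on average
    seed: int = 0,
) -> int:
    """Genome size after a series of random indels with no selective constraint."""
    rng = random.Random(seed)
    size = initial_bp
    for _ in range(events):
        if rng.random() < 0.5:
            # Insertion: expovariate takes the rate, i.e. 1 / mean size.
            size += round(rng.expovariate(1.0 / mean_insertion_bp))
        else:
            size -= round(rng.expovariate(1.0 / mean_deletion_bp))
        size = max(size, 0)  # a genome cannot shrink below zero base pairs
    return size


if __name__ == "__main__":
    start = 1_000_000
    end = simulate_genome_size(start, events=10_000)
    print(f"start: {start} bp, after 10,000 indels: {end} bp")
```

With these assumed parameters the expected net change is about 2.5 bp lost per event, so the simulated genome shrinks steadily even though insertions and deletions occur equally often.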

Free-living bacteria tend to have large population sizes and more opportunity for gene transfer. As such, selection can effectively operate on free-living bacteria to remove deleterious sequences, resulting in a relatively small number of pseudogenes. Further selective pressure arises because free-living bacteria must produce all gene products independently of a host. Given that there is sufficient opportunity for gene transfer to occur and there are selective pressures against even slightly deleterious deletions, it is intuitive that free-living bacteria should have the largest genomes of all bacterial types.

Recently-formed parasites undergo severe bottlenecks and can rely on host environments to provide gene products. As such, in recently-formed and facultative parasites, there is an accumulation of pseudogenes and transposable elements due to a lack of selective pressure against deletions. The population bottlenecks reduce gene transfer and as such, deletional bias ensures the reduction of genome size in parasitic bacteria.

Obligate parasites and symbionts have the smallest genome sizes due to prolonged effects of deletional bias. Parasites which have evolved to occupy specific niches are not exposed to much selective pressure. As such, genetic drift dominates the evolution of niche-specific bacteria. Extended exposure to deletional bias ensures the removal of most superfluous sequences. Symbionts occur in drastically lower numbers and undergo the most severe bottlenecks of any bacterial type. There is almost no opportunity for gene transfer for endosymbiotic bacteria, and thus genome compaction can be extreme. One of the smallest bacterial genomes ever to be sequenced is that of the endosymbiont Carsonella ruddii. [17] At 160 kbp, the genome of Carsonella is one of the most streamlined examples of a genome examined to date.

Genomic reduction

Molecular phylogenetics has revealed that every clade of bacteria with genome sizes under 2 Mb was derived from ancestors with much larger genomes, thus refuting the hypothesis that bacteria evolved by the successive doubling of small-genomed ancestors. [18] Recent studies performed by Nilsson et al. examined the rates of bacterial genome reduction of obligate bacteria. Bacteria were cultured introducing frequent bottlenecks and growing cells in serial passage to reduce gene transfer so as to mimic conditions of endosymbiotic bacteria. The data predicted that bacteria exhibiting a one-day generation time lose as many as 1,000 kbp in as few as 50,000 years (a relatively short evolutionary time period). Furthermore, after deleting genes essential to the methyl-directed DNA mismatch repair (MMR) system, it was shown that bacterial genome size reduction increased in rate by as much as 50 times. [19] These results indicate that genome size reduction can occur relatively rapidly, and loss of certain genes can speed up the process of bacterial genome compaction.
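The figures quoted above can be converted into a per-generation loss rate with a short calculation, sketched here in Python. The one-day generation time, the 1,000 kbp loss, and the 50,000-year span come from the text; applying the 50-fold acceleration directly to that rate is a simplifying assumption made only to show the scale, and the function name is illustrative.

```python
DAYS_PER_YEAR = 365  # assumed: leap days ignored


def loss_per_generation_bp(
    lost_bp: float,
    years: float,
    generation_time_days: float = 1.0,
    fold_increase: float = 1.0,
) -> float:
    """Average base pairs lost per generation, optionally scaled by a fold increase."""
    if years <= 0 or generation_time_days <= 0:
        raise ValueError("years and generation time must be positive")
    generations = years * DAYS_PER_YEAR / generation_time_days
    return fold_increase * lost_bp / generations


if __name__ == "__main__":
    base = loss_per_generation_bp(1_000_000, 50_000)
    without_mmr = loss_per_generation_bp(1_000_000, 50_000, fold_increase=50)
    print(f"baseline: {base:.4f} bp per generation")
    print(f"with 50-fold acceleration: {without_mmr:.4f} bp per generation")
```

Under these assumptions the baseline works out to a small fraction of a base pair per generation, which shows how a loss that is negligible per generation still adds up to a megabase over evolutionary time.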

This is not to suggest that all bacterial genomes are reducing in size and complexity. While many types of bacteria have reduced in genome size from an ancestral state, there are still a huge number of bacteria that maintained or increased genome size over ancestral states. [8] Free-living bacteria experience huge population sizes, fast generation times and a relatively high potential for gene transfer. While deletional bias tends to remove unnecessary sequences, selection can operate significantly amongst free-living bacteria resulting in evolution of new genes and processes.

Horizontal gene transfer

Unlike eukaryotes, which evolve mainly through the modification of existing genetic information, bacteria have acquired a large percentage of their genetic diversity by the horizontal transfer of genes. This creates quite dynamic genomes, in which DNA can be introduced into and removed from the chromosome. [20]

Bacteria have more variation in their metabolic properties, cellular structures, and lifestyles than can be accounted for by point mutations alone. For example, none of the phenotypic traits that distinguish E. coli from Salmonella enterica can be attributed to point mutation. On the contrary, evidence suggests that horizontal gene transfer has bolstered the diversification and speciation of many bacteria. [20]

Horizontal gene transfer is often detected via DNA sequence information. DNA segments obtained by this mechanism often reveal a narrow phylogenetic distribution between related species. Furthermore, these regions sometimes display an unexpected level of similarity to genes from taxa that are assumed to be quite divergent. [20]

Although gene comparisons and phylogenetic studies are helpful in investigating horizontal gene transfer, the DNA sequences of genes are even more revealing of their origin and ancestry within a genome. Bacterial species differ widely in overall GC content, although the genes in any one species' genome are roughly similar with respect to base composition, patterns of codon usage, and frequencies of di- and trinucleotides. As a result, sequences that are newly acquired through lateral transfer can be identified by their characteristics, which remain those of the donor. For example, many of the S. enterica genes that are not present in E. coli have base compositions that differ from the overall 52% GC content of the entire chromosome. Within this species, some lineages have more than a megabase of DNA that is not present in other lineages. The base compositions of these lineage-specific sequences imply that at least half of these sequences were captured through lateral transfer. Furthermore, the regions adjacent to horizontally obtained genes often have remnants of translocatable elements, transfer origins of plasmids, or known attachment sites of phage integrases. [20]
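Compositional screening of this kind can be sketched in a few lines of Python. The example computes the GC content of each gene and flags those that deviate from the genome-wide value; the 0.10 deviation cutoff, the gene names, and the toy sequences are assumptions chosen for illustration, and real analyses also consider codon usage and oligonucleotide frequencies.

```python
from typing import Dict, List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for base in seq if base in "GC") / len(seq)


def flag_atypical_genes(
    genes: Dict[str, str],
    genome_gc: float,
    max_deviation: float = 0.10,  # hypothetical cutoff, not taken from the source
) -> List[str]:
    """Names of genes whose GC content differs from the genome-wide value by more than the cutoff."""
    return [
        name
        for name, seq in genes.items()
        if abs(gc_content(seq) - genome_gc) > max_deviation
    ]


if __name__ == "__main__":
    toy_genes = {
        "native_like": "ATGCGCATGC",   # 60% GC, close to the genome-wide value
        "foreign_like": "ATATTATAAT",  # 0% GC, far from the genome-wide value
    }
    print(flag_atypical_genes(toy_genes, genome_gc=0.52))
```

As the surrounding text notes, a screen like this misses sequences acquired from donors whose base composition resembles that of the recipient.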

In some species, a large proportion of laterally transferred genes originate from plasmid-, phage-, or transposon-related sequences. [20]

Although sequence-based methods reveal the prevalence of horizontal gene transfer in bacteria, the results tend to be underestimates of the magnitude of this mechanism, since sequences obtained from donors whose sequence characteristics are similar to those of the recipient will avoid detection. [20]

Comparisons of completely sequenced genomes confirm that bacterial chromosomes are amalgams of ancestral and laterally acquired sequences. The hyperthermophilic Eubacteria Aquifex aeolicus and Thermotoga maritima each have many genes that are similar in protein sequence to homologues in thermophilic Archaea. 24% of Thermotoga's 1,877 ORFs and 16% of Aquifex's 1,512 ORFs show high matches to an Archaeal protein, while mesophiles such as E. coli and B. subtilis have far smaller proportions of genes that are most like Archaeal homologues. [20]

Mechanisms of lateral transfer

The genesis of new abilities due to horizontal gene transfer has three requirements. First, there must exist a possible route for the donor DNA to be accepted by the recipient cell. Additionally, the obtained sequence must be integrated with the rest of the genome. Finally, these integrated genes must benefit the recipient bacterial organism. The first two steps can be achieved via three mechanisms: transformation, transduction and conjugation. [20]

Transformation involves the uptake of naked DNA from the environment. Through transformation, DNA can be transmitted between distantly related organisms. Some bacterial species, such as Haemophilus influenzae and Neisseria gonorrhoeae, are continuously competent to accept DNA. Other species, such as Bacillus subtilis and Streptococcus pneumoniae, become competent when they enter a particular phase in their lifecycle.

Transformation in N. gonorrhoeae and H. influenzae is effective only if particular recognition sequences are found in the recipient genomes (5'-GCCGTCTGAA-3' and 5'-AAGTGCGGT-3', respectively). Although the existence of certain uptake sequences improves transformation capability between related species, many of the inherently competent bacterial species, such as B. subtilis and S. pneumoniae, do not display sequence preference.
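Finding such recognition sequences in a stretch of DNA is a plain motif search, sketched below in Python. The two motifs are the ones quoted above; the function names and the choice to count matches on both strands are illustrative assumptions.

```python
# Uptake sequences quoted in the text, written 5' to 3'.
UPTAKE_SEQUENCES = {
    "N. gonorrhoeae": "GCCGTCTGAA",
    "H. influenzae": "AAGTGCGGT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_uptake_sites(dna: str, motif: str) -> int:
    """Count motif occurrences on either strand, including overlapping matches."""
    dna, motif = dna.upper(), motif.upper()
    targets = {motif, reverse_complement(motif)}
    width = len(motif)
    return sum(
        1 for i in range(len(dna) - width + 1) if dna[i:i + width] in targets
    )


if __name__ == "__main__":
    motif = UPTAKE_SEQUENCES["N. gonorrhoeae"]
    toy_dna = "TT" + motif + "CC" + reverse_complement(motif) + "AA"
    print(count_uptake_sites(toy_dna, motif))
```

Scanning both strands matters because a motif on the reverse strand appears in the forward sequence as its reverse complement.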

New genes may be introduced into bacteria by a bacteriophage that has replicated within a donor, through generalized transduction or specialized transduction. The amount of DNA that can be transmitted in one event is constrained by the size of the phage capsid, with an upper limit of about 100 kilobases. While phages are numerous in the environment, the range of microorganisms that can be transduced depends on receptor recognition by the bacteriophage. Transduction does not require donor and recipient cells to be present at the same time or in the same place. Phage-encoded proteins both mediate the transfer of DNA into the recipient cytoplasm and assist integration of DNA into the chromosome. [20]

Conjugation involves physical contact between donor and recipient cells and is able to mediate transfers of genes between domains, such as between bacteria and yeast. DNA is transmitted from donor to recipient either by self-transmissible or mobilizable plasmid. Conjugation may mediate the transfer of chromosomal sequences by plasmids that integrate into the chromosome.

Despite the multitude of mechanisms mediating gene transfer among bacteria, the process succeeds only if the received sequence is stably maintained in the recipient. Acquired DNA can be maintained through one of several processes: persistence as an episome, homologous recombination, or illegitimate incorporation through chance double-strand break repair. [20]

Traits introduced through lateral gene transfer

Antimicrobial resistance genes grant an organism the ability to expand its ecological niche, since it can now survive in the presence of previously lethal compounds. As the benefit a bacterium gains from receiving such genes is independent of time and place, highly mobile sequences are selected for. Plasmids are readily mobilized between taxa and are the most frequent means by which bacteria acquire antibiotic resistance genes.

Adoption of a pathogenic lifestyle often yields a fundamental shift in an organism's ecological niche. The erratic phylogenetic distribution of pathogenic organisms implies that bacterial virulence is a consequence of the presence or acquisition of genes that are missing in avirulent forms. Evidence of this includes the discovery of large 'virulence' plasmids in pathogenic Shigella and Yersinia, as well as the ability to bestow pathogenic properties onto E. coli via experimental exposure to genes from other species. [20]

Computer-made form

In April 2019, scientists at ETH Zurich reported the creation of the world's first bacterial genome made entirely by a computer, named Caulobacter ethensis-2.0, although a related viable form of C. ethensis-2.0 does not yet exist. [21] [22]

See also

Related Research Articles

<span class="mw-page-title-main">Genome</span> All genetic material of an organism

In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA. The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences, and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome. Algae and plants also contain chloroplasts with a chloroplast genome.

<span class="mw-page-title-main">Plasmid</span> Small DNA molecule within a cell

A plasmid is a small, extrachromosomal DNA molecule within a cell that is physically separated from chromosomal DNA and can replicate independently. They are most commonly found as small circular, double-stranded DNA molecules in bacteria; however, plasmids are sometimes present in archaea and eukaryotic organisms. In nature, plasmids often carry genes that benefit the survival of the organism and confer selective advantage such as antibiotic resistance. While chromosomes are large and contain all the essential genetic information for living under normal conditions, plasmids are usually very small and contain only additional genes that may be useful in certain situations or conditions. Artificial plasmids are widely used as vectors in molecular cloning, serving to drive the replication of recombinant DNA sequences within host organisms. In the laboratory, plasmids may be introduced into a cell via transformation. Synthetic plasmids are available for procurement over the internet.

Non-coding DNA (ncDNA) sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules. Other functional regions of the non-coding DNA fraction include regulatory sequences that control gene expression; scaffold attachment regions; origins of DNA replication; centromeres; and telomeres. Some non-coding regions appear to be mostly nonfunctional, such as introns, pseudogenes, intergenic DNA, and fragments of transposons and viruses. Regions that are completely nonfunctional are called junk DNA.

<span class="mw-page-title-main">Horizontal gene transfer</span> Transfer of genes from unrelated organisms

Horizontal gene transfer (HGT) or lateral gene transfer (LGT) is the movement of genetic material between organisms other than by the ("vertical") transmission of DNA from parent to offspring (reproduction). HGT is an important factor in the evolution of many organisms. HGT is influencing scientific understanding of higher-order evolution while more significantly shifting perspectives on bacterial evolution.

<span class="mw-page-title-main">Pseudogene</span> Functionless relative of a gene

Pseudogenes are nonfunctional segments of DNA that resemble functional genes. Most arise as superfluous copies of functional genes, either directly by gene duplication or indirectly by reverse transcription of an mRNA transcript. Pseudogenes are usually identified when genome sequence analysis finds gene-like sequences that lack regulatory sequences needed for transcription or translation, or whose coding sequences are obviously defective due to frameshifts or premature stop codons. Pseudogenes are a type of junk DNA.

<span class="mw-page-title-main">Transformation (genetics)</span> Genetic alteration of a cell by uptake of genetic material from the environment

In molecular biology and genetics, transformation is the genetic alteration of a cell resulting from the direct uptake and incorporation of exogenous genetic material from its surroundings through the cell membrane(s). For transformation to take place, the recipient bacterium must be in a state of competence, which might occur in nature as a time-limited response to environmental conditions such as starvation and cell density, and may also be induced in a laboratory.

Pathogenicity islands (PAIs), as termed in 1990, are a distinct class of genomic islands acquired by microorganisms through horizontal gene transfer. Pathogenicity islands are found in both animal and plant pathogens. Additionally, PAIs are found in both gram-positive and gram-negative bacteria. They are transferred through horizontal gene transfer events such as transfer by a plasmid, phage, or conjugative transposon. Therefore, PAIs contribute to microorganisms' ability to evolve.

<span class="mw-page-title-main">Genome size</span> Amount of DNA contained in a genome

Genome size is the total amount of DNA contained within one copy of a single complete genome. It is typically measured in terms of mass in picograms or less frequently in daltons, or as the total number of nucleotide base pairs, usually in megabases. One picogram is equal to 978 megabases. In diploid organisms, genome size is often used interchangeably with the term C-value.

Extrachromosomal DNA is any DNA that is found off the chromosomes, either inside or outside the nucleus of a cell. Most DNA in an individual genome is found in chromosomes contained in the nucleus. Multiple forms of extrachromosomal DNA exist, and, while some of these serve important biological functions, they can also play a role in diseases such as cancer.

In biology, a gene cassette is a type of mobile genetic element that contains a gene and a recombination site. Each cassette usually contains a single gene and tends to be very small; on the order of 500–1,000 base pairs. They may exist incorporated into an integron or freely as circular DNA. Gene cassettes can move around within an organism's genome or be transferred to another organism in the environment via horizontal gene transfer. These cassettes often carry antibiotic resistance genes. An example would be the kanMX cassette which confers kanamycin resistance upon bacteria.

Fosmids are similar to cosmids but are based on the bacterial F-plasmid. The cloning vector is limited, as a host can only contain one fosmid molecule. Fosmids can hold DNA inserts of up to 40 kb in size; often the source of the insert is random genomic DNA. A fosmid library is prepared by extracting the genomic DNA from the target organism and cloning it into the fosmid vector. The ligation mix is then packaged into phage particles and the DNA is transfected into the bacterial host. Bacterial clones propagate the fosmid library. The low copy number offers higher stability than vectors with relatively higher copy numbers, including cosmids. Fosmids may be useful for constructing stable libraries from complex genomes. Fosmids have high structural stability and have been found to maintain human DNA effectively even after 100 generations of bacterial growth. Fosmid clones were used to help assess the accuracy of the Public Human Genome Sequence.

Mycoplasma laboratorium or Synthia refers to a synthetic strain of bacterium. The project to build the new bacterium has evolved since its inception. Initially the goal was to identify a minimal set of genes that are required to sustain life from the genome of Mycoplasma genitalium, and rebuild these genes synthetically to create a "new" organism. Mycoplasma genitalium was originally chosen as the basis for this project because at the time it had the smallest number of genes of all organisms analyzed. Later, the focus switched to Mycoplasma mycoides and took a more trial-and-error approach.

<span class="mw-page-title-main">Prokaryote</span> Unicellular organism lacking a membrane-bound nucleus

A prokaryote is a single-cell organism whose cell lacks a nucleus and other membrane-bound organelles. The word prokaryote comes from the Ancient Greek πρό 'before' and κάρυον 'nut, kernel'. In the two-empire system arising from the work of Édouard Chatton, prokaryotes were classified within the empire Prokaryota. But in the three-domain system, based upon molecular analysis, prokaryotes are divided into two domains: Bacteria and Archaea. Organisms with nuclei are placed in a third domain, Eukaryota.

Pathogenomics is a field that uses high-throughput screening technology and bioinformatics to study encoded microbe resistance, as well as virulence factors (VFs), which enable a microorganism to infect a host and possibly cause disease. This includes studying genomes of pathogens that cannot be cultured outside of a host. In the past, researchers and medical professionals found it difficult to study and understand the pathogenic traits of infectious organisms. With newer technology, pathogen genomes can be identified and sequenced in a much shorter time and at a lower cost, improving the ability to diagnose, treat, and even predict and prevent pathogenic infections and disease. It has also allowed researchers to better understand genome evolution events (gene loss, gain, duplication, and rearrangement) and how those events affect pathogen resistance and the ability to cause disease. This influx of information has created a need for bioinformatics tools and databases that analyze the vast amounts of data and make them accessible to researchers, and it has raised ethical questions about the wisdom of reconstructing previously extinct and deadly pathogens in order to better understand virulence.

Genome evolution is the process by which a genome changes in structure (sequence) or size over time. The study of genome evolution involves multiple fields such as structural analysis of the genome, the study of genomic parasites, gene and ancient genome duplications, polyploidy, and comparative genomics. Genome evolution is a constantly changing and evolving field due to the steadily growing number of sequenced genomes, both prokaryotic and eukaryotic, available to the scientific community and the public at large.

The minimal genome is a concept that can be defined as the set of genes sufficient for life to exist and propagate under nutrient-rich and stress-free conditions. Alternatively, it can be defined as the gene set supporting life in an axenic cell culture in rich media, and what makes up the minimal genome is thought to depend on the environmental conditions that the organism inhabits. By one early investigation, the minimal genome of a bacterium should include a virtually complete set of proteins for replication and translation; a transcription apparatus including four subunits of RNA polymerase, including the sigma factor; rudimentary proteins sufficient for recombination and repair; several chaperone proteins; the capacity for anaerobic metabolism through glycolysis and substrate-level phosphorylation; transamination of glutamyl-tRNA to glutaminyl-tRNA; lipid biosynthesis; eight cofactor enzymes; protein export machinery; and a limited metabolite transport network including membrane ATPases. Proteins in the minimal bacterial genome tend to be substantially more closely related to proteins found in archaea and eukaryotes than the average bacterial gene is, indicating a substantial number of universally conserved proteins. Minimal genomes reconstructed on the basis of existing genes do not preclude simpler systems in more primitive cells, such as an RNA world genome, which would not need the DNA replication machinery that is otherwise part of the minimal genome of current cells.

Hamiltonella defensa is a species of bacteria. It is maternally or sexually transmitted and lives as an endosymbiont of whiteflies and aphids, meaning that it lives within its host. It protects its host against attack by parasitoid wasps, bypassing the host's own immune responses. However, H. defensa is only defensive if it is itself infected by a virus. H. defensa shows a relationship with Photorhabdus species, together with Regiella insecticola. Together with other endosymbionts, it provides aphids with protection against parasitoids. It is known to inhabit Bemisia tabaci.

Bacterial recombination is a type of genetic recombination in bacteria characterized by DNA transfer from one organism, called the donor, to another organism, the recipient. This process occurs in three main ways: transformation, transduction, and conjugation.

Genomic streamlining is a theory in evolutionary biology and microbial ecology that suggests that there is a reproductive benefit to prokaryotes having a smaller genome size with less non-coding DNA and fewer non-essential genes. Prokaryotic genome size varies widely, with the genome of the smallest free-living cell being roughly ten times smaller than that of the largest prokaryote. Two of the bacterial taxa with the smallest genomes are Prochlorococcus and Pelagibacter ubique, both highly abundant marine bacteria commonly found in oligotrophic regions. Similar reduced genomes have been found in uncultured marine bacteria, suggesting that genomic streamlining is a common feature of bacterioplankton. This theory is typically used with reference to free-living organisms in oligotrophic environments.
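One simple way to quantify "less non-coding DNA" is coding density, the fraction of a genome covered by annotated genes. The sketch below is only an illustration under stated assumptions: the function name `coding_density`, the half-open 0-based coordinate convention, and the toy coordinates are all hypothetical, and genes that wrap around the origin of a circular chromosome are not handled.

```python
def coding_density(genome_length: int,
                   genes: list[tuple[int, int]]) -> float:
    """Fraction of the genome covered by at least one gene.

    genes -- (start, end) intervals, 0-based and half-open, assumed to
             lie within the genome; overlapping genes are counted once.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    covered = 0
    current_end = 0  # rightmost position already counted
    for start, end in sorted(genes):
        start = max(start, current_end)  # skip bases already counted
        if end > start:
            covered += end - start
            current_end = end
    return covered / genome_length


if __name__ == "__main__":
    # Toy annotation (made-up coordinates): two overlapping genes and one
    # separate gene on a 40 bp "genome".
    print(coding_density(40, [(0, 10), (5, 15), (20, 30)]))
```

Under the streamlining hypothesis, one would expect this value to be higher in streamlined genomes than in larger ones, although the text above does not give specific figures.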

Chromids, formerly called secondary chromosomes, are a class of bacterial replicons. These replicons are called "chromids" because they have characteristic features of both chromosomes and plasmids. Early on, it was thought that all core genes were located on the main chromosome of a bacterium. However, in 1989 a replicon was discovered that contained core genes outside of the main chromosome. These core genes make the chromid indispensable to the organism. Chromids are large replicons, although not as large as the main chromosome, and they are almost always larger than plasmids. Chromids also share many genomic signatures of the chromosome, including its GC-content and codon usage bias. On the other hand, chromids do not share the replication systems of chromosomes; instead, they use the replication systems of plasmids. Chromids were present in 10% of the bacterial species sequenced by 2009.
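GC-content, one of the genomic signatures mentioned above, is straightforward to compute. The following is a minimal sketch, not a method taken from the chromid literature: the function name `gc_content` and the short toy sequences are hypothetical, and ambiguous bases such as N are simply ignored.

```python
def gc_content(sequence: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / counted


if __name__ == "__main__":
    # Made-up toy sequences standing in for three replicons of one genome.
    replicons = {
        "chromosome": "GGCCGATCGGCCGCAT",
        "candidate chromid": "GCCGGATCGCCGGTAC",
        "plasmid": "ATTATAGCATTAATAT",
    }
    chromosome_gc = gc_content(replicons["chromosome"])
    for name, seq in replicons.items():
        gc = gc_content(seq)
        print(f"{name}: GC = {gc:.2f}, "
              f"difference from chromosome = {abs(gc - chromosome_gc):.2f}")
```

A replicon whose GC-content sits close to that of the main chromosome is consistent with chromid status, but GC-content alone is not sufficient; the text above also points to core genes, size, and codon usage bias.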

References

  1. McCutcheon, J. P.; Von Dohlen, C. D. (2011). "An Interdependent Metabolic Patchwork in the Nested Symbiosis of Mealybugs". Current Biology. 21 (16): 1366–1372. doi:10.1016/j.cub.2011.06.051. PMC 3169327. PMID 21835622.
  2. Van Leuven, JT; Meister, RC; Simon, C; McCutcheon, JP (11 September 2014). "Sympatric speciation in a bacterial endosymbiont results in two genomes with the functionality of one". Cell. 158 (6): 1270–80. doi:10.1016/j.cell.2014.07.047. PMID 25175626.
  3. Han, K; Li, ZF; Peng, R; Zhu, LP; Zhou, T; Wang, LG; Li, SG; Zhang, XB; Hu, W; Wu, ZH; Qin, N; Li, YZ (2013). "Extraordinary expansion of a Sorangium cellulosum genome from an alkaline milieu". Scientific Reports. 3: 2101. Bibcode:2013NatSR...3E2101H. doi:10.1038/srep02101. PMC 3696898. PMID 23812535.
  4. Hou, Yubo; Lin, Senjie (2009). "Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes". PLOS ONE. 4 (9): e6978. Bibcode:2009PLoSO...4.6978H. doi:10.1371/journal.pone.0006978. PMC 2737104. PMID 19750009.
  5. Land, Miriam; Hauser, Loren; Jun, Se-Ran; Nookaew, Intawat; Leuze, Michael R.; Ahn, Tae-Hyuk; Karpinets, Tatiana; Lund, Ole; Kora, Guruprased; Wassenaar, Trudy; Poudel, Suresh; Ussery, David W. (2015). "Insights from 20 years of bacterial genome sequencing". Functional & Integrative Genomics. 15 (2): 141–161. doi:10.1007/s10142-015-0433-4. PMC 4361730. PMID 25722247. This article contains quotations from this source, which is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
  6. Gregory, T. R. (2005). "Synergy between sequence and size in Large-scale genomics". Nature Reviews Genetics. 6 (9): 699–708. doi:10.1038/nrg1674. PMID 16151375. S2CID 24237594.
  7. Cole, S. T.; Eiglmeier, K.; Parkhill, J.; James, K. D.; Thomson, N. R.; Wheeler, P. R.; Honoré, N.; Garnier, T.; Churcher, C.; Harris, D.; Mungall, K.; Basham, D.; Brown, D.; Chillingworth, T.; Connor, R.; Davies, R. M.; Devlin, K.; Duthoy, S.; Feltwell, T.; Fraser, A.; Hamlin, N.; Holroyd, S.; Hornsby, T.; Jagels, K.; Lacroix, C.; MacLean, J.; Moule, S.; Murphy, L.; Oliver, K.; Quail, M. A. (2001). "Massive gene decay in the leprosy bacillus". Nature. 409 (6823): 1007–1011. Bibcode:2001Natur.409.1007C. doi:10.1038/35059006. PMID 11234002. S2CID 4307207.
  8. Ochman, H. (2005). "Genomes on the shrink". Proceedings of the National Academy of Sciences. 102 (34): 11959–11960. Bibcode:2005PNAS..10211959O. doi:10.1073/pnas.0505863102. PMC 1189353. PMID 16105941.
  9. Gregory, T. Ryan (2005). The evolution of the genome. Burlington, MA: Elsevier Academic. ISBN 0123014638.
  10. Koonin, E. V. (2009). "Evolution of genome architecture". The International Journal of Biochemistry & Cell Biology. 41 (2): 298–306. doi:10.1016/j.biocel.2008.09.015. PMC 3272702. PMID 18929678.
  11. Kuo, C. -H.; Moran, N. A.; Ochman, H. (2009). "The consequences of genetic drift for bacterial genome complexity". Genome Research. 19 (8): 1450–1454. doi:10.1101/gr.091785.109. PMC 2720180. PMID 19502381.
  12. Ochman, H.; Davalos, L. M. (2006). "The Nature and Dynamics of Bacterial Genomes". Science. 311 (5768): 1730–1733. Bibcode:2006Sci...311.1730O. doi:10.1126/science.1119966. PMID 16556833. S2CID 26707775.
  13. Parks, DH; Chuvochina, M; Chaumeil, PA; Rinke, C; Mussig, AJ; Hugenholtz, P (September 2020). "A complete domain-to-species taxonomy for Bacteria and Archaea". Nature Biotechnology. 38 (9): 1079–1086. bioRxiv 10.1101/771964. doi:10.1038/s41587-020-0501-8. PMID 32341564. S2CID 216560589.
  14. Rodriguez-R, Luis M.; Jain, Chirag; Conrad, Roth E.; Aluru, Srinivas; Konstantinidis, Konstantinos T. (7 July 2021). "Reply to: "Re-evaluating the evidence for a universal genetic boundary among microbial species"". Nature Communications. 12 (1): 4060. doi:10.1038/s41467-021-24129-1. PMC 8263725. PMID 34234115.
  15. Moran, Nancy A. (2002). "Microbial Minimalism". Cell. 108 (5): 583–586. doi:10.1016/S0092-8674(02)00665-7. PMID 11893328.
  16. Mira, A.; Ochman, H.; Moran, N. A. (2001). "Deletional bias and the evolution of bacterial genomes". Trends in Genetics. 17 (10): 589–596. doi:10.1016/S0168-9525(01)02447-7. PMID 11585665.
  17. Nakabachi, A.; Yamashita, A.; Toh, H.; Ishikawa, H.; Dunbar, H. E.; Moran, N. A.; Hattori, M. (2006). "The 160-Kilobase Genome of the Bacterial Endosymbiont Carsonella". Science. 314 (5797): 267. doi:10.1126/science.1134196. PMID 17038615. S2CID 44570539.
  18. Ochman, H. (2005). "Genomes on the shrink". Proceedings of the National Academy of Sciences. 102 (34): 11959–11960. Bibcode:2005PNAS..10211959O. doi:10.1073/pnas.0505863102. PMC 1189353. PMID 16105941.
  19. Nilsson, A. I.; Koskiniemi, S.; Eriksson, S.; Kugelberg, E.; Hinton, J. C.; Andersson, D. I. (2005). "Bacterial genome size reduction by experimental evolution". Proceedings of the National Academy of Sciences. 102 (34): 12112–12116. Bibcode:2005PNAS..10212112N. doi:10.1073/pnas.0503654102. PMC 1189319. PMID 16099836.
  20. Ochman, Howard; Lawrence, Jeffrey G.; Groisman, Eduardo A. (2000). "Lateral gene transfer and the nature of bacterial innovation". Nature. 405 (6784): 299–304. Bibcode:2000Natur.405..299O. doi:10.1038/35012500. PMID 10830951. S2CID 85739173.
  21. ETH Zurich (1 April 2019). "First bacterial genome created entirely with a computer". EurekAlert!. Retrieved 2 April 2019.
  22. Venetz, Jonathan E.; et al. (1 April 2019). "Chemical synthesis rewriting of a bacterial genome to achieve design flexibility and biological functionality". Proceedings of the National Academy of Sciences of the United States of America. 116 (16): 8070–8079. Bibcode:2019PNAS..116.8070V. doi:10.1073/pnas.1818259116. PMC 6475421. PMID 30936302.