Onion Test

The onion test is a way of assessing the validity of an argument that junk DNA has a functional role. It relates to the paradox that would emerge if the majority of eukaryotic non-coding DNA were assumed to be functional, and to the difficulty of reconciling that assumption with the diversity in genome sizes among species. [1] The term "onion test" was originally proposed informally [2] in a blog post by T. Ryan Gregory to help clarify the debate about junk DNA. The term has since been mentioned in newspapers and online media, [3] [4] scientific journal articles, [5] [6] [7] [8] and a textbook. [9] The test is defined as:

The onion test is a simple reality check for anyone who thinks they have come up with a universal function for junk DNA. Whatever your proposed function, ask yourself this question: Can I explain why an onion needs about five times more non-coding DNA for this function than a human?

Onions and their relatives vary dramatically in genome size [10] without changing their ploidy, which gives an exceptionally valuable window on the genomic expansion of junk DNA. Since the onion (Allium cepa) is a diploid organism with a haploid genome size of 15.9 Gb, [10] it has about 5.0x as much DNA as a human genome (3.2 Gb). Other species in the genus Allium also vary hugely in DNA content at the same ploidy. Allium schoenoprasum (chives), for example, has a haploid genome size of 7.5 Gb, less than half that of the onion, whereas Allium ursinum (wild garlic) has a haploid genome size of 30.9 Gb, nearly twice (1.94x) that of the onion and over four times (4.1x) that of chives. This extreme size variation between closely related species in the genus Allium is also part of the extended onion test rationale as originally defined: [2]

Further, if you think perhaps onions are somehow special, consider that members of the genus Allium range in genome size from 7 pg to 31.5 pg. So why can A. altyncolicum make do with one fifth as much regulation, structural maintenance, protection against mutagens, or [insert preferred universal function] as A. ursinum?
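The size ratios quoted above follow from simple division. The minimal Python sketch below reproduces that arithmetic from the haploid genome sizes given in the text; the dictionary keys and the function name `fold_difference` are illustrative choices, not part of any cited source.

```python
# Haploid genome sizes in gigabases (Gb), as quoted in the text above.
# The short keys are labels chosen for this sketch only.
GENOME_SIZE_GB = {
    "human": 3.2,
    "onion": 15.9,        # Allium cepa
    "chives": 7.5,        # Allium schoenoprasum
    "wild_garlic": 30.9,  # Allium ursinum
}


def fold_difference(species_a: str, species_b: str) -> float:
    """Return the genome size of species_a divided by that of species_b."""
    return GENOME_SIZE_GB[species_a] / GENOME_SIZE_GB[species_b]


if __name__ == "__main__":
    comparisons = [
        ("onion", "human"),
        ("chives", "onion"),
        ("wild_garlic", "onion"),
        ("wild_garlic", "chives"),
    ]
    for a, b in comparisons:
        print(f"{a} / {b}: {fold_difference(a, b):.2f}x")
    # onion / human: 4.97x
    # chives / onion: 0.47x
    # wild_garlic / onion: 1.94x
    # wild_garlic / chives: 4.12x
```

Any proposed universal function for non-coding DNA would have to account for each of these ratios, including those between species within the same genus.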

C-value paradox

Some researchers argue that the onion test is related to wider issues involving the C-value paradox and is only valid if one can justify the presumption that genome size has no bearing on organismal physiology. [11] According to Larry Moran, the onion test is not an argument for junk DNA but an approach to assessing possible functional explanations for non-coding DNA. In his account, it asks why Allium species need so much more of a proposed function than humans do, and why so much more (or less) than other closely related species of Allium; it does not address the variation in genome size (C-value) among organisms itself. [12]

Responses

According to Jonathan McLatchie, the onion test is only valid if one can justify the presumption that genome size has no bearing on organismal physiology. He argues that long sequences of repetitive DNA can be highly relevant to an organism and can contribute to transcription delays and developmental timing mechanisms. Furthermore, he argues that there is a positive correlation between genome size and cell volume in plants and in unicellular eukaryotes such as protozoa, so that a larger amount of DNA provides a selective advantage by contributing to the skeleton and volume of the nucleus of these cells. [11] Larry Moran, whom McLatchie's post had addressed, replied at length: [13]

[the onion test is] designed as a thought experiment to test a hypothesis about the possible function of large amounts of noncoding DNA. If you think you have an explanation for why most of the human genome has a function then you should explain how that accounts for the genomes of onions. Ryan Gregory knew that most so-called explanations look very silly when you try using them to account for genome size in onion species.

Ewan Birney (then head of the ENCODE Project) explained the difference as a product of polyploidy, [14] [15] and therefore not relevant to the discussion of humans:

(re: onions etc); polyploidy and letting your repeats "go crazy" (bad piRNAs anyone) mean your genome can be v. big

Similar claims were made by John Mattick in an article defending the ENCODE project against arguments disputing the main finding of the project: [16]

The other substantive argument that bears on the issue, alluded to in the quotes that preface the Graur et al. article, and more explicitly discussed by Doolittle, is the so-called ‘C-value enigma’, which refers to the fact that some organisms (like some amoebae, onions, some arthropods, and amphibians) have much more DNA per cell than humans, but cannot possibly be more developmentally or cognitively complex, implying that eukaryotic genomes can and do carry varying amounts of unnecessary baggage. That may be so, but the extent of such baggage in humans is unknown. However, where data is available, these upward exceptions appear to be due to polyploidy and/or varying transposon loads (of uncertain biological relevance), rather than an absolute increase in genetic complexity. Moreover, there is a broadly consistent rise in the amount of non-protein-coding intergenic and intronic DNA with developmental complexity, a relationship that proves nothing but which suggests an association that can only be falsified by downward exceptions, of which there are none known.

Freeling et al. proposed a genome balance hypothesis that, they argue, accounts for the C-value paradox and passes the onion test. [17]


Related Research Articles

Genome: All genetic material of an organism

In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA. The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences, and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome. Algae and plants also contain chloroplasts with a chloroplast genome.

Ploidy: Number of sets of chromosomes in a cell

Ploidy is the number of complete sets of chromosomes in a cell, and hence the number of possible alleles for autosomal and pseudoautosomal genes. Here, "sets of chromosomes" refers to the number of maternal and paternal chromosome copies, respectively, in each homologous chromosome pair, the form in which chromosomes naturally exist. Somatic cells, tissues, and individual organisms can be described according to the number of sets of chromosomes present: monoploid, diploid, triploid, tetraploid, pentaploid, hexaploid, heptaploid or septaploid, etc. The generic term polyploid is often used to describe cells with three or more sets of chromosomes.

Human genome: Complete set of nucleic acid sequences for humans

The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.

Non-coding DNA (ncDNA) sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules. Other functional regions of the non-coding DNA fraction include regulatory sequences that control gene expression; scaffold attachment regions; origins of DNA replication; centromeres; and telomeres. Some non-coding regions appear to be mostly nonfunctional, such as introns, pseudogenes, intergenic DNA, and fragments of transposons and viruses.

Junk DNA is a DNA sequence that has no relevant biological function. Most organisms have some junk DNA in their genomes (mostly pseudogenes and fragments of transposons and viruses), but it is possible that some organisms have substantial amounts of junk DNA.

Polyploidy: Condition where cells of an organism have more than two paired sets of chromosomes

Polyploidy is a condition in which the cells of an organism have more than two paired sets of (homologous) chromosomes. Most species whose cells have nuclei (eukaryotes) are diploid, meaning they have two complete sets of chromosomes, one from each of two parents; each set contains the same number of chromosomes, and the chromosomes are joined in pairs of homologous chromosomes. However, some organisms are polyploid. Polyploidy is especially common in plants. Most eukaryotes have diploid somatic cells, but produce haploid gametes by meiosis. A monoploid has only one set of chromosomes, and the term is usually only applied to cells or organisms that are normally diploid. Males of bees and other Hymenoptera, for example, are monoploid. Unlike animals, plants and multicellular algae have life cycles with two alternating multicellular generations. The gametophyte generation is haploid, and produces gametes by mitosis; the sporophyte generation is diploid and produces spores by meiosis.

Molecular evolution is the process of change in the sequence composition of cellular molecules such as DNA, RNA, and proteins across generations. The field of molecular evolution uses principles of evolutionary biology and population genetics to explain patterns in these changes. Major topics in molecular evolution concern the rates and impacts of single nucleotide changes, neutral evolution vs. natural selection, origins of new genes, the genetic nature of complex traits, the genetic basis of speciation, the evolution of development, and ways that evolutionary forces influence genomic and phenotypic changes.

Gene duplication is a major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene. Gene duplications can arise as products of several types of errors in DNA replication and repair machinery as well as through fortuitous capture by selfish genetic elements. Common sources of gene duplications include ectopic recombination, retrotransposition, aneuploidy, polyploidy, and replication slippage.

Evolution of sexual reproduction: How sexually reproducing multicellular organisms could have evolved from a common ancestor species

Sexual reproduction is an adaptive feature common to almost all multicellular organisms and to various unicellular organisms. The adaptive advantage of sexual reproduction is currently widely regarded as a major unsolved problem in biology. One prominent theory is that sex evolved as an efficient mechanism for producing variation, which had the advantage of enabling organisms to adapt to changing environments. Another prominent theory is that a primary advantage of outcrossing sex is the masking of the expression of deleterious mutations. Sex does, however, come with a cost. In asexual reproduction, neither time nor energy needs to be expended in choosing a mate and, if the environment has not changed, there may be little reason for variation, as the organism may already be well adapted. However, very few environments have remained unchanged over the millions of years that reproduction has existed, so it is easy to imagine that being able to adapt to a changing environment imparts a benefit. Sex also halves the number of offspring a given population is able to produce. Nevertheless, sex has evolved as the most prolific means of species branching in the tree of life: diversification into the phylogenetic tree happens much more rapidly via sexual reproduction than via asexual reproduction.

Functional genomics: Field of molecular biology

Functional genomics is a field of molecular biology that attempts to describe gene functions and interactions. Functional genomics makes use of the vast data generated by genomic and transcriptomic projects. It focuses on dynamic aspects such as gene transcription, translation, regulation of gene expression and protein–protein interactions, as opposed to static aspects of the genomic information such as DNA sequence or structures. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional "candidate-gene" approach.

ENCODE: Research consortium investigating functional elements in human and model organism DNA

The Encyclopedia of DNA Elements (ENCODE) is a public research project which aims "to build a comprehensive parts list of functional elements in the human genome."

C-value is the amount, in picograms, of DNA contained within a haploid nucleus or one half the amount in a diploid somatic cell of a eukaryotic organism. In some cases, the terms C-value and genome size are used interchangeably; however, in polyploids the C-value may represent two or more genomes contained within the same nucleus. Greilhuber et al. have suggested some new layers of terminology and associated abbreviations to clarify this issue, but these somewhat complex additions are yet to be used by other authors.

Genome size: Amount of DNA contained in a genome

Genome size is the total amount of DNA contained within one copy of a single complete genome. It is typically measured in terms of mass in picograms or less frequently in daltons, or as the total number of nucleotide base pairs, usually in megabases. One picogram is equal to 978 megabases. In diploid organisms, genome size is often used interchangeably with the term C-value.
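As a rough illustration of the conversion stated above, the sketch below applies the factor of 978 megabases per picogram to the 7 pg and 31.5 pg Allium values quoted in the onion test. The constant and function names are hypothetical, chosen only for this example.

```python
# Conversion factor stated in the text: 1 picogram of DNA = 978 megabases.
MEGABASES_PER_PICOGRAM = 978.0


def picograms_to_megabases(picograms: float) -> float:
    """Convert a DNA mass in picograms to a length in megabases (Mb)."""
    return picograms * MEGABASES_PER_PICOGRAM


def megabases_to_picograms(megabases: float) -> float:
    """Convert a length in megabases (Mb) back to a mass in picograms."""
    return megabases / MEGABASES_PER_PICOGRAM


def picograms_to_gigabases(picograms: float) -> float:
    """Convert a DNA mass in picograms to a length in gigabases (Gb)."""
    return picograms_to_megabases(picograms) / 1000.0


if __name__ == "__main__":
    # Smallest and largest Allium genome sizes quoted in the onion test.
    for pg in (7.0, 31.5):
        mb = picograms_to_megabases(pg)
        gb = picograms_to_gigabases(pg)
        print(f"{pg} pg = {mb:.0f} Mb = {gb:.2f} Gb")
    # 7.0 pg = 6846 Mb = 6.85 Gb
    # 31.5 pg = 30807 Mb = 30.81 Gb
```

On this conversion, the 7 pg to 31.5 pg range corresponds to roughly 6.8 Gb to 30.8 Gb, in line with the gigabase figures given for Allium species in the article.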

Paleopolyploidy: State of having undergone whole genome duplication in deep evolutionary time

Paleopolyploidy is the result of genome duplications which occurred at least several million years ago (MYA). Such an event could either double the genome of a single species (autopolyploidy) or combine those of two species (allopolyploidy). Because of functional redundancy, genes are rapidly silenced or lost from the duplicated genomes. Most paleopolyploids, through evolutionary time, have lost their polyploid status through a process called diploidization, and are currently considered diploids, e.g., baker's yeast, Arabidopsis thaliana, and perhaps humans.

Gene: Sequence of DNA or RNA that codes for an RNA or protein product

In biology, the word gene can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and non-coding genes.

Neutral mutations are changes in DNA sequence that are neither beneficial nor detrimental to the ability of an organism to survive and reproduce. In population genetics, mutations in which natural selection does not affect the spread of the mutation in a species are termed neutral mutations. Neutral mutations that are inheritable and not linked to any genes under selection will be lost or will replace all other alleles of the gene. That loss or fixation of the gene proceeds based on random sampling known as genetic drift. A neutral mutation that is in linkage disequilibrium with other alleles that are under selection may proceed to loss or fixation via genetic hitchhiking and/or background selection.

Genome evolution: Process by which a genome changes in structure or size over time

Genome evolution is the process by which a genome changes in structure (sequence) or size over time. The study of genome evolution involves multiple fields such as structural analysis of the genome, the study of genomic parasites, gene and ancient genome duplications, polyploidy, and comparative genomics. Genome evolution is a constantly changing and evolving field due to the steadily growing number of sequenced genomes, both prokaryotic and eukaryotic, available to the scientific community and the public at large.

Allium: Genus of flowering plants in the family Amaryllidaceae

Allium is a genus of monocotyledonous flowering plants with hundreds of species, including the cultivated onion, garlic, scallion, shallot, leek, and chives. The generic name Allium is the Latin word for garlic, and the type species for the genus is Allium sativum which means "cultivated garlic".

The G-value paradox arises from the lack of correlation between the number of protein-coding genes among eukaryotes and their relative biological complexity. The microscopic nematode Caenorhabditis elegans, for example, is composed of only a thousand cells but has about the same number of genes as a human. Researchers suggest resolution of the paradox may lie in mechanisms such as alternative splicing and complex gene regulation that make the genes of humans and other complex eukaryotes relatively more productive.

Reductive evolution is the process by which microorganisms remove genes from their genome. It can occur when bacteria found in a free-living state enter a restrictive state or are completely absorbed by another organism, becoming intracellular (symbiogenesis). The bacteria adapt to survive and thrive in the restrictive state by altering and reducing their genomes, getting rid of pathways made redundant by the host. In an endosymbiotic or symbiogenetic relationship where both the guest and host benefit, the host can also undergo reductive evolution to eliminate pathways that are more efficiently provided by the guest.

References

  1. Palazzo, Alexander F.; Gregory, T. Ryan (8 May 2014). Akey, Joshua M. (ed.). "The Case for Junk DNA". PLOS Genetics. 10 (5): e1004351. doi:10.1371/journal.pgen.1004351. ISSN 1553-7404. PMC 4014423. PMID 24809441. In summary, the notion that the majority of eukaryotic noncoding DNA is functional is very difficult to reconcile with the massive diversity in genome size observed among species, including among some closely related taxa. The onion test is merely a restatement of this issue, which has been well known to genome biologists for many decades.
  2. "The onion test. « Genomicron". www.genomicron.evolverzone.com. Retrieved 13 February 2019.
  3. Moran, Laurence A. (12 October 2011). "Sandwalk: A Twofer". Sandwalk. Retrieved 13 February 2019.
  4. Zimmer, Carl (8 March 2015). "Is most of our DNA garbage?". The New York Times Magazine.
  5. Palazzo, Alexander F.; Gregory, T. Ryan (8 May 2014). Akey, Joshua M. (ed.). "The Case for Junk DNA". PLOS Genetics. 10 (5): e1004351. doi:10.1371/journal.pgen.1004351. ISSN 1553-7404. PMC 4014423. PMID 24809441.
  6. Freeling, Michael; Xu, Jie; Woodhouse, Margaret; Lisch, Damon (2015). "A Solution to the C-Value Paradox and the Function of Junk DNA: The Genome Balance Hypothesis". Molecular Plant. 8 (6): 899–910. doi:10.1016/j.molp.2015.02.009. PMID 25743198.
  7. Germain, Pierre-Luc; Ratti, Emanuele; Boem, Federico (2014). "Junk or functional DNA? ENCODE and the function controversy". Biology & Philosophy. 29 (6): 807–831. doi:10.1007/s10539-014-9441-3. ISSN 0169-3867. S2CID 84480632.
  8. Graur, D.; Zheng, Y.; Price, N.; Azevedo, R. B. R.; Zufall, R. A.; Elhaik, E. (26 March 2013). "On the Immortality of Television Sets: "Function" in the Human Genome According to the Evolution-Free Gospel of ENCODE". Genome Biology and Evolution. 5 (3): 578–590. doi:10.1093/gbe/evt028. ISSN 1759-6653. PMC 3622293. PMID 23431001.
  9. Graur, Dan (2016). Molecular and genome evolution. Sunderland, Massachusetts: Oxford University Press. ISBN 9781605354699. OCLC 951474209.
  10. Ricroch, A; Yockteng, R; Brown, S C; Nadot, S (2005). "Evolution of genome size across some cultivated Allium species". Genome. 48 (3): 511–520. doi:10.1139/g05-017. ISSN 0831-2796. PMID 16121247.
  11. "Why the "Onion Test" Fails as an Argument for "Junk DNA"". Evolution News. 2 November 2011. Retrieved 12 February 2019.
  12. Moran, Laurence A. (12 October 2011). "Sandwalk: A Twofer". Sandwalk.
  13. Moran, Larry (2 November 2011). "Sandwalk: Jonathan M Flunks the Onion Test, Again". Sandwalk. Retrieved 22 July 2020.
  14. @ewanbirney (5 September 2012). "Register" (Tweet). Retrieved 20 July 2020 via Twitter.
  15. @leonidkruglyak (5 September 2012). "Register" (Tweet). Retrieved 20 July 2020 via Twitter.
  16. Mattick, John S.; Dinger, Marcel E. (15 July 2013). "The extent of functionality in the human genome". The HUGO Journal. 7 (1): 2. doi:10.1186/1877-6566-7-2. ISSN 1877-6566. PMC 4685169.
  17. Freeling, Michael; Xu, Jie; Woodhouse, Margaret; Lisch, Damon (1 June 2015). "A Solution to the C-Value Paradox and the Function of Junk DNA: The Genome Balance Hypothesis". Molecular Plant. 8 (6): 899–910. doi:10.1016/j.molp.2015.02.009. PMID 25743198.