Apple genome

Red phenotype of apple associated with an LTR retrotransposon

In 2010, an Italian-led consortium, working with horticultural genomicists at Washington State University, announced that it had sequenced the first complete apple genome, using the cultivar 'Golden Delicious'. The apple genome has approximately 57,000 genes, the highest number of any plant genome studied at the time and more than the human genome, which has about 25,000. The modern apple has 17 chromosomes, which were found to derive from an ancestor with 9 chromosomes that underwent a genome-wide duplication. The genome sequence also provided proof that Malus sieversii was the wild ancestor of the domestic apple, an issue that had long been debated in the scientific community. In 2016, a new and much higher-quality whole genome sequence (WGS) for a doubled-haploid derivative of 'Golden Delicious' was published. [1] This improved understanding of the apple genome will help scientists identify genes and gene variants that contribute to disease resistance, drought tolerance, and other desirable characteristics, and in turn will support better-informed selective breeding.

Since the publication of the Golden Delicious WGS, many discoveries have been made about apples, including the finding that 60% of the apple genome is made up of transposable elements [2] and the identification of what makes apples red. Genetic evidence has confirmed that MdMYB1, which regulates transcription of the anthocyanin biosynthesis pathway, is responsible for the red color in apples.

Apple color is important when it comes to consumer preference, and red apples are generally preferred. [3] An additional genome assembly of the Hanfu apple (HFTH1) was compared to the Golden Delicious (GDDH13) genome and showed extensive genomic variation largely due to transposable elements. [4]

The transcript levels of MdMYB1 and of anthocyanin-related structural genes differ significantly between the skins of Hanfu and Golden Delicious apples. MdMYB1 has at least three alleles (MdMYB1-1, MdMYB1-2, and MdMYB1-3). MdMYB1-1 is a single dominant allele controlling anthocyanin synthesis in apple skin. In non-red apples, the MdMYB1-2 and MdMYB1-3 alleles show only limited expression under intense light and low temperature. The coding-region differences among these alleles do not affect function, and the reason for their different expression levels is not yet known. In Golden Delicious and Hanfu apples, the coding sequences of MdMYB1 were identical, but one single-nucleotide polymorphism (SNP) was found in the intron regions. Upstream of MdMYB1, 15 SNPs and five indels were identified, and the indels differed markedly between the two cultivars. One of these indels is an LTR retrotransposon called redTE, located upstream of MdMYB1 in the Hanfu genome. RedTE has identical flanking LTRs, which suggests that it is a relatively recent insertion: the two LTRs of an element are identical when it inserts and accumulate differences only afterwards. Many red and non-red apples were tested, and redTE was found in all of the red apples and none of the non-red apples, suggesting that redTE may be responsible for the red color of apples.
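The reasoning that identical flanking LTRs point to a recent insertion can be illustrated with the commonly used estimate T = K / (2r), where K is the divergence between the two LTRs of one element and r is a substitution rate. The following is a minimal sketch only. The function names, the toy sequences, and the rate `ASSUMED_RATE` are hypothetical and do not come from the apple studies cited here. The divergence is a raw mismatch fraction with no correction for multiple substitutions, and the sequences are assumed to be already aligned.

```python
def ltr_divergence(ltr_5prime: str, ltr_3prime: str) -> float:
    """Fraction of differing sites between two pre-aligned LTR sequences."""
    if len(ltr_5prime) != len(ltr_3prime):
        raise ValueError("LTRs must be aligned to the same length")
    # Compare only columns where neither sequence has an alignment gap.
    sites = [
        (a, b)
        for a, b in zip(ltr_5prime.upper(), ltr_3prime.upper())
        if a != "-" and b != "-"
    ]
    if not sites:
        raise ValueError("no comparable sites")
    mismatches = sum(a != b for a, b in sites)
    return mismatches / len(sites)


def insertion_age_years(divergence: float, rate: float) -> float:
    """Estimate insertion age as T = K / (2r).

    The factor of 2 reflects that both LTRs accumulate substitutions
    independently after the element inserts.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2 * rate)


# Hypothetical rate (substitutions per site per year); not from the article.
ASSUMED_RATE = 1e-8

# Toy sequences for illustration; these are not real redTE data.
identical = ltr_divergence("ACGTACGTAC", "ACGTACGTAC")
diverged = ltr_divergence("ACGTACGTAC", "ACGTACGTAA")

print(insertion_age_years(identical, ASSUMED_RATE))  # 0.0: no divergence yet
print(insertion_age_years(diverged, ASSUMED_RATE))
```

With identical LTRs the divergence is zero, so the estimated age is zero. In practice this means only that the insertion is too recent for any substitution to have accumulated at the assumed rate.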

Related Research Articles

Genome: All genetic material of an organism

In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA. The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences, and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome. Algae and plants also contain chloroplasts with a chloroplast genome.

The genotype of an organism is its complete set of genetic material. Genotype can also be used to refer to the alleles or variants an individual carries in a particular gene or genetic location. The number of alleles an individual can have in a specific gene depends on the number of copies of each chromosome found in that species, also referred to as ploidy. In diploid species like humans, two full sets of chromosomes are present, meaning each individual has two alleles for any given gene. If both alleles are the same, the genotype is referred to as homozygous. If the alleles are different, the genotype is referred to as heterozygous.
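The homozygous/heterozygous distinction described above can be sketched in a few lines. The function name `classify_genotype` and the example genotypes are hypothetical. The allele labels are borrowed from the MdMYB1 alleles only for illustration and do not describe any real cultivar. For simplicity, any genotype with more than one distinct allele is labeled heterozygous, regardless of ploidy.

```python
def classify_genotype(alleles):
    """Label a genotype at one locus as homozygous or heterozygous."""
    if len(alleles) < 2:
        raise ValueError("need at least two alleles (one per chromosome copy)")
    # All copies carry the same allele -> homozygous; otherwise heterozygous.
    return "homozygous" if len(set(alleles)) == 1 else "heterozygous"


# Hypothetical diploid genotypes, for illustration only.
print(classify_genotype(("MdMYB1-1", "MdMYB1-1")))  # homozygous
print(classify_genotype(("MdMYB1-1", "MdMYB1-2")))  # heterozygous
```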

Transposable element: Semiparasitic DNA sequence

A transposable element is a nucleic acid sequence in DNA that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size. Transposition often results in duplication of the same genetic material. In the human genome, L1 and Alu elements are two examples. Barbara McClintock's discovery of transposable elements earned her a Nobel Prize in 1983. Their importance in personalized medicine is becoming increasingly relevant, and they are gaining more attention in data analytics given the difficulty of analysis in very high-dimensional spaces.

Human genome: Complete set of nucleic acid sequences for humans

The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.

Selfish genetic elements are genetic segments that can enhance their own transmission at the expense of other genes in the genome, even if this has no positive or a net negative effect on organismal fitness. Genomes have traditionally been viewed as cohesive units, with genes acting together to improve the fitness of the organism. However, when genes have some control over their own transmission, the rules can change, and so just like all social groups, genomes are vulnerable to selfish behaviour by their parts.

Repeated sequences are short or long patterns of nucleic acids that occur in multiple copies throughout the genome. In many organisms, a significant fraction of the genomic DNA is repetitive, with over two-thirds of the sequence consisting of repetitive elements in humans. Some of these repeated sequences are necessary for maintaining important genome structures such as telomeres or centromeres.

Retrotransposon: Type of genetic component

Retrotransposons are mobile elements that move within the host genome by converting their transcribed RNA into DNA through reverse transcription. They thus differ from Class II transposable elements, or DNA transposons, in using an RNA intermediate for transposition and leaving the donor site unchanged.

Genetic Information Research Institute

The Genetic Information Research Institute (GIRI) is a non-profit institution that was founded in 1994 by Jerzy Jurka. The mission of the institute "is to understand biological processes which alter the genetic makeup of different organisms, as a basis for potential gene therapy and genome engineering techniques." The institute specializes in applying computer tools to analysis of DNA and protein sequence information. GIRI develops and maintains Repbase Update, a database of prototypic sequences representing repetitive DNA from different eukaryotic species, and Repbase Reports, an electronic journal established in 2001. Repetitive DNA is primarily derived from transposable elements (TEs), which include DNA transposons belonging to around 20 superfamilies and retrotransposons that can also be sub-classified into subfamilies. The majority of known superfamilies of DNA transposons were discovered or co-discovered at GIRI, including Helitron, Academ, Dada, Ginger, Kolobok, Novosib, Sola, Transib, Zator, PIF/Harbinger and Polinton/Maverick. An ancient element from the Transib superfamily was identified as the evolutionary precursor of the Recombination activating gene. GIRI has hosted three international conferences devoted to the genomic impact of eukaryotic transposable elements.

Sex chromosome: Chromosome that differs from an ordinary autosome in form, size, and behavior

Sex chromosomes are chromosomes that carry the genes that determine the sex of an individual. The human sex chromosomes are a typical pair of mammal allosomes. They differ from autosomes in form, size, and behavior. Whereas autosomes occur in homologous pairs whose members have the same form in a diploid cell, members of an allosome pair may differ from one another.

Jeffrey Lynn Bennetzen is an American geneticist on the faculty of the University of Georgia (UGA). Bennetzen is known for his work describing codon usage bias in yeast and E. coli; being the first to clone and sequence an active transposon in plants; discovering that most of the DNA in plant genomes is a particular class of mobile DNA (LTR retrotransposons); solving the C-value paradox; proposing sorghum and Setaria as model grasses; showing that rice centromeres are hotspots for recombination, but not crossovers; and developing a technique to date polyploidization events. He is an author, with Sarah Hake, of the book "Handbook of Maize." Bennetzen was elected to the US National Academy of Sciences in 2004.

Long terminal repeat: DNA sequence

A long terminal repeat (LTR) is a pair of identical sequences of DNA, several hundred base pairs long, which occur in eukaryotic genomes on either end of a series of genes or pseudogenes that form a retrotransposon or an endogenous retrovirus or a retroviral provirus. All retroviral genomes are flanked by LTRs, while there are some retrotransposons without LTRs. Typically, an element flanked by a pair of LTRs will encode a reverse transcriptase and an integrase, allowing the element to be copied and inserted at a different location of the genome. Copies of such an LTR-flanked element can often be found hundreds or thousands of times in a genome. LTR retrotransposons comprise about 8% of the human genome.

International Grape Genome Program

The International Grape Genomics Program (IGGP) is a collaborative genome project dedicated to determining the genome sequence of the grapevine Vitis vinifera. It is a multinational project involving research centers in Australia, Canada, Chile, France, Germany, Italy, South Africa, Spain, and the United States.

Micropia is the name of a family of LTR retrotransposons widespread in the genomes of fruit flies of the genus Drosophila. Micropia retrotransposons in some species of Drosophila express a male germline-specific and meiosis-specific antisense transcript complementary to the reverse transcriptase (RT) and ribonuclease A (RNaseA) genes of the proviral retrotransposon. No active transposition of micropia has been recorded so far. Micropia is likely part of a selfish driver system responsible for the evolution of the Drosophila Y-chromosomal lampbrush loops in some species.

LTR retrotransposon: Class I transposable element

LTR retrotransposons are class I transposable elements (TEs) characterized by the presence of long terminal repeats (LTRs) directly flanking an internal coding region. As retrotransposons, they mobilize through reverse transcription of their mRNA and integration of the newly created cDNA into another genomic location. Their mechanism of retrotransposition is shared with retroviruses; the difference is that LTR retrotransposons spread by horizontal transfer far less often than by vertical transfer, in which active TE insertions are passed to the progeny. LTR retrotransposons that form virus-like particles are classified under Ortervirales.

Ectopic recombination is an atypical form of recombination in which crossing over takes place between two homologous DNA sequences located at non-allelic chromosomal positions. Such recombination often results in dramatic chromosomal rearrangement, which is generally harmful to the organism. Some research, however, has suggested that ectopic recombination can result in mutated chromosomes that benefit the organism. Ectopic recombination can occur during both meiosis and mitosis, although it is more likely to occur during meiosis. It occurs relatively frequently: in at least one yeast species, the frequency of ectopic recombination is roughly on par with that of allelic recombination. If the alleles at two loci are heterozygous, then ectopic recombination is relatively likely to occur, whereas if the alleles are homozygous, they will almost certainly undergo allelic recombination. Ectopic recombination does not require the loci involved to be close to one another; it can occur between loci that are widely separated on a single chromosome, and has even been known to occur across chromosomes. Nor does it require high levels of homology between sequences: the lower limit required for it to occur has been estimated at as low as 2.2 kb of homologous stretches of DNA nucleotides.

A conserved non-coding sequence (CNS) is a DNA sequence of noncoding DNA that is evolutionarily conserved. These sequences are of interest for their potential to regulate gene production.

Long interspersed nuclear element

Long interspersed nuclear elements (LINEs) are a group of non-LTR retrotransposons that are widespread in the genome of many eukaryotes. LINEs contain an internal Pol II promoter to initiate transcription into mRNA, and encode one or two proteins, ORF1 and ORF2. The functional domains present within ORF1 vary greatly among LINEs, but often exhibit RNA/DNA binding activity. ORF2 is essential to successful retrotransposition, and encodes a protein with both reverse transcriptase and endonuclease activity.

Transposable elements are short strands of repetitive DNA that can self-replicate and translocate within the eukaryotic genome, and are generally perceived as parasitic in nature. Their transcription can lead to the production of dsRNAs, which resemble retroviral transcripts. While most host cellular RNA has a single, unpaired sense strand, dsRNA consists of sense and antisense transcripts paired together, and this difference in structure allows a host organism to detect dsRNA production, and thereby the presence of transposons. Plants lack a distinct division between somatic cells and reproductive cells and generally have larger genomes than animals, making them an intriguing kingdom in which to study the epigenetic function of transposable elements.

Metavirus is a genus of viruses in the family Metaviridae. They are retrotransposons that invade a eukaryotic host genome and may only replicate once the virus has infected the host. These genetic elements exist to infect and replicate in their host genome and are derived from ancestral elements unrelated to their host. Metavirus may use several different hosts for transmission, and has been found to be transmissible through the ovules and pollen of some plants.

A plant genome assembly represents the complete genomic sequence of a plant species, assembled into chromosomes and organelle genomes from DNA fragments obtained with different types of sequencing technology.

References

  1. "The Apple Genome and Epigenome". Retrieved 14 April 2020.
  2. Daccord, N.; Celton, J.; Linsmith, G. (2017). "High-quality de novo assembly of the apple genome and methylome dynamics of early fruit development". Nature Genetics. 49 (7): 1099–1106. doi:10.1038/ng.3886. Retrieved 14 April 2020.
  3. "Red Color Development in Apple Fruit". Retrieved 14 April 2020.
  4. Zhang, L.; Hu, J.; Han, X. (2019). "A high-quality apple genome assembly reveals the association of a retrotransposon and red fruit colour". Nature Communications. 10 (1): 1494. Bibcode:2019NatCo..10.1494Z. doi:10.1038/s41467-019-09518-x. PMC 6445120. PMID 30940818. S2CID 91190274.