This list of sequenced plant genomes contains plant species known to have publicly available complete genome sequences that have been assembled, annotated and published. Unassembled genomes are not included, nor are organelle-only sequences. For all kingdoms, see the list of sequenced genomes.
See also List of sequenced algae genomes.
Organism strain | Division | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Anthoceros angustus | Bryophytes | Early diverging land plant | |||||
Ceratodon purpureus | Bryophytes | Early diverging land plant | |||||
Fontinalis antipyretica (greater water-moss) | Bryophytes | Aquatic moss | 385.2 Mbp | 16,538 | BGI | 2020 [1] | BGISEQ-500 & 10X, scaffold N50 45.8 Kbp |
Marchantia polymorpha | Bryophytes | Early diverging land plant | 225.8 Mb | 19,138 | 2017 [2] | ||
Physcomitrella patens ssp. patens str. Gransden 2004 | Bryophytes | Early diverging land plant | 462.3 Mbp | 35,938 | 2008 [3] | ||
Pleurozium schreberi (feather moss) | Bryophytes | Ubiquitous moss species | 318 Mbp | 15,992 | 2019 [4] |
Organism strain | Division | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Isoetes sinensis | Lycopodiophyta | First aquatic quillwort | 2.131 Gb | 57,303 | 2023 [5] | Scaffold N50 = 86 Mb | |
Selaginella moellendorffii | Lycopodiophyta | Model organism | 106 Mb | 22,285 | 2011 [6] [7] | scaffold N50 = 1.7 Mb | |
Selaginella lepidophylla | Lycopodiophyta | Desiccation tolerance | 122 Mb | 27,204 | 2018 [8] | contig N50 = 163 kb |
Organism strain | Division | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Azolla filiculoides | Polypodiophyta | Fern | 0.75 Gb | 20,201 | 2018 [9] | ||
Salvinia cucullata | Polypodiophyta | Fern | 0.26 Gb | 19,914 | 2018 [9] | ||
Ceratopteris richardii | Polypodiophyta | Model organism | 7.5 Gb | 36,857 | 2019 (v1.1), [10] 2021 (v2.1) [11] | Partial assembly consisting of 7.5 Gb/11.2 Gb, arranged in 39 chromosomes | |
Alsophila spinulosa | Polypodiophyta | Tree Fern | 6.23 Gb | 67,831 | 2022 [12] |
Organism strain | Division | Relevance | Genome size | Number of genes predicted | No of chromosomes | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|---|
Cycas panzhihuaensis (Dukou sago palm) | Cycadophyta | Rare and vulnerable species of cycad | 10.5 Gb | 2022 [13] | ||||
Picea abies (Norway spruce) | Pinales | Timber, tonewood, ornamental such as Christmas tree | 19.6 Gb | 26,359 [14] | 12 | Umeå Plant Science Centre / SciLifeLab, Sweden | 2013 [15] | |
Picea glauca (White spruce) | Pinales | Timber, Pulp | 20.8 Gb | 14,462 [14] | 12 | Institutional Collaboration | 2013 [16] [17] | |
Pinus taeda (Loblolly pine) | Pinales | Timber | 20.15 Gb | 9,024 [14] | 12 | 2014 [18] [19] [20] | N50 scaffold size: 66.9 kbp | |
Pinus lambertiana (Sugar pine) | Pinales | Timber; has the largest genome among the pines; the largest pine species | 31 Gb | 13,936 | 12 | 2016 [14] | 61.5X sequence coverage, platforms used: HiSeq 2000, HiSeq 2500, GAIIx, MiSeq | |
Ginkgo biloba | Ginkgoales | 11.75 Gb | 41,840 | 2016 [21] | N50 scaffold size: 48.2 kbp | |||
Pseudotsuga menziesii | Pinales | 16 Gb | 54,830 | 13 | 2017 [22] | N50 scaffold size: 340.7 kbp | ||
Gnetum montanum | Gnetales | 4.07 Gb | 27,491 | 2018 [23] | ||||
Larix sibirica | Pinales | 12.34 Gbp | 2019 [24] | scaffold N50 of 6440 bp | ||||
Abies alba | Pinales | 18.16 Gb | 94,205 | 2019 [25] | scaffold N50 of 14,051 bp |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Amborella trichopoda | Amborellaceae | Basal angiosperm | 2013 [26] [27] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Chloranthus spicatus (Thunb.) Makino, [28] (Pearl Orchid) | Chloranthaceae | 2021 [29] | scaffold N50 of 191.37 Mb |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Annona muricata | Annonaceae | Commercially grown fruit, medicinal applications | 799.11 Mb | 23,375 | Institute for Biodiversity and Environmental Research (UBD) Alliance for Conservation Tree Genomics Biodiversity Genomics Team | 2021 [30] | PacBio and Illumina short‐reads, in combination with 10× Genomics and Bionano data (v1). A total of 949 scaffolds assembled to a final size of 656.77 Mb, with a scaffold N50 of 3.43 Mb (v1), and then further improved to seven pseudo‐chromosomes using Hi‐C sequencing data (v2; scaffold N50: 93.2 Mb, total size in chromosomes: 639.6 Mb). |
Salix arbutifolia (syn. Chosenia arbutifolia) | Salicaceae | Seriously endangered relic species | 338.93 Mb | 33,229 | 2022 [31] | Contig N50 of 1.68 Mb | |
Cinnamomum kanehirae (Stout camphor tree) | Lauraceae | 730.7 Mb | 2019 [32] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Macadamia integrifolia HAES 741 (Macadamia nut) | Proteaceae | Commercially grown nut | 745 Mb | 34,274 | 2020 [33] | N50 413 kb | |
Macadamia jansenii | Proteaceae | Rare relative of macadamia nut | 750 Mbp | 2020 [34] | Compared Nanopore, Illumina and BGI stLFR data | ||
Nelumbo nucifera (sacred lotus) | Nelumbonaceae | Basal eudicot | 929 Mbp | 2013 [35] | contig N50 of 38.8 kbp and a scaffold N50 of 3.4 Mbp |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Aquilegia coerulea | Ranunculaceae | Basal eudicot | Unpublished [36] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Trochodendron aralioides (Wheel tree) | Trochodendrales | Basal eudicot having secondary xylem without vessel elements | 1.614 Gb | 35,328 | Guangxi University | 2019 [37] | 19 scaffolds corresponding to 19 chromosomes |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Beta vulgaris (sugar beet) | Chenopodiaceae | Crop plant | 714–758 Mbp | 27,421 | 2013 [38] | ||
Chenopodium quinoa | Chenopodiaceae | Crop plant | 1.39–1.50 Gb | 44,776 | 2017 [39] | 3,486 scaffolds, scaffold N50 of 3.84 Mb, 90% of the assembled genome is contained in 439 scaffolds [39] | |
Amaranthus hypochondriacus | Amaranthaceae | Crop plant | 403.9 Mb | 23,847 | 2016 [40] | 16 large scaffolds from 16.9 to 38.1 Mb. N50 and L50 of the assembly were 24.4 Mb and 7, respectively. [41] | |
Carnegiea gigantea | Cactaceae | Wild plant | 1.40 Gb | 28,292 | 2017 [42] | 57,409 scaffolds, scaffold N50 of 61.5 kb [42] | |
Nepenthes mirabilis | Nepenthaceae | Carnivorous plant | 691 Mb | 42,961 | 2023 [43] | 159,555 contigs/scaffolds and N50 of 10,307 bp | |
Suaeda aralocaspica | Amaranthaceae | Performs complete C4 photosynthesis within individual cells (SCC4) | 467 Mb | 29,604 | ABLife Inc. | 2019 [44] | 4,033 scaffolds, scaffold N50 length of 1.83 Mb |
Simmondsia chinensis (jojoba) | Simmondsiaceae | Oilseed Crop | 887 Mb | 23,490 | 2020 [45] | 994 scaffolds, scaffold N50 length of 5.2 Mb | |
Droseraceae | Carnivorous Plant | 263.79 Mb | 2016 [46] | 12,713 scaffolds [46] | |||
Tamarix chinensis (Chinese tamarisk) | Tamaricaceae | Margin tree | 1.32 Gb | 2023 [47] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | No of chromosomes | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|---|
Bretschneidera sinensis | Akaniaceae | endangered relic tree species | 1.21 Gb | 45,839 | 2022 [48] | |||
Sclerocarya birrea (Marula) | Anacardiaceae | Used for food | 18,397 | 2018 [49] [50] | ||||
Begonia masoniana (Iron cross begonia) | Begoniaceae | Flower | 799.83 Mb | 2022 [51] | ||||
Begonia peltatifolia (Rex begonia) | Begoniaceae | Flower | 331.75 Mb | 2022 [51] | ||||
Betula pendula (silver birch) | Betulaceae | Boreal forest tree, model for forest biotechnology | 435 Mbp [52] | 28,399 | 14 | University of Helsinki | 2017 [52] | 454/Illumina/PacBio. Assembly size 435 Mbp. Contig N50: 48,209 bp, scaffold N50: 239,796 bp. 89% of the assembly mapped to 14 pseudomolecules. Additionally 150 birch individuals sequenced. |
Betula platyphylla (Japanese white birch) | Betulaceae | Pioneer hardwood tree species | 430 Mbp | 2021 [53] | contig N50 = 751 kbp | |||
Betula nana (dwarf birch) | Betulaceae | Arctic shrub | 450 Mbp | QMUL/SBCS | 2013 [54] | |||
Corylus heterophylla Fisch (Asian hazel) | Betulaceae | Nut tree used for food | 370.75 Mbp | 27,591 | 11 | 2021 [55] | Nanopore/Hi-C chromosome scale. Contig N50 and scaffold N50 sizes of 2.07 and 31.33 Mb, respectively | |
Corylus mandshurica | Betulaceae | Hazel used for breeding | 367.67 Mb | 28,409 | 11 | 2021 [56] | ||
Aethionema arabicum | Brassicaceae | Comparative analysis of crucifer genomes | 2013 [57] | |||||
Arabidopsis lyrata ssp. lyrata strain MN47 | Brassicaceae | Model plant | 206.7 Mbp | 32,670 [58] | 8 | 2011 [58] | 8.3X sequence coverage, analyzed on ABI 3730XL capillary sequencers | |
Arabidopsis thaliana Ecotype:Columbia | Brassicaceae | Model plant | 135 Mbp | 27,655 [59] | 5 | AGI | 2000 [60] | |
Barbarea vulgaris G-type | Brassicaceae | Model plant for specialised metabolites and plant defenses | 167.7 Mbp | 25,350 | 8 | 2017 [61] | 66.5 X coverage with Illumina GA II technology | |
Brassica rapa ssp. pekinensis (Chinese cabbage) accession Chiifu-401-42 | Brassicaceae | Assorted crops and model organism | 485 Mbp | 41,174 (has undergone genome triplication) | 10 | The Brassica rapa Genome Sequencing Project Consortium | 2011 [62] | 72X coverage of paired short read sequences generated by Illumina GA II technology |
Brassica napus (Oilseed rape or rapeseed) European winter oilseed cultivar 'Darmor-bzh' | Brassicaceae | Crops | 1130 Mbp | 101,040 | 19 | Institutional Collaboration | 2014 [63] | 454 GS-FLX+ Titanium (Roche, Basel, Switzerland) and Sanger sequencing. Correction and gap filling used 79 Gb of Illumina (San Diego, CA) HiSeq sequence. |
Capsella rubella | Brassicaceae | Close relative of Arabidopsis thaliana | 130 Mbp | 26,521 | JGI | 2013? [64] 2013 [65] | ||
Cardamine hirsuta (hairy bittercress) strain 'Oxford' | Brassicaceae | A model system for studies in evolution of plant development | 198 Mbp | 29,458 | 8 | Max Planck Institute for Plant Breeding Research, Köln, Germany | 2016 [66] | Shotgun sequencing strategy, combining paired end reads (197× assembled sequence coverage) and mate pair reads (66× assembled) from Illumina HiSeq (a total of 52 Gbp raw reads). |
Eruca sativa (salad rocket) | Brassicaceae | Used for food | 851 Mbp | 45,438 | University of Reading | 2020 [67] | Illumina MiSeq and HiSeq2500. PCR free paired end and long mate pair sequencing and assembly. Illumina HiSeq transcriptome sequencing (125/150 bp paired end reads). | |
Erysimum cheiranthoides (wormseed wallflower) strain 'Elbtalaue' | Brassicaceae | Model plant for studying defensive chemistry, including cardiac glycosides | 175 Mbp | 29,947 | 8 | Boyce Thompson Institute, Ithaca, NY | 2020 [68] [69] | 39.5 Gb PacBio sequences (average length 10,603 bp), one lane Illumina MiSeq sequencing (2 x 250 bp paired end), Phase Genomics Hi-C scaffolding, PacBio and Illumina transcriptome sequencing |
Eutrema salsugineum | Brassicaceae | A relative of arabidopsis with high salt tolerance | 240 Mbp | 26,351 | JGI | 2013 [70] | ||
Eutrema parvulum | Brassicaceae | Comparative analysis of crucifer genomes | 2013 [57] | |||||
Leavenworthia alabamica | Brassicaceae | Comparative analysis of crucifer genomes | 2013 [57] | |||||
Sisymbrium irio | Brassicaceae | Comparative analysis of crucifer genomes | 2013 [57] | |||||
Thellungiella parvula | Brassicaceae | A relative of arabidopsis with high salt tolerance | 2011 [71] | |||||
Cannabis sativa (hemp) | Cannabaceae | Hemp and marijuana production | ca 820 Mbp | 30,074 based on transcriptome assembly and clustering | 2011 [72] | Illumina/454 scaffold N50 16.2 Kbp | ||
Capparis spinosa var. herbacea (Caper) | Capparaceae | Crop | 274.53 Mb | 21,577 | 2022 [73] | contig N50 9.36 Mb | ||
Carica papaya (papaya) | Caricaceae | Fruit crop | 372 Mbp | 28,629 | 2008 [74] | contig N50 11kbp scaffold N50 1Mbp total coverage ~3x (Sanger) 92.1% unigenes mapped 235Mbp anchored (of this 161Mbp also oriented) | ||
Casuarina equisetifolia (Australian Pine) | Casuarinaceae | bonsai subject | 300 Mbp | 29,827 | 2018 [75] | |||
Tripterygium wilfordii (Lei gong teng) | Celastraceae | Chinese medicine crop | 340.12 Mbp | 31,593 | 2021 [76] | Contig N50 3.09 Mbp | ||
Cleome gynandra (African cabbage) | Cleomaceae | C4 leafy vegetable and medicinal plant | 740 Mb | 30,933 | 2023 [77] | N50 of 42 Mb | ||
Kalanchoë fedtschenkoi Raym.-Hamet & H. Perrier (Kalanchoe) | Crassulaceae | Molecular genetic model for obligate CAM species in the eudicots | 256 Mbp | 30,964 | 34 | 2017 [78] | ~70× paired-end reads and ~37× mate-pair reads generated using an Illumina MiSeq platform. | |
Rhodiola crenulata (Tibetan medicinal herb) | Crassulaceae | Uses for medicine and food | 344.5 Mb | 35,517 | 2017 [79] | |||
Citrullus lanatus (watermelon) | Cucurbitaceae | Vegetable crop | ca 425 Mbp | 23,440 | BGI | 2012 [80] | Illumina coverage 108.6x contig N50 26.38 kbp Scaffold N50 2.38 Mbp genome covered 83.2% ~97% ESTs mapped | |
Cucumis melo (Muskmelon) DHL92 | Cucurbitaceae | Vegetable crop | 450 Mbp | 27,427 | 2012 [81] | 454 13.5x coverage contig N50: 18.1kbp scaffold N50: 4.677 Mbp WGS | ||
Cucumis sativus (cucumber) 'Chinese long' inbred line 9930 | Cucurbitaceae | Vegetable crop | 350 Mbp (Kmer depth) 367 Mbp (flow cytometry) | 26,682 | 2009 [82] | contig N50 19.8 kbp scaffold N50 1,140 kbp total coverage ~72.2 (Sanger + Illumina) 96.8% unigenes mapped 72.8% of the genome anchored | ||
Cucurbita argyrosperma subsp. argyrosperma (Silver-seed gourd) | Cucurbitaceae | Seed and fruit crop | 228.8 Mbp | 27,998 | 20 | National Autonomous University of Mexico | 2019, [83] updated in 2021 | contig N50 447 kbp scaffold N50 11.6 Mbp total coverage: 120x Illumina (HiSeq2000 and MiSeq) + 31x PacBio RSII |
Cucurbita argyrosperma subsp. sororia (wild gourd) | Cucurbitaceae | Wild relative of the silver-seed gourd | 255.2 Mbp | 30,592 | 20 | National Autonomous University of Mexico | 2021 [84] | contig N50 1.2 Mbp scaffold N50 12.1 Mbp total coverage: 213x Illumina HiSeq4000 + 75.4x PacBio Sequel |
Siraitia grosvenorii (Monk fruit) | Cucurbitaceae | Chinese medicine/sweetener | 456.5 Mbp | 30,565 | Anhui Agricultural University | 2018 [85] | ||
Hippophae rhamnoides (sea-buckthorn) | Elaeagnaceae | used in food and cosmetics | 730 Mbp | 30,812 | 2022 [86] | |||
Hevea brasiliensis (rubber tree) | Euphorbiaceae | the most economically important member of the genus Hevea | 2013 [87] | |||||
Jatropha curcas Palawan | Euphorbiaceae | bio-diesel crop | 2011 [88] | |||||
Manihot esculenta (Cassava) | Euphorbiaceae | Humanitarian importance | ~760 Mb | 30,666 | JGI | 2012 [89] | ||
Ricinus communis (Castor bean) | Euphorbiaceae | Oilseed crop | 320 Mbp | 31,237 | JCVI | 2010 [90] | Sanger coverage~4.6x contig N50 21.1 kbp scaffold N50 496.5kbp | |
Ricinus communis L. (Wild Castor) | Euphorbiaceae | one of the most important oil crops worldwide | ~318.13 Mb | 30,066 | National Key R&D Program of China, the National Natural Science Foundation of China, the Guangdong Basic and Applied Basic Research Foundation, China, and the Shenzhen Science and Technology Program, China | 2021 [91] | genome size of 316 Mb, a scaffold N50 of 31.93 Mb, and a contig N50 of 8.96 Mb | |
Ammopiptanthus nanus | Fabaceae | Only genus of evergreen broadleaf shrub | 889 Mb | 37,188 | 2018 [92] | |||
Cajanus cajan (Pigeon pea) var. Asha | Fabaceae | Model legume | 2012 [93] [94] | |||||
Arachis duranensis (A genome diploid wild peanut) accession V14167 | Fabaceae | Wild ancestor of peanut, an oilseed and grain legume crop | 2016 [95] | Illumina 154x coverage, contig N50 22 kbp, scaffold N50 948 kbp | ||||
Amphicarpaea edgeworthii (Chinese hog-peanut) | Fabaceae | produces both aerial and subterranean fruits | 299 Mb | 27,899 | Taishan Scholar Program, National Natural Science Foundation of China, the Innovation Program of SAAS | 2021 [96] | ||
Arachis ipaensis (B genome diploid wild peanut) accession K30076 | Fabaceae | Wild ancestor of peanut, an oilseed and grain legume crop | 2016 [95] | Illumina 163x coverage, contig N50 23 kbp, scaffold N50 5,343 kbp | ||||
Cicer arietinum (chickpea) | Fabaceae | filling | 2013 [97] | |||||
Cicer arietinum L. (chickpea) | Fabaceae | 2013 [98] | ||||||
Dalbergia odorifera (fragrant rosewood) | Fabaceae | Wood product (heartwood) and folk medicine | 653 Mb | 30,310 | 10 | Chinese Academy of Forestry | 2020 [99] | Contig N50: 5.92 Mb Scaffold N50: 56.16 Mb |
Faidherbia albida (Apple-Ring Acacia) | Fabaceae | Important in the Sahel for raising bees | 28,979 | 2018 [100] [49] | ||||
Glycine max (soybean) var. Williams 82 | Fabaceae | Protein and oil crop | 1115 Mbp | 46,430 | 2010 [101] | Contig N50:189.4kbp Scaffold N50:47.8Mbp Sanger coverage ~8x WGS 955.1 Mbp assembled | ||
Lablab purpureus (Hyacinth Bean) | Fabaceae | Crop for human consumption | 20,946 | 2018 [49] [102] | ||||
Lotus japonicus (Bird's-foot Trefoil) | Fabaceae | Model legume | 2008 [103] | |||||
Medicago truncatula (Barrel Medic) | Fabaceae | Model legume | 2011 [104] | |||||
Melilotus officinalis (sweet yellow clover) | Fabaceae | Forage and Chinese medicine | 976.27 Mbp | 50,022 | 2023 [105] | |||
Phaseolus vulgaris (common bean) | Fabaceae | Model bean | 520 Mbp | 31,638 | JGI | 2013? [106] | ||
Prosopis cineraria (Ghaf) | Fabaceae | Desert mimosoid legume | 691 Mbp | 55,325 | 2023 [107] | |||
Vicia faba L. (Faba bean) | Fabaceae | Nature (journal) | 2023 [108] | |||||
Vicia villosa (hairy vetch) | Fabaceae | Forage and cover crop | 2.03 Gbp | 2023 [109] | ||||
Vigna hirtella (Wild vigna) | Fabaceae | Wild legume | 474.1 Mbp | 2023 [108] [110] | ||||
Vigna reflexo-pilosa (Créole bean) | Fabaceae | Tetraploid wild legume | 998.7 Mbp | 2023 [111] [112] | ||||
Vigna subterranea (Bambara Groundnut) | Fabaceae | similar to peanuts | 31,707 | 2018 [113] [49] | ||||
Vigna trinervia | Fabaceae | 498.7 Mbp | 2023 [111] | |||||
Trifolium pratense L. (Red clover) | Fabaceae | often used to relieve symptoms of menopause, high cholesterol, and osteoporosis. [114] | 2022 [115] | |||||
Vicia sativa L. (Common vetch) | Fabaceae | grain to livestock | 2022 [116] | |||||
Macrotyloma uniflorum (Horse gram) | Fabaceae | horsefeed | 2021 [117] | |||||
Castanea mollissima (Chinese chestnut) | Fagaceae | cultivated nut | 785.53 Mb | 36,479 | Beijing University of Agriculture | 2019 [118] | Illumina: ~42.7× PacBio: ~87× contig N50: 944,000bp | |
Quercus robur (European oak) | Fagaceae | Pedunculate oak, large diversity, somatic mutation studies | 736 Mb | 25,808 | 12 | Biogeco lab, Inrae, University of Bordeaux | 2018 [119] | https://www.oakgenome.fr/?page_id=587 |
Carya illinoinensis (Pecan) | Juglandaceae | snacks in various recipes | 651.31 Mb | 2019 [120] | ||||
Juglans mandshurica Maxim. (Manchurian walnut) | Juglandaceae | cultivated nut | 548.7 Mb | 2022 [121] | ||||
Juglans regia (Persian walnut) | Juglandaceae | cultivated nut | 540 Mb | Chinese Academy of Forestry | 2020 [122] | |||
Juglans sigillata (Iron walnut) | Juglandaceae | cultivated nut | 536.50 Mb | Nanjing Forestry University | 2020 [123] | Illumina+Nanopore+Bionano scaffold N50: 16.43 Mb, contig N50: 4.34 Mb | ||
Linum usitatissimum (flax) | Linaceae | Crop | ~350 Mbp | 43,384 | BGI et al. | 2012 [124] | ||
Bombax ceiba (red silk cotton tree) | Malvaceae | capsules with white fibre like cotton | 895 Mb | 2018 [125] | ||||
Durio zibethinus (Durian) | Malvaceae | Tropical fruit tree | ~738 Mbp | 2017 [126] | ||||
Gossypium raimondii | Malvaceae | One of the putative progenitor species of tetraploid cotton | 2013? [127] | |||||
Theobroma cacao (cocoa tree) | Malvaceae | Flavouring crop | 2010 [128] [129] | |||||
Theobroma cacao (cocoa tree) cv. Matina 1-6 | Malvaceae | Most widely cultivated cacao type | 2013 [130] | |||||
Theobroma cacao (200 accessions) | Malvaceae | domestication history of cacao | 2018 [131] | |||||
Azadirachta indica (neem) | Meliaceae | Source of a number of terpenoids, including the biopesticide azadirachtin; used in traditional medicine | 364 Mbp | ~20,000 | GANIT Labs | 2012 [132] and 2011 [133] | Illumina GAIIx, scaffold N50 of 452,028 bp; transcriptome data from shoot, root, leaf, flower and seed | |
Artocarpus nanchuanensis (Bayberry) | Moraceae | Extremely endangered fruit tree | 769.44 Mbp | 39,596 | 28 | 2022 [134] | ||
Moringa oleifera (Horseradish Tree) | Moringaceae | traditional herbal medicine | 18,451 | 2018 [135] [49] | ||||
Eucalyptus caleyi (Caley's ironbark) | Myrtaceae | 589.32 Mb | 2024 [136] | |||||
Eucalyptus urophylla (Timor white gum) | Myrtaceae | Fibre and timber crop | 544.5 Mb | 2023 [137] | ||||
Eucalyptus grandis (Rose gum) | Myrtaceae | Fibre and timber crop | 691.43 Mb | 2011 [138] [137] | ||||
Eucalyptus lansdowneana (crimson mallee) | Myrtaceae | 633.52 Mb | 2024 [136] | |||||
Eucalyptus marginata (Jarrah) | Myrtaceae | 512.89 Mb | 2024 [136] | |||||
Eucalyptus pauciflora (Snow gum) | Myrtaceae | Fibre and timber crop | 594.87 Mb | ANU | 2020 [139] | Nanopore + Illumina; contig N50: 3.23 Mb | ||
Melaleuca alternifolia (tea tree) | Myrtaceae | terpene-rich essential oil with therapeutic and cosmetic uses around the world | 362 Mb | 37,226 | Gigabyte, NCBI GenBank, GigaScience | 2021 [140] | 3128 scaffolds with a total length of 362 Mb (N50 = 1.9 Mb) | |
Averrhoa carambola (Star Fruit) | Oxalidaceae | fruit crop | 335.49 Mb | 2020 [141] | ||||
Carya cathayensis (Chinese hickory) | Juglandaceae | fruit crop | 706.43 Mb | 2019 [120] | ||||
Eriobotrya japonica (Loquat) | Rosaceae | Fruit tree | 760.1 Mb | 45,743 | Shanghai Academy of Agricultural Sciences | 2020 [142] | Illumina+Nanopore+Hi-C 17 chromosomes, scaffold N50: 39.7 Mb | |
Fragaria vesca (wild strawberry) | Rosaceae | Fruit crop | 240 Mbp | 34,809 | 2011 [143] | scaffold N50: 1.3 Mbp 454/Illumina/solid 39x coverage WGS | ||
Gillenia trifoliata | Rosaceae | Apple Tribe | 320.17±4.22 Mb | 26,166 | 18 | 2021 [144] | Number of scaffolds(>2kb): 789, scaffold N50: 30,093,771 bp, Contig N50 (bp): 828,523 | |
Malus domestica (apple) "Golden Delicious" | Rosaceae | Fruit crop | ~742.3 Mbp | 57,386 | 2010 [145] | contig N50 13.4 (kbp??) scaffold N50 1,542.7 (kbp??) total coverage ~16.9x (Sanger + 454) 71.2% anchored | ||
Prunus amygdalus (almond) | Rosaceae | Fruit crop | 2013? [146] | |||||
Prunus avium (sweet cherry) cv. Stella | Rosaceae | Fruit crop | 2013? [146] | |||||
Prunus mume (Chinese plum or Japanese apricot) | Rosaceae | Fruit crop | 2012 [147] | |||||
Prunus persica (peach) | Rosaceae | Fruit crop | 265 Mbp | 27,852 | 2013 [148] | Sanger coverage:8.47x WGS ca 99% ESTs mapped 215.9 Mbp in pseudomolecules | ||
Prunus salicina (Japanese plum) | Rosaceae | Fruit crop | 284.2 Mbp | 24,448 | 8 | 2020 [149] | PacBio/Hi-C, with contig N50 of 1.78 Mb and scaffold N50 of 32.32 Mb. | |
Pyrus bretschneideri (ya pear or Chinese white pear) cv. Dangshansuli | Rosaceae | Fruit crop | 2012 [150] | |||||
Pyrus communis (European pear) cv. Doyenne du Comice | Rosaceae | Fruit crop | 2013? [146] | |||||
Rosa roxburghii (Chestnut Rose) | Rosaceae | Fruit crop | 504 Mbp | 2023 [151] | ||||
Rosa sterilis | Rosaceae | Fruit crop | 981.2 Mb | 2023 [152] | ||||
Rubus occidentalis (Black raspberry) | Rosaceae | Fruit crop | 290 Mbp | 2018 [153] | ||||
Citrus clementina (Clementine) | Rutaceae | Fruit crop | 2013? [154] | |||||
Citrus sinensis (Sweet orange) | Rutaceae | Fruit crop | 2013?, [154] 2013 [155] | |||||
Clausena lansium (Wampee) | Rutaceae | Fruit crop | 2021 [156] | |||||
Populus trichocarpa (poplar) | Salicaceae | Carbon sequestration, model tree, timber | 510 Mbp (cytogenetic) 485 Mbp (coverage) | 73,013 [Phytozome] | 2006 [157] | Scaffold N50: 19.5 Mbp Contig N50:552.8 Kbp [phytozome] WGS >=95 % cDNA found | ||
Populus pruinosa (desert tree) | Salicaceae | farming and ranching | 479.3 Mbp | 35,131 | 2017 [158] | |||
Acer truncatum (purpleblow maple) | Sapindaceae | Tree producing nervonic acid | 633.28 Mb | 28,438 | 2020 [159] | contig N50 = 773.17 Kb; scaffold N50 = 46.36 Mb | ||
Acer yangbiense | Sapindaceae | Plant species with extremely small populations | 110 Gb | 28,320 | 13 | 2019 [160] | scaffold N50 = 45 Mb | |
Dimocarpus longan (Longan) | Sapindaceae | Fruit crop | 471.88 Mb | 2017 [161] | ||||
Xanthoceras sorbifolium Bunge (Yellowhorn) | Sapindaceae | Fruit Crop | 504.2 Mb | 24,672 | 2019 [162] [163] | |||
Aquilaria sinensis (Agarwood) | Thymelaeaceae | Fragrant wood | 726.5 Mb | 29,203 | 2020 [164] | Illumina+nanopore+Hi-C, scaffold N50: 88.78 Mb | ||
Vitis vinifera (grape) genotype PN40024 | Vitaceae | fruit crop | 2007 [165] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Asclepias syriaca (common milkweed) | Apocynaceae | Exudes milky latex | 420 Mbp | 14,474 | Oregon State University | 2019 [166] | 80.4× depth N50 = 3,415 bp |
Erigeron breviscapus (Chinese herbal fleabane) | Asteraceae | Chinese medicine | 37,505 | 2017 [167] | |||
Helianthus annuus (sunflower) | Asteraceae | Oil crop | 3.6 Gb | 52,232 | INRA and The Sunflower Genome Database [168] | 2017 [169] | N50 contig: 13.7 kb |
Lactuca sativa (lettuce) | Asteraceae | Vegetable crop | 2.5 Gb | 38,919 | 2017 [170] | N50 contig: 12 kb; N50 scaffold: 476 kb | |
Handroanthus impetiginosus (Pink Ipê) | Bignoniaceae | Common tree | 503.7 Mb | 31,668 | 2017 [171] | ||
Diospyros oleifera Cheng (Oil persimmon) | Ebenaceae | Fruit tree | 849.53 Mb | 28,580 | Zhejiang University & Chinese Academy of Forestry | 2019 [172] & 2020 [173] | Two genomes both chromosome scale & assigned to 15 pseudochromosomes |
Salvia miltiorrhiza Bunge (Chinese red sage) | Lamiaceae | TCM treatment for COPD | 641 Mb | 34,598 | 2015 [174] | ||
Callicarpa americana (American beautyberry) | Lamiaceae | Ornamental shrub and insect-repellent | 506 Mb | 32,164 | Michigan State University | 2020 [175] | 17 pseudomolecules Contig N50: 7.5 Mb Scaffold N50: 29.0 Mb |
Mentha x piperita (Peppermint) | Lamiaceae | Oil crop | 353 Mb | 35,597 | Oregon State University | 2017 [176] | |
Tectona grandis (Teak) | Lamiaceae | Durability and water resistance | 31,168 | 2019 [177] | |||
Utricularia gibba (humped bladderwort) | Lentibulariaceae | model system for studying genome size evolution; a carnivorous plant | 81.87 Mb | 28,494 | LANGEBIO, CINVESTAV | 2013 [178] | Scaffold N50: 80.839 Kb |
Camptotheca acuminata Decne (Chinese happy tree) | Nyssaceae | chemical drugs for cancer treatment | 403 Mb | 31,825 | 2017 [179] | ||
Davidia involucrata Baill (Dove tree) | Nyssaceae | Living fossil | 1,169 Mb | 42,554 | 2020 [180] | ||
Mimulus guttatus | Phrymaceae | model system for studying ecological and evolutionary genetics | ca 430 Mbp | 26,718 | JGI | 2013? [181] | Scaffold N50 = 1.1 Mbp Contig N50 = 45.5 Kbp |
Primula vulgaris (Common primrose) | Primulaceae | Used for cooking | 474 Mb | 2018 [182] | |||
Cinchona pubescens Vahl. (Fever tree) | Rubiaceae | Anti-malarial | 1.1 Gb | 2022 [183] | |||
Solanum lycopersicum (tomato) cv. Heinz 1706 | Solanaceae | Food crop | ca 900 Mbp | 34,727 | SGN | 2011 [184] 2012 [185] | Sanger/454/Illumina/Solid Pseudomolecules spanning 91 scaffolds (760Mbp of which 594Mbp have been oriented ) over 98% ESTs mappable |
Solanum aethiopicum (Ethiopian eggplant) | Solanaceae | Food crop | 1.02 Gbp | 34,906 | BGI | 2019 [186] | Illumina scaffold N50: 516,100bp contig N50: 25,200 bp ~109× coverage |
Solanum pimpinellifolium (Currant Tomato) | Solanaceae | closest wild relative to tomato | 2012 [185] | Illumina contig N50: 5100bp ~40x coverage | |||
Solanum tuberosum (Potato) | Solanaceae | Food crop | 726 Mbp [187] | 39,031 | Potato Genome Sequencing Consortium (PGSC) | 2011 [188] [189] | Sanger/454/Illumina 79.2x coverage contig N50: 31,429bp scaffold N50: 1,318,511bp |
Solanum commersonii (commerson's nightshade) | Solanaceae | Wild potato relative | 838 Mbp kmer (840 Mbp) | 37,662 | UNINA, UMN, UNIVR, Sequentia Biotech, CGR | 2015 [190] | Illumina 105x coverage contig N50: 6,506bp scaffold N50: 44,298bp |
Cuscuta campestris (field dodder) | Convolvulaceae | model system for parasitic plants | 556 Mbp kmer (581 Mbp) | 44,303 | RWTH Aachen University, Research Center Jülich, University of Tromsø, Helmholtz Zentrum München, Technical University Munich, University of Vienna | 2018 [191] | scaffold N50 = 1.38 Mbp |
Cuscuta australis (Southern dodder) | Convolvulaceae | model system for parasitic plants | 265 Mbp kmer (273 Mbp) | 19,671 | Kunming Institute of Botany, Chinese Academy of Sciences | 2018 [192] | scaffold N50 = 5.95 Mbp contig N50 = 3.63 Mbp |
Nicotiana benthamiana | Solanaceae | Close relative of tobacco | ca 3 Gbp | 2012 [193] | Illumina 63x coverage contig N50: 16,480bp scaffold N50:89,778bp >93% unigenes found | ||
Nicotiana sylvestris (Tobacco plant) | Solanaceae | model system for studies of terpenoid production | 2.636 Gbp | Philip Morris International | 2013 [194] | 94x coverage scaffold N50: 79.7 kbp 194kbp superscaffolds using physical Nicotiana map | |
Nicotiana tomentosiformis | Solanaceae | Tobacco progenitor | 2.682 Gb | Philip Morris International | 2013 [194] | 146x coverage scaffold N50: 82.6 kb 166kbp superscaffolds using physical Nicotiana map | |
Capsicum annuum (Pepper) (a) cv. CM334 (b) cv. Zunla-1 | Solanaceae | Food crop | ~3.48 Gbp | (a) 34,903 (b) 35,336 | (a) 2014 [195] (b) 2014 [196] | N50 contig: (a) 30.0 kb (b) 55.4 kb N50 scaffold: (a) 2.47 Mb (b) 1.23 Mb | |
Capsicum annuum var. glabriusculum (Chiltepin) | Solanaceae | Progenitor of cultivated pepper | ~3.48 Gbp | 34,476 | 2014 [196] | N50 contig: 52.2 kb N50 scaffold: 0.45 Mb | |
Petunia hybrida | Solanaceae | Economically important flower | 2011 [197] |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|
Setaria italica (Foxtail millet) | Poaceae | Model of C4 metabolism | 2012 [198] | ||||
Aegilops tauschii (Tausch's goatgrass) | Poaceae | bread wheat D-genome progenitor | ca 4.36 Gb | 39,622 | 2017 [199] | pseudomolecule assembly | |
Bothriochloa decipiens (Australian bluestem grass) | Poaceae | BCD clade and polyploid | 1,218.22 Mb | 60,652 | 2023 [200] | Scaffold N50: 42.637 Mb | |
Brachypodium distachyon (purple false brome) | Poaceae | Model monocot | 2010 [201] | ||||
Coix lacryma-jobi L. (Job's tears) | Poaceae | Crop & used in medicine & ornamentation | 1.619 Gb | 39,629 | 2019 [202] | ||
Dichanthelium oligosanthes (Heller's rosette grass) | Poaceae | C3 grass closely related to C4 species | 960 Mb | DDPSC | 2016 [203] | ||
Digitaria exilis (white fonio) | Poaceae | African orphan crop | 761 Mb | ICRISAT, UC Davis | 2021 [204] | 3,329 contigs (N50: 1.73 Mb; L50: 126) | |
Eragrostis curvula | Poaceae | good for livestock | 602 Mb | 56,469 | 2019 [205] | ||
Hordeum vulgare (barley) | Poaceae | Model of ecological adaptation | IBSC | 2012, [206] 2017 [207] | |||
Oryza brachyantha (wild rice) | Poaceae | Disease resistant wild relative of rice | 2013 [208] | ||||
Oryza glaberrima (African rice) var CG14 | Poaceae | West-African species of rice | 2010 [209] | ||||
Oryza rufipogon (red rice) | Poaceae | Ancestor to Oryza sativa | 406 Mb | 37,071 | SIBS | 2012 [210] | Illumina HiSeq2000 100x coverage |
Oryza sativa (long grain rice) ssp indica | Poaceae | Crop and model cereal | 430 Mb [211] | International Rice Genome Sequencing Project (IRGSP) | 2002 [212] | ||
Oryza sativa (Short grain rice) ssp japonica | Poaceae | Crop and model cereal | 430 Mb | International Rice Genome Sequencing Project (IRGSP) | 2002 [213] | ||
Panicum virgatum (switchgrass) | Poaceae | biofuel | 2013? [214] | ||||
Poa annua (annual bluegrass) | Poaceae | weed | 3.56 Gb | 76,420 | USDA ARS, Forage and Range Research | 2023 [215] | unphased (haploid) pseudomolecules |
Poa infirma (weak bluegrass) | Poaceae | diploid progenitor to Poa annua | 2.25 Gb | 39,420 | Penn State University | 2023 [216] | unphased (haploid) pseudomolecules |
Poa pratensis (Kentucky bluegrass) | Poaceae | Lawn grass | 6.09 Gbp | 2023 [217] | Scaffold N50: 65.1 Mbp | ||
Poa supina (supine bluegrass) | Poaceae | diploid progenitor to Poa annua | 1.27 Gb | 37,935 | Penn State University | 2023 [216] | unphased (haploid) pseudomolecules |
Phyllostachys edulis (moso bamboo) | Poaceae | Bamboo textile industry | 603.3 Mb | 25,225 | 2013 [218] 2018 [219] | ||
Sorghum bicolor genotype BTx623 | Poaceae | Crop | ca 730 Mb | 34,496 | 2009 [220] | contig N50: 195.4 kbp; scaffold N50: 62.4 Mbp; Sanger, 8.5x coverage WGS | |
Triticum aestivum (bread wheat) | Poaceae | 20% of global nutrition | 14.5 Gb | 107,891 | IWGSC | 2018 [221] | pseudomolecule assembly |
Triticum urartu | Poaceae | Bread wheat A-genome progenitor | ca 4.94 Gb | BGI | 2013 [222] | Non-repetitive sequence assembled Illumina WGS | |
Zea mays (maize) ssp mays B73 | Poaceae | Cereal crop | 2.3 Gb | 39,656 [223] | 2009 [224] | contig N50: 40 kbp; scaffold N50: 76 kbp; Sanger, 4-6x coverage per BAC | |
Pennisetum glaucum (pearl millet) | Poaceae | Sub-Saharan and Sahelian millet species | ~1.79 Gb | 38,579 | 2017 [225] | WGS and bacterial artificial chromosome (BAC) sequencing |
Organism strain | Family | Relevance | Genome size | Number of genes predicted | No of chromosomes | Organization | Year of completion | Assembly status |
---|---|---|---|---|---|---|---|---|
Ananas bracteatus accession CB5 | Bromeliaceae | Wild pineapple relative | 382 Mbp | 27,024 | 25 | 2015 [226] | 100× coverage using Illumina paired-end reads of libraries with different insert sizes. | |
Ananas comosus (L.) Merr. (Pineapple), varieties F153 and MD2 | Bromeliaceae | The most economically valuable crop possessing crassulacean acid metabolism (CAM) | 382 Mb | 27,024 | 25 | 2015 [226] | 400× Illumina reads, 2× Moleculo synthetic long reads, 1× 454 reads, 5× PacBio single-molecule long reads and 9,400 BACs. | |
Ottelia alismoides (Duck lettuce) | Hydrocharitaceae | Aquatic plant | 6.45 Gb | 2024 [227] | 11,923 scaffolds/contigs and an N50 of 790,733 bp | |||
Pontederia crassipes (Water hyacinth) | Pontederiaceae | Aquatic plant | 1.22 Gb | 65,299 | 8 | 2024 [228] | Scaffold N50 = 77.2 Mb | |
Musa acuminata (Banana) | Musaceae | A-genome of modern banana cultivars | 523 Mbp | 36,542 | 2012 [229] | N50 contig: 43.1 kb N50 scaffold: 1.3 Mb | ||
Musa balbisiana (Wild banana) (PKW) | Musaceae | B-genome of modern banana cultivars | 438 Mbp | 36,638 | 2013 [230] | N50 contig: 7.9 kb | ||
Musa balbisiana (DH-PKW) | Musaceae | B-genome (B-subgenome to cultivated allotriploid bananas) | 430 Mb | 35,148 | 11 | CATAS, BGI, CIRAD | 2019 [231] | N50 contig: 1.83 Mb |
Musa beccarii (Red ornamental banana) | Musaceae | Ornamental; aids understanding of Musaceae genome evolution | 567 Mb | 39,112 | 9 | 2023 [232] | ||
Calamus simplicifolius | Arecaceae | native to tropical and subtropical regions | 1.98 Gb | 51,235 | 2018 [233] | |||
Cocos nucifera (Coconut palm) | Arecaceae | used in food and cosmetics | ~2.42 Gb | 2017 [234] | ||||
Daemonorops jenkinsiana | Arecaceae | native to tropical and subtropical regions. | 1.61 Gb | 52,342 | 2018 [233] | |||
Phoenix dactylifera (Date palm) | Arecaceae | Woody crop in arid regions | 658 Mbp | 28,800 | 2011 [235] | N50 contig: 6.4 kb | ||
Elaeis guineensis (African oil palm) | Arecaceae | Oil-bearing crop | ~1800 Mbp | 34,800 | 2013 [236] | N50 scaffold: 1.27 Mb | ||
Spirodela polyrhiza (Greater duckweed) | Araceae | Aquatic plant | 158 Mbp | 19,623 | 2014 [237] | N50 scaffold: 3.76 Mb | ||
Dendrobium hybrid cultivar ‘Emma White’ | Orchidaceae | Commercialised hybrid orchid | 678 Mbp | 2022 [238] | ||||
Phalaenopsis equestris (Schauer) Rchb.f. (Moth orchid) | Orchidaceae | Breeding parent of many modern moth orchid cultivars and hybrids. Plant with crassulacean acid metabolism (CAM). | 1600 Mbp | 29,431 | 2014 [239] | N50 scaffold: 359.115 kb | ||
Iris pallida Lam. (Dalmatian Iris) | Iridaceae | Ornamental; commercial interest in secondary metabolites | 10.04 Gbp | 63,944 | Novartis | 2023 [240] | Scaffold N50: 14.34 Mbp | |
Iris sibirica (Siberian Iris) | Iridaceae | Ornamental flower | 2023 [241] | |||||
Iris virginica (Southern Blue Flag Iris) | Iridaceae | Ornamental flower | 2023 [241] |
The following entries do not meet the criteria in the first paragraph of this article, namely being nearly complete, high-quality sequences that are published, assembled and publicly available. This list includes species whose sequences have been announced in press releases or on websites, but not in a data-rich publication in a refereed, peer-reviewed journal with a DOI.