This list of sequenced protist genomes contains all the protist species known to have publicly available complete genome sequences that have been assembled, annotated and published; draft genomes are not included, nor are organelle-only sequences.
Alveolata are a group of protists which includes the Ciliophora, Apicomplexa and Dinoflagellata. Members of this group are of particular interest to science as the cause of serious human and livestock diseases.
Organism | Type | Relevance | Genome size | Number of genes predicted | Organization | Year of completion | Assembly status | Links |
---|---|---|---|---|---|---|---|---|
Babesia bovis | Apicomplexan | Cattle pathogen | 8.2 Mb | 3,671 | | 2007 [1] | | |
Breviolum minutum (Symbiodinium minutum; clade B1) | Dinoflagellate | Coral symbiont | 1.5 Gb | 47,014 | Okinawa Institute of Science and Technology | 2013 [2] | Draft | OIST Marine Genomics [3] |
Cladocopium goreaui (Symbiodinium goreaui; Clade C1) | Dinoflagellate | Coral symbiont | 1.19 Gb | 35,913 | Reef Future Genomics (ReFuGe) 2020/ University of Queensland | 2018 [4] | Draft | ReFuGe 2020 [5] |
Cladocopium C92 strain Y103 (Symbiodinium sp. clade C; putative type C92) | Dinoflagellate | Foraminiferan symbiont | Unknown (assembly size 0.70 Gb) | 65,832 | Okinawa Institute of Science and Technology | 2018 [6] | Draft | OIST Marine Genomics [3] |
Cryptosporidium hominis Strain:TU502 | Apicomplexan | Human pathogen | 10.4 Mb | 3,994 [7] | Virginia Commonwealth University | 2004 [7] | ||
Cryptosporidium parvum C- or genotype 2 isolate | Apicomplexan | Human pathogen | 16.5 Mb | 3,807 [8] | UCSF and University of Minnesota | 2004 [8] | ||
Eimeria tenella Houghton strain | Apicomplexan | Intestinal parasite of domestic fowl | 55-60 Mb [9] | | The Wellcome Trust Sanger Institute [10] | Available for download; [10] 2007 for Chr 1 [11] | | |
Fugacium kawagutii CS156=CCMP2468 (Symbiodinium kawagutii; clade F1) | Dinoflagellate | Coral symbiont? | 1.07 Gb | 26,609 | Reef Future Genomics (ReFuGe) 2020 / University of Queensland | 2018 [4] | Draft | ReFuGe 2020 [5] |
Fugacium kawagutii CCMP2468 (Symbiodinium kawagutii; clade F1) | Dinoflagellate | Coral symbiont? | 1.18 Gb | 36,850 | University of Connecticut / Xiamen University | 2015 [12] | Draft | S. kawagutii genome project [13] |
Neospora caninum | Apicomplexan | Pathogen for cattle and dogs | 62 Mb [14] | | The Wellcome Trust Sanger Institute [15] | Available for download [15] | | |
Paramecium tetraurelia | Ciliate | Model organism | 72 Mb | 39,642 [16] | Genoscope | 2006 [16] | ||
Polarella glacialis CCMP1383 | Dinoflagellate | Psychrophile, Antarctic | 3.02 Gb (diploid), 1.48 Gb (haploid) | 58,232 | University of Queensland | 2020 [17] | Draft | UQ eSpace [18] |
Polarella glacialis CCMP2088 | Dinoflagellate | Psychrophile, Arctic | 2.65 Gb (diploid), 1.30 Gb (haploid) | 51,713 | University of Queensland | 2020 [17] | Draft | UQ eSpace [18] |
Plasmodium berghei ANKA | Apicomplexan | Rodent malaria | 18.5 Mb [19] | 4,900; [19] 11,654 (UniProt) | ||||
Plasmodium chabaudi | Apicomplexan | Rodent malaria | 19.8 Mb [20] | 5,000 [20] | ||||
Plasmodium falciparum Clone:3D7 | Apicomplexan | Human pathogen (malaria) | 22.9 Mb | 5,268 [21] | Malaria Genome Project Consortium | 2002 [21] | ||
Plasmodium knowlesi | Apicomplexan | Primate pathogen (malaria) | 23.5 Mb | 5,188 [22] | | 2008 [22] | | |
Plasmodium vivax | Apicomplexan | Human pathogen (malaria) | 26.8 Mb | 5,433 [23] | | 2008 [23] | | |
Plasmodium yoelii yoelii Strain:17XNL | Apicomplexan | Rodent pathogen (malaria) | 23.1 Mb | 5,878 [24] | TIGR and NMRC | 2002 [24] | ||
Symbiodinium microadriaticum (clade A) | Dinoflagellate | Coral symbiont | 1.1 Gb | 49,109 | King Abdullah University of Science and Technology | 2016 [25] | Draft | Reef Genomics [26] |
Symbiodinium A3 strain Y106 (Symbiodinium sp. clade A3) | Dinoflagellate | Symbiont | Unknown (assembly size 0.77 Gb) | 69,018 | Okinawa Institute of Science and Technology | 2018 [6] | Draft | OIST Marine Genomics [3] |
Tetrahymena thermophila | Ciliate | Model organism | 104 Mb | 27,000 [27] | | 2006 [27] | | |
Theileria annulata Ankara clone C9 | Apicomplexan | Cattle pathogen | 8.3 Mb | 3,792 | Sanger | 2005 [28] | ||
Theileria parva Strain:Muguga | Apicomplexan | Cattle pathogen (African east coast fever) | 8.3 Mb | 4,035 [29] | TIGR and the International Livestock Research Institute | 2005 [29] | ||
Toxoplasma gondii GT1, ME49, VEG strains | Apicomplexan | Mammal pathogen | 63 Mb (RefSeq) | 8,100 (UniProt) - 9,000 (EuPathDB) | J. Craig Venter Inst., TIGR, UPenn. | 2008 [30] | | |
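The genome sizes and predicted gene counts in the table can be combined into a rough gene density (genes per megabase), which makes the contrast between the compact apicomplexan genomes and the very large dinoflagellate genomes easier to see. The following is a minimal sketch, not part of the source data: the helper names `size_to_mb` and `genes_per_mb` are hypothetical, the four rows are copied from the table above, and the calculation assumes that a single size and a single gene count describe the same assembly.

```python
import re

# Multipliers converting the size units used in the tables to megabases (Mb).
_UNIT_TO_MB = {"kb": 1e-3, "mb": 1.0, "gb": 1e3}


def size_to_mb(text: str) -> float:
    """Parse a size string such as '8.2 Mb' or '1.5 Gb' into megabases."""
    match = re.search(r"([\d.]+)\s*(kb|mb|gb)", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"no genome size found in {text!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_MB[unit.lower()]


def genes_per_mb(size_text: str, genes_text: str) -> float:
    """Return predicted genes per megabase for one table row."""
    genes = int(genes_text.replace(",", ""))  # '3,671' -> 3671
    return genes / size_to_mb(size_text)


# (organism, genome size, predicted genes), copied from the table above.
rows = [
    ("Babesia bovis", "8.2 Mb", "3,671"),
    ("Plasmodium falciparum 3D7", "22.9 Mb", "5,268"),
    ("Paramecium tetraurelia", "72 Mb", "39,642"),
    ("Symbiodinium microadriaticum", "1.1 Gb", "49,109"),
]

for name, size, genes in rows:
    print(f"{name}: {genes_per_mb(size, genes):.1f} genes/Mb")
```

Rows that give a size range, several gene counts, or an assembly size only would need a choice about which figure to use, so they are left out of this sketch.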
Amoebozoa are a group of motile amoeboid protists; members of this group move or feed by means of temporary projections called pseudopods. The best known member of this group is the slime mold, which has been studied for centuries; other members include the Archamoebae, Tubulinea and Flabellinia. Some Amoebozoa cause disease.
Organism | Type | Relevance | Genome size | Number of genes predicted | Organization | Year of completion |
---|---|---|---|---|---|---|
Dictyostelium discoideum Strain:AX4 | Slime mold | Model organism | 34 Mb | 12,500 [31] | Consortium from University of Cologne, Baylor College of Medicine and the Sanger Centre | 2005 [31] |
Entamoeba histolytica HM1:IMSS | Parasitic protozoan | Human pathogen (amoebic dysentery) | 23.8 Mb | 9,938 [32] | TIGR, Sanger Institute and the London School of Hygiene and Tropical Medicine | 2005 [32] |
Polysphondylium pallidum Strain:PN500 | Slime mold | Model organism | | 12,939, [33] 12,350 (UniProt) | Leibniz Institute for Age Research | 2009 [33] |
The Chromista are a group of protists that contains the algal phyla Heterokontophyta (stramenopiles), Haptophyta and Cryptophyta. Members of this group are mostly studied for evolutionary interest.
Organism | Type | Relevance | Genome size | Number of genes predicted | Organization | Year of completion |
---|---|---|---|---|---|---|
Albugo laibachii | Oomycete | Arabidopsis parasite, biotroph | 37 Mb [34] | 13,032 [34] | | 2011 [34] |
Aureococcus anophagefferens Strain:CCMP1984 | Pelagophyte | | | | DOE Joint Genome Institute | 2011 [35] |
Bigelowiella natans | Chlorarachniophyte | Model organism | nucleomorph: 0.331 Mb nuclear: 95 Mb | nucleomorph: 373 [36] nuclear: >21,000 [37] | nucleomorph: Hall Institute Australia, Univ. Melbourne, Univ. BC nuclear: Dalhousie University, Halifax, Nova Scotia, Canada | 2006, [36] 2012 [37] |
Chroomonas mesostigmatica CCMP1168 | Cryptophyta | | | | | 2012 [38] |
Cryptomonas paramecium | Cryptophyta | | | | | 2010 [39] |
Emiliania huxleyi CCMP1516 | Coccolithophore (phytoplankton) | | 141.7 Mb [40] | 30,569 [40] | Joint Genome Institute | 2013 [40] |
Emiliania huxleyi RCC1217 | Coccolithophore (phytoplankton) | | | | | Available for download [41] |
Fragilariopsis cylindrus | Diatom | | 61.1 Mb [42] | 21,066 [42] | Joint Genome Institute | 2017 [42] |
Guillardia theta | Cryptomonad | Model organism | 0.551 Mb (nucleomorph genome only) 87 Mb (nuclear genome) | nucleomorph: 465 [43] 513, 598 (UniProt) nuclear: >21,000 [37] | nucleomorph: Canadian Institute of Advanced Research, Philipps-University Marburg and the University of British Columbia nuclear: Dalhousie University, Halifax, Nova Scotia, Canada | 2001, [43] 2012 [37] |
Hemiselmis andersenii CCMP7644 | Cryptomonad | Model organism | 0.572 Mb (nucleomorph genome only) | 472, [44] 502 (UniProt) | Canadian Institute of Advanced Research | 2007 [44] |
Hyaloperonospora arabidopsidis | Oomycete | Obligate biotroph, Arabidopsis pathogen | | | WUGSC | 2010 [45] |
Nannochloropsis gaditana Strain: CCMP526 | Eustigmatophyte | Lipid-producing, biotechnology applications | | | Virginia Bioinformatics Institute | 2012 [46] |
Phaeodactylum tricornutum Strain: CCAP1055/1 | Diatom | | 27.4 Mb | 10,402 | Joint Genome Institute | 2008 [47] |
Phytophthora infestans Strain:T30-4 | Oomycete | Great Famine of Ireland pathogen | | | Broad Institute | 2009 [48] |
Phytophthora ramorum | Oomycete | Sudden oak death pathogen | 65 Mb (7x) | 15,743 | Joint Genome Institute et al. | 2006 [49] |
Phytophthora sojae | Oomycete | Soybean pathogen | 95 Mb (9x) | 19,027 | Joint Genome Institute et al. | 2006 [49] |
Pseudo-nitzschia multiseries | Diatom | | | | Joint Genome Institute | |
Plasmodiophora brassicae | Plasmodiophorid | Clubroot disease pathogen | 25.5 Mb | 9,730 | SLU Uppsala et al. | 2015 [50] |
Pythium ultimum | Oomycete | ubiquitous plant pathogen | 42.8 Mb | 15,290 | Michigan State University et al. | 2010 [51] |
Thalassiosira pseudonana Strain:CCMP 1335 | Diatom | | 34.5 Mb | 11,242 [52] | Joint Genome Institute and the University of Washington | 2004 [52] |
Excavata is a group of related free-living and symbiotic protists; it includes the Metamonada, Loukozoa, Euglenozoa and Percolozoa. They are researched for their role in human disease.
Organism | Type | Relevance | Genome size | Number of genes predicted | Organization | Year of completion |
---|---|---|---|---|---|---|
Giardia enterica (G. duodenalis assemblage B) | Parasitic protozoan | Human pathogen (Giardiasis) | 11.7 Mb | 4,470 [53] | multicenter collaboration | 2009 [53] |
Giardia duodenalis ATCC 50803 (Giardia duodenalis assemblage A) | Parasitic protozoan | Human pathogen (Giardiasis) | 11.7 Mb | 6,470, [54] 7,153 (UniProt) | Karolinska Institutet, Marine Biological Laboratory | 2007 [54] |
Leishmania braziliensis MHOM/BR/75M2904 | Parasitic protozoan | Human pathogen (Leishmaniasis) | 33 Mb | 8,314 [55] | Sanger Institute, Universidade de São Paulo, Imperial College | 2007 [55] |
Leishmania infantum JPCM5 | Parasitic protozoan | Human pathogen (Visceral leishmaniasis) | 33 Mb | 8,195 [55] | Sanger Institute, Imperial College and University of Glasgow | 2007 [55] |
Leishmania major Strain:Friedlin | Parasitic protozoan | Human pathogen (Cutaneous leishmaniasis) | 32.8 Mb | 8,272 [56] | Sanger Institute and Seattle Biomedical Research Institute | 2005 [56] |
Naegleria gruberi | Amoeboflagellate | Diverged from other eukaryotes over 1 billion years ago | 41 Mb [57] | 15,727 [57] | | 2010 [57] |
Trichomonas vaginalis | Parasitic protozoan | Human pathogen (Trichomoniasis) | 160 Mb | 59,681 [58] | TIGR | 2007 [58] |
Trypanosoma brucei Strain:TREU927/4 GUTat10.1 | Parasitic protozoan | Human pathogen (Sleeping sickness) | 26 Mb | 9,068 [59] | Sanger Institute and TIGR | 2005 [59] |
Trypanosoma cruzi Strain:CL Brener TC3 | Parasitic protozoan | Human pathogen (Chagas disease) | 34 Mb | 22,570 [60] | TIGR, Seattle Biomedical Research Institute and Uppsala University | 2005 [60] |
Opisthokonts are a group of eukaryotes that include both animals and fungi as well as basal groups that are not classified in these groups. These basal opisthokonts are reasonably categorized as protists and include choanoflagellates, which are the sister or near-sister group of animals.
Organism | Type | Relevance | Genome size | Number of genes predicted | Organization | Year of completion |
---|---|---|---|---|---|---|
Monosiga brevicollis | Choanoflagellate | close relative of metazoans | 41.6 Mb | 9,200 [61] | Joint Genome Institute | 2007 [61] |