Arabidopsis thaliana is a first-class model organism and the single most important species for fundamental research in plant molecular genetics.
A. thaliana was the first plant for which a high-quality reference genome sequence was determined and a worldwide research community has developed many other genetic resources and tools. The experimental advantages of A. thaliana have enabled many important discoveries. [1] [2] [3] [4] [5] These advantages have been extensively reviewed, [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] as has its role in fundamental discoveries about the plant immune system, [22] natural variation, [23] [24] root biology, [25] and other areas. [26]
A. thaliana was first described by Johannes Thal, and later renamed in his honor. [24] (See the Taxonomy section of the main article.) Friedrich Laibach outlined why A. thaliana might be a good experimental system in 1943 [27] and collected a large number of natural accessions. [6] [12] [13] [24] A. thaliana is largely self-pollinating, so these accessions represent inbred strains, with high homozygosity that simplifies genetic analysis. Natural A. thaliana accessions are often referred to as "ecotypes". Laibach had earlier (1907) determined the A. thaliana chromosome number (5) as part of his PhD research. [28] Laibach's student Erna Reinholz described mutagenesis of A. thaliana with X-ray radiation in 1945. [29]
George Rédei pioneered the use of A. thaliana for fundamental studies, mutagenizing plants with ethyl methanesulfonate (EMS) and then screening them for auxotrophic defects [5] and writing an influential review in 1975. [6] Rédei distributed the standard laboratory accessions 'Columbia-0' and 'Landsberg erecta'. [8] [18]
Gerhard Röbbelen organized the first International Arabidopsis Symposium in 1965. [13] Röbbelen also started the 'Arabidopsis Information Service', a newsletter for sharing information in the community. [30] This newsletter was maintained by A.R. Kranz starting in 1974, and was published until 1990. [13]
As molecular biology methods progressed, many investigators sought to focus community effort on a common model plant species such as petunia or tomato. [12] [13] This shifted emphasis away from the long tradition of working with diverse agronomically important species such as maize, barley, and peas. [13] The A. thaliana subcommunity espoused an ethos of freely sharing information and materials, and investigators were attracted by the perceived wide-open nature of plant molecular genetics relative to other fields that were better established and thus more "crowded" and competitive. [15] The A. thaliana genome was shown to be relatively small and nonrepetitive, [31] [32] [33] which was an important advantage for early molecular methods. [13] Pioneering A. thaliana studies have used its natural filamentous pathogen Hyaloperonospora arabidopsidis, the model plant-pathogenic bacterium Pseudomonas syringae, and many other microbes. [22] A. thaliana roots are transparent and have a relatively simple radially symmetric cellular structure, facilitating analysis by microscopy. [34]
Cloning of an A. thaliana gene, an alcohol dehydrogenase-encoding locus, was described in 1986, [35] by which time mutations at over 200 loci had been defined. [7]
Development of genetic maps based on scorable phenotypes [36] and molecular genetic markers facilitated map-based cloning of mutant loci from classical "forward genetic" screens. [13] [14] [17] Growing amounts of DNA sequence data facilitated development and application of such molecular markers. [37] [38] Descriptions of the first successful map-based cloning projects were published in 1992. [39] [40]
Recombinant inbred strain/line (RIL) populations were developed, notably from a cross of Columbia-0 × Landsberg erecta, [41] and used to map and clone a wide variety of quantitative trait loci.
A. thaliana can be genetically transformed using Agrobacterium tumefaciens; transformation was first reported in 1986. [42] Later work showed that transgenic seed can be obtained by simply dipping flowers into a suitable bacterial suspension. The development of this 'floral dip' method, published in 1998, [43] made A. thaliana arguably the most easily transformed multicellular organism, and has been essential to many subsequent investigations. [13] Efficient transformation facilitated insertional mutagenesis [44] as described further below.
A. thaliana geneticists made important contributions to development of the ABC model of flower development via genetic analysis of floral homeotic mutants. [45] [46] [47] [48]
The plant homeodomain (PHD) finger is so named because it was discovered in an Arabidopsis homeodomain protein. In 1993 Schindler et al. discovered the PHD finger in the protein HAT3.1. [49] It has since proven important for chromatin function in a wide variety of taxa. [50]
KNOTTED-like homeobox genes, homologs of the maize KNOTTED1 gene that control shoot apical meristem identity, were described in 1994 [51] and cloning of the SHOOT-MERISTEMLESS locus was published in 1996. [52]
An international consortium began developing a physical map for A. thaliana in 1990, and DNA sequencing and assembly efforts were formalized in the Arabidopsis Genome Initiative (AGI) in 1996. [4] [10] This work paralleled the Human Genome Project and related projects for other model organisms, including the budding yeast S. cerevisiae, the nematode C. elegans, and the fly Drosophila melanogaster, whose genome sequences were published in 1996, [53] 1998, [54] and 2000, [55] respectively. The project built on efforts to sequence expressed sequence tags from A. thaliana. [56] [57] Descriptions of the sequences of chromosomes 4 and 2 were published in 1999, [58] [59] and the project was completed in 2000. [60] [61] [62] [63] This represented the first reference genome for a flowering plant and facilitated comparative genomics.
A series of meetings led to an ambitious long-term NSF-funded initiative to determine the function of every A. thaliana gene by the year 2010. [64] [65] The rationale for this project was to combine new high-throughput technologies with systematic gene-family-wide studies and community resources to accelerate progress beyond what was possible via piecemeal single-laboratory studies.
DNA microarray technology was rapidly adopted for A. thaliana research and led to the development of "atlases" of gene expression in different tissues and under different conditions.
The A. thaliana genome sequence, low-cost Sanger sequencing, and ease of transformation facilitated genome-wide mutagenesis, yielding collections of sequence-indexed transposon mutant and (especially) T-DNA mutant lines. [66] [67] The ease and speed of ordering mutant seed from stock centers dramatically accelerated "reverse genetic" study of many gene families; the Arabidopsis Biological Resource Center and the Nottingham Arabidopsis Stock Centre were important in this regard, and information on stock availability was integrated into The Arabidopsis Information Resource database. [26]
Syngenta developed and publicly shared a significant T-DNA mutant population, the Syngenta Arabidopsis Insertion Library (SAIL) collection. Industry investment in A. thaliana research suffered a setback with the closure of Syngenta's Torrey Mesa Research Institute (TMRI), [68] but remained robust. Mendel Biotechnology overexpressed the vast majority of A. thaliana transcription factors to generate leads for genetic engineering. Cereon Genomics, a subsidiary of Monsanto, sequenced the Landsberg erecta accession (at lower coverage than the Col-0 project) and shared the assembly, along with other sequence marker data. [38] [69] [70]
A. thaliana quickly became an important model for the study of plant small RNAs. The argonaute1 mutant, named for its resemblance to Argonauta octopuses, [71] was the namesake for the Argonaute protein family central to silencing. [16] Forward genetic screens focused on vegetative phase change uncovered many genes controlling small RNA biogenesis. Multiple groups identified mutations in the DICER-LIKE1 gene (encoding the main DICER protein controlling microRNA biogenesis in plants) that cause strong developmental defects. [72] A. thaliana became an important model for RNA-directed DNA methylation (transcriptional silencing), partly because many A. thaliana methylation mutants are viable, which is not the case for several model animals (in which such mutations cause lethality). [16]
As the NSF 2010 project neared completion, there was a perceived decrease in funding agency interest in A. thaliana, evidenced by the cessation of USDA funding for A. thaliana research [citation needed] and the end of NSF funding for the TAIR database. [73] This trend coincided with the progress of the (US NSF-supported) National Plant Genome Initiative, which began in 1998 and put an increased emphasis on crops. Draft genome sequences for rice were published in 2002 [74] [75] and followed by publications for sorghum [76] and maize [77] in 2009. A draft genome of the model tree Populus trichocarpa was published in 2006. [78] The draft genome of Brachypodium distachyon, a short-statured model grass (Poaceae), was published in 2010. [79] The Joint Genome Institute of the United States Department of Energy identified poplar, sorghum, B. distachyon, model C4 grass Setaria viridis (green foxtail), model moss Physcomitrella patens, model alga Chlamydomonas reinhardtii, and soybean as its "flagship" species for plant genomics geared towards bioenergy applications. [80]
Well established investigators including Ronald W. Davis, Gerald Fink, and Frederick M. Ausubel adopted A. thaliana as a model in the 1980s, attracting interest. [81] [9]
Elliot Meyerowitz and Chris R. Somerville were awarded the Balzan Prize in 2006 for their work developing A. thaliana as a model. [82] Thirteen prominent American A. thaliana geneticists were selected as investigators of the prestigious Howard Hughes Medical Institute and Gordon and Betty Moore Foundation in 2011: [83] [84] Philip Benfey, Dominique Bergmann, Simon Chan, Xuemei Chen, Jeff Dangl, Xinnian Dong, Joseph R. Ecker, Mark Estelle, Sheng Yang He, Robert A. Martienssen, Elliot Meyerowitz, Craig Pikaard, and Keiko Torii. (Also selected were wheat geneticist Jorge Dubcovsky and photosynthesis researcher Krishna Niyogi, who has extensively used A. thaliana along with the alga Chlamydomonas reinhardtii. [85]) Prior to this, a handful of A. thaliana geneticists had become HHMI investigators: Joanne Chory (1997, [86] also awarded a 2018 Breakthrough Prize in Life Sciences [87]), Daphne Preuss (2000-2006), [88] and Steve Jacobsen (2005). [89] Caroline Dean was awarded many honors including the 2020 Wolf Prize in Agriculture for "pioneering discoveries in flowering time control and epigenetic basis of vernalization" made with A. thaliana. [90]
A. thaliana continues to be the subject of intense study using new technologies such as high-throughput sequencing. Direct sequencing of cDNA ("RNA-Seq") largely replaced microarray analysis of gene expression, and several studies sequenced cDNA from single cells (scRNA-seq), particularly from root tissue. [25] Mapping of mutations from forward screens is increasingly done with direct genome sequencing, combined in some cases with bulked segregant analysis or backcrossing. [91] A. thaliana is a premier model for studies of the plant microbiome and natural genetic variation, [16] [23] [24] including genome-wide association studies. Short RNA-guided DNA editing with CRISPR tools has been applied to A. thaliana since 2013. [92]