Nuclear organization refers to the spatial organization and dynamics of chromatin within a cell nucleus during interphase. There are many different levels and scales of nuclear organization.
At the smallest scale, DNA is packaged into units called nucleosomes, which compact DNA about 7-fold. In addition, nucleosomes protect DNA from damage and carry epigenetic information. Positions of nucleosomes determine accessibility of DNA to transcription factors.
At the intermediate scale, DNA looping can physically bring together DNA elements that would otherwise be separated by large distances. These interactions allow regulatory signals to cross over large genomic distances—for example, from enhancers to promoters.
At a larger scale, chromosomes are organized into two compartments labelled A ("active") and B ("inactive"), which are further subdivided into sub-compartments. [1] At the largest scale, entire chromosomes segregate into distinct regions called chromosome territories.
Chromosome organization is dynamic at all scales. [2] [3] Individual nucleosomes undergo constant thermal motion and nucleosome breathing. At intermediate scales, an active process of loop extrusion creates dynamic loops and Topologically Associating Domains (TADs).
Each human cell contains around two metres of DNA, which must be tightly folded to fit inside the cell nucleus. However, in order for the cell to function, proteins must be able to access the sequence information contained within the DNA, in spite of its tightly-packed nature. Hence, the cell has a number of mechanisms in place to control how DNA is organized. [4]
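The two-metre figure can be checked with a rough back-of-the-envelope calculation. The sketch below assumes values that are not stated in the text above: a rise of about 0.34 nm per base pair for B-form DNA and roughly 6.2 billion base pairs in a diploid human cell. The function name `contour_length_m` is purely illustrative.

```python
# Assumed values (not from the text above): B-form DNA rises about 0.34 nm
# per base pair, and a diploid human cell carries roughly 6.2e9 base pairs.
RISE_PER_BP_NM = 0.34
DIPLOID_GENOME_BP = 6.2e9


def contour_length_m(n_bp, rise_nm=RISE_PER_BP_NM):
    """Return the stretched-out length, in metres, of n_bp of linear DNA."""
    return n_bp * rise_nm * 1e-9  # convert nanometres to metres


length_m = contour_length_m(DIPLOID_GENOME_BP)
print(f"Approximate DNA length per cell: {length_m:.1f} m")
```

Under these assumptions the result is on the order of two metres, consistent with the figure quoted above.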
Moreover, nuclear organization can play a role in establishing cell identity. Cells within an organism have near identical nucleic acid sequences, but often exhibit different phenotypes. One way in which this individuality occurs is through changes in genome architecture, which can alter the expression of different sets of genes. [5] These alterations can have a downstream effect on cellular functions such as cell cycle facilitation, DNA replication, nuclear transport, and alteration of nuclear structure. Controlled changes in nuclear organization are essential for proper cellular function.
The organization of chromosomes into distinct regions within the nucleus was first proposed in 1885 by Carl Rabl. Later, in 1909, using the microscopy technology of the time, Theodor Boveri coined the term chromosome territories after observing that chromosomes occupy individually distinct nuclear regions. [6] Since then, mapping genome architecture has become a major topic of interest.
Over the last ten years, rapid methodological developments have greatly advanced understanding in this field. [4] Large-scale DNA organization can be assessed with DNA imaging using fluorescent tags, such as DNA fluorescence in situ hybridization (FISH), and specialized microscopes. [7] Additionally, high-throughput sequencing technologies such as Chromosome Conformation Capture-based methods can measure how often DNA regions are in close proximity. [8] At the same time, progress in genome-editing techniques (such as CRISPR/Cas9, ZFNs, and TALENs) has made it easier to test the organizational function of specific DNA regions and proteins. [9] There is also growing interest in the rheological properties of the interchromosomal space, studied by means of Fluorescence Correlation Spectroscopy and its variants. [10] [11]
Architectural proteins regulate chromatin structure by establishing physical interactions between DNA elements. [12] These proteins tend to be highly conserved across a majority of eukaryotic species. [13] [14]
In mammals, key architectural proteins include:
The organization of DNA within the nucleus begins with the 10 nm fiber, a "beads-on-a-string" structure [24] made of nucleosomes connected by 20-60bp linkers. A fiber of nucleosomes is interrupted by regions of accessible DNA, which are 100-1000bp long regions devoid of nucleosomes. Transcription factors bind within accessible DNA to displace nucleosomes and form cis-regulatory elements. Sites of accessible DNA are typically probed by ATAC-seq or DNase-Seq experimental methods.
A 30 nm fiber has long been proposed as the next layer of chromatin organization. While a 30 nm fiber is often visible in vitro under high salt concentration, [25] its existence in vivo has been questioned in many recent studies. [26] [27] [28] Instead, these studies point towards a disordered fiber with a width of 20 to 50 nm.
The process of loop extrusion by SMC complexes dynamically creates chromatin loops ranging in size from 50-100kb in yeast [29] to up to several Mb in mammals. [30] There is strong support for loop extrusion in yeast, mammals, and nematodes. [31]
In mammals, loop extrusion is responsible for the formation of topologically associating domains and loops between CTCF sites, as well as for bringing promoters and enhancers together. CTCF sites serve as boundaries of insulated neighborhoods or topologically associating domains.
The presence of loop extrusion in fruit flies is debated and the formation of DNA loops may be mediated by a different process of boundary element pairing. [32]
Self-interacting (or self-associating) domains are found in many organisms. In eukaryotes, they have usually been referred to as TADs irrespective of the mechanism of their formation. TADs have a higher ratio of chromosomal contacts within the domain than outside it. [33] They are formed with the help of architectural proteins. In many organisms, TADs correlate with regulation of gene expression, and enhancers and promoters within a TAD interact at higher frequency. [34]
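The statement that a domain has more contacts inside than outside can be made concrete with a small calculation on a contact matrix. The sketch below is only an illustration under simple assumptions: the matrix is a symmetric list of lists of contact counts between equally sized bins, the domain is given as a half-open bin interval, and the helper name `domain_contact_ratio` is hypothetical rather than part of any established tool.

```python
def domain_contact_ratio(contacts, start, end):
    """Mean contact count between pairs of bins inside [start, end),
    divided by the mean contact count between those bins and all bins
    outside the interval."""
    n = len(contacts)
    inside = [contacts[i][j]
              for i in range(start, end)
              for j in range(start, end) if i < j]
    outside = [contacts[i][j]
               for i in range(start, end)
               for j in range(n) if j < start or j >= end]
    if not inside or not outside or sum(outside) == 0:
        raise ValueError("ratio is undefined for this interval")
    mean_inside = sum(inside) / len(inside)
    mean_outside = sum(outside) / len(outside)
    return mean_inside / mean_outside


# Toy matrix with two 3-bin domains: 10 contacts within a domain, 2 across.
toy = [[10 if (i < 3) == (j < 3) else 2 for j in range(6)] for i in range(6)]
print(domain_contact_ratio(toy, 0, 3))  # 5.0
```

A ratio well above 1 for a candidate interval is consistent with a self-interacting domain. Real analyses typically also correct for the general decay of contact frequency with genomic distance, which this sketch omits.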
Lamina-associating domains (LADs) and nucleolar-associating domains (NADs) are regions of the chromosome that interact with the nuclear lamina and nucleolus, respectively.
Making up approximately 40% of the genome, LADs consist mostly of gene-poor regions and range from 40 kb to 30 Mb in size. [19] There are two known types of LADs: constitutive LADs (cLADs) and facultative LADs (fLADs). cLADs are A-T rich heterochromatin regions that remain at the lamina and are seen across many cell types and species. There is evidence that these regions are important to the structural formation of interphase chromosomes. On the other hand, fLADs have varying lamina interactions and contain genes that are either activated or repressed between individual cells, indicating cell-type specificity. [35] The boundaries of LADs, like those of self-interacting domains, are enriched in transcriptional elements and architectural protein binding sites. [19]
NADs, which constitute 4% of the genome, share nearly all of the same physical characteristics as LADs. In fact, DNA analysis of these two types of domains has shown that many sequences overlap, indicating that certain regions may switch between lamina-binding and nucleolus-binding. [36] NADs are associated with nucleolus function. The nucleolus is the largest sub-organelle within the nucleus and is the principal site for rRNA transcription. It also acts in signal recognition particle biosynthesis, protein sequestration, and viral replication. [37] The nucleolus forms around rDNA genes from different chromosomes. However, only a subset of rDNA genes is transcribed at a time, and it does so by looping into the interior of the nucleolus. The rest of the genes lie on the periphery of the sub-nuclear organelle in a silenced heterochromatin state. [36]
A/B compartments were first discovered in early Hi-C studies. [38] [39] Researchers noticed that the whole genome could be split into two spatial compartments, labelled "A" and "B", where regions in compartment A tend to interact preferentially with other A compartment-associated regions rather than with B compartment-associated ones. Similarly, regions in compartment B tend to associate with other B compartment-associated regions.
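The preferential A-with-A and B-with-B pattern can be illustrated by averaging contact counts for each type of bin pair. This is a minimal sketch that assumes compartment labels are already known for each bin; in practice, labels are typically derived from the contact map itself (for example, by an eigenvector analysis of a normalized matrix), which is not shown here. The helper name `mean_contacts_by_pair_type` and the toy data are illustrative assumptions.

```python
from itertools import combinations


def mean_contacts_by_pair_type(contacts, labels):
    """Average contact count for A-A, B-B and A-B bin pairs.

    contacts is a symmetric matrix (list of lists); labels is a sequence
    of "A" or "B", one per bin.
    """
    sums = {"AA": 0.0, "BB": 0.0, "AB": 0.0}
    counts = {"AA": 0, "BB": 0, "AB": 0}
    for i, j in combinations(range(len(labels)), 2):
        # Sort the two labels so that A-B and B-A pairs share one key.
        key = "".join(sorted(labels[i] + labels[j]))
        sums[key] += contacts[i][j]
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums if counts[k]}


# Toy data: bins alternate A, B, A, B; same-compartment pairs have 8
# contacts, mixed pairs have 3.
labels = ["A", "B", "A", "B"]
toy = [[8 if labels[i] == labels[j] else 3 for j in range(4)] for i in range(4)]
print(mean_contacts_by_pair_type(toy, labels))
```

In a compartmentalized genome, the A-A and B-B averages would be expected to exceed the A-B average, as they do in this toy example.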
A/B compartment-associated regions are on the multi-Mb scale and correlate with either open and expression-active chromatin ("A" compartments) or closed and expression-inactive chromatin ("B" compartments). [38] A compartments tend to be gene-rich, have high GC-content, contain histone markers for active transcription, and usually occupy the interior of the nucleus. They are also typically made up of self-interacting domains and contain early replication origins. B compartments, on the other hand, tend to be gene-poor, compact, contain histone markers for gene silencing, and lie on the nuclear periphery. They consist mostly of LADs and contain late replication origins. [38] In addition, higher-resolution Hi-C coupled with machine learning methods has revealed that A/B compartments can be refined into subcompartments. [40] [41]
The fact that compartments self-interact is consistent with the idea that the nucleus localizes proteins and other factors such as long non-coding RNA (lncRNA) in regions suited for their individual roles. [citation needed] An example of this is the presence of multiple transcription factories throughout the nuclear interior. [42] These factories are associated with elevated levels of transcription due to the high concentration of transcription-related components (such as transcription protein machinery, active genes, regulatory elements, and nascent RNA). Around 95% of active genes are transcribed within transcription factories. Each factory can transcribe multiple genes; these genes need not have similar product functions, nor do they need to lie on the same chromosome. Finally, the co-localization of genes within transcription factories is known to depend on cell type. [43]
The last level of organization concerns the distinct positioning of individual chromosomes within the nucleus. The region occupied by a chromosome is called a chromosome territory (CT). [44] Among eukaryotes, CTs have several common properties. First, although chromosomal locations are not the same across cells within a population, there is some preference among individual chromosomes for particular regions. For example, large, gene-poor chromosomes are commonly located on the periphery near the nuclear lamina, while smaller, gene-rich chromosomes group closer to the center of the nucleus. [45] Second, individual chromosome preference varies among different cell types. For example, the X chromosome has been shown to localize to the periphery more often in liver cells than in kidney cells. [46] Another conserved property of chromosome territories is that homologous chromosomes tend to be far apart from one another during interphase. The final characteristic is that the position of individual chromosomes during each cell cycle stays relatively the same until the start of mitosis. [47] The mechanisms and reasons behind these chromosome territory characteristics are still unknown, and further experimentation is needed.
The cell nucleus is a membrane-bound organelle found in eukaryotic cells. Eukaryotic cells usually have a single nucleus, but a few cell types, such as mammalian red blood cells, have no nuclei, and a few others including osteoclasts have many. The main structures making up the nucleus are the nuclear envelope, a double membrane that encloses the entire organelle and isolates its contents from the cellular cytoplasm; and the nuclear matrix, a network within the nucleus that adds mechanical support.
Chromatin is a complex of DNA and protein found in eukaryotic cells. The primary function is to package long DNA molecules into more compact, denser structures. This prevents the strands from becoming tangled and also plays important roles in reinforcing the DNA during cell division, preventing DNA damage, and regulating gene expression and DNA replication. During mitosis and meiosis, chromatin facilitates proper segregation of the chromosomes in anaphase; the characteristic shapes of chromosomes visible during this stage are the result of DNA being coiled into highly condensed chromatin.
A nucleosome is the basic structural unit of DNA packaging in eukaryotes. The structure of a nucleosome consists of a segment of DNA wound around eight histone proteins and resembles thread wrapped around a spool. The nucleosome is the fundamental subunit of chromatin. Each nucleosome is composed of a little less than two turns of DNA wrapped around a set of eight proteins called histones, which are known as a histone octamer. Each histone octamer is composed of two copies each of the histone proteins H2A, H2B, H3, and H4.
Euchromatin is a lightly packed form of chromatin that is enriched in genes, and is often under active transcription. Euchromatin stands in contrast to heterochromatin, which is tightly packed and less accessible for transcription. 92% of the human genome is euchromatic.
In molecular biology and genetics, transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. A single gene can be regulated in a range of ways, from altering the number of copies of RNA that are transcribed, to the temporal control of when the gene is transcribed. This control allows the cell or organism to respond to a variety of intra- and extracellular signals and thus mount a response. Some examples of this include producing the mRNA that encode enzymes to adapt to a change in a food source, producing the gene products involved in cell cycle specific activities, and producing the gene products responsible for cellular differentiation in multicellular eukaryotes, as studied in evolutionary developmental biology.
The nuclear lamina is a dense fibrillar network inside the nucleus of eukaryote cells. It is composed of intermediate filaments and membrane associated proteins. Besides providing mechanical support, the nuclear lamina regulates important cellular events such as DNA replication and cell division. Additionally, it participates in chromatin organization and it anchors the nuclear pore complexes embedded in the nuclear envelope.
In biology, the nuclear matrix is the network of fibres found throughout the inside of a cell nucleus after a specific method of chemical extraction. According to some it is somewhat analogous to the cell cytoskeleton. In contrast to the cytoskeleton, however, the nuclear matrix has been proposed to be a dynamic structure. Along with the nuclear lamina, it supposedly aids in organizing the genetic information within the cell.
An insulator is a type of cis-regulatory element known as a long-range regulatory element. Found in multicellular eukaryotes and working over distances from the promoter element of the target gene, an insulator is typically 300 bp to 2000 bp in length. Insulators contain clustered binding sites for sequence specific DNA-binding proteins and mediate intra- and inter-chromosomal interactions.
Cohesin is a protein complex that mediates sister chromatid cohesion, homologous recombination, and DNA looping. Cohesin is formed of SMC3, SMC1, SCC1 and SCC3. Cohesin holds sister chromatids together after DNA replication until anaphase when removal of cohesin leads to separation of sister chromatids. The complex forms a ring-like structure and it is believed that sister chromatids are held together by entrapment inside the cohesin ring. Cohesin is a member of the SMC family of protein complexes which includes Condensin, MukBEF and SMC-ScpAB.
Histone H2B is one of the 5 main histone proteins involved in the structure of chromatin in eukaryotic cells. Featuring a main globular domain and long N-terminal and C-terminal tails, H2B is involved with the structure of the nucleosomes.
Transcriptional repressor CTCF also known as 11-zinc finger protein or CCCTC-binding factor is a transcription factor that in humans is encoded by the CTCF gene. CTCF is involved in many cellular processes, including transcriptional regulation, insulator activity, V(D)J recombination and regulation of chromatin architecture.
Chromatin remodeling is the dynamic modification of chromatin architecture to allow the regulatory transcription machinery proteins access to condensed genomic DNA, and thereby control gene expression. Such remodeling is principally carried out by 1) covalent histone modifications by specific enzymes, e.g., histone acetyltransferases (HATs), deacetylases, methyltransferases, and kinases, and 2) ATP-dependent chromatin remodeling complexes which either move, eject or restructure nucleosomes. Besides actively regulating gene expression, dynamic remodeling of chromatin imparts an epigenetic regulatory role in several key biological processes, e.g., DNA replication and repair; apoptosis; chromosome segregation; and development and pluripotency. Aberrations in chromatin remodeling proteins are found to be associated with human diseases, including cancer. Targeting chromatin remodeling pathways is currently evolving as a major therapeutic strategy in the treatment of several cancers.
Chromosome conformation capture techniques are a set of molecular biology methods used to analyze the spatial organization of chromatin in a cell. These methods quantify the number of interactions between genomic loci that are nearby in 3-D space, but may be separated by many nucleotides in the linear genome. Such interactions may result from biological functions, such as promoter-enhancer interactions, or from random polymer looping, where undirected physical motion of chromatin causes loci to collide. Interaction frequencies may be analyzed directly, or they may be converted to distances and used to reconstruct 3-D structures.
Double-strand-break repair protein rad21 homolog is a protein that in humans is encoded by the RAD21 gene. RAD21, an essential gene, encodes a DNA double-strand break (DSB) repair protein that is evolutionarily conserved in all eukaryotes from budding yeast to humans. RAD21 protein is a structural component of the highly conserved cohesin complex consisting of RAD21, SMC1A, SMC3, and SCC3 [STAG1 (SA1) and STAG2 (SA2) in multicellular organisms] proteins, involved in sister chromatid cohesion.
Inner nuclear membrane proteins are membrane proteins that are embedded in or associated with the inner membrane of the nuclear envelope. There are about 60 INM proteins, most of which are poorly characterized with respect to structure and function. Among the few well-characterized INM proteins are lamin B receptor (LBR), lamina-associated polypeptide 1 (LAP1), lamina-associated polypeptide-2 (LAP2), emerin and MAN1.
Structural maintenance of chromosomes protein 1B (SMC-1B) is a protein that in humans is encoded by the SMC1B gene. SMC proteins engage in chromosome organization and can be broken into 3 groups based on function which are cohesins, condensins, and DNA repair. SMC-1B belongs to a family of proteins required for chromatid cohesion and DNA recombination during meiosis and mitosis. SMC1B protein appears to participate with other cohesins REC8, STAG3 and SMC3 in sister-chromatid cohesion throughout the whole meiotic process in human oocytes.
S/MARs, otherwise called SARs or MARs, are sequences in the DNA of eukaryotic chromosomes where the nuclear matrix attaches. As architectural DNA components that organize the genome of eukaryotes into functional units within the cell nucleus, S/MARs mediate structural organization of the chromatin within the nucleus. These elements constitute anchor points of the DNA for the chromatin scaffold and serve to organize the chromatin into structural domains. Studies on individual genes led to the conclusion that the dynamic and complex organization of the chromatin mediated by S/MAR elements plays an important role in the regulation of gene expression.
A topologically associating domain (TAD) is a self-interacting genomic region, meaning that DNA sequences within a TAD physically interact with each other more frequently than with sequences outside the TAD. The average size of a TAD is 1000 kb in humans, 880 kb in mouse cells, and 140 kb in fruit flies. Boundaries at both sides of these domains are conserved between different mammalian cell types and even across species and are highly enriched with CCCTC-binding factor (CTCF) and cohesin. In addition, some types of genes appear near TAD boundaries more often than would be expected by chance.
In mammalian biology, insulated neighborhoods are chromosomal loop structures formed by the physical interaction of two DNA loci bound by the transcription factor CTCF and co-occupied by cohesin. Insulated neighborhoods are thought to be structural and functional units of gene control because their integrity is important for normal gene regulation. Current evidence suggests that these structures form the mechanistic underpinnings of higher-order chromosome structures, including topologically associating domains (TADs). Insulated neighborhoods are functionally important in understanding gene regulation in normal cells and dysregulated gene expression in disease.
Loop extrusion is a major mechanism of nuclear organization. It is a dynamic process in which structural maintenance of chromosomes (SMC) protein complexes progressively grow loops of DNA or chromatin. In this process, SMC complexes, such as condensin or cohesin, bind to DNA or chromatin, use ATP-driven motor activity to reel in DNA, and as a result extrude the collected DNA as a loop.