Eukaryotic chromosome fine structure refers to the organization of the DNA sequences that make up eukaryotic chromosomes. Some sequences belong to more than one class, so the classes listed here are not mutually exclusive.
Some sequences are required for a properly functioning chromosome:
Throughout the eukaryotes, the overall structure of chromosome ends is conserved and is characterized by the telomeric tract, a series of short G-rich repeats. This is followed by an extensive subtelomeric region consisting of repeats of various types and lengths, the telomere-associated sequences (TAS). [1] These regions generally have low gene density, low transcription, and low recombination, and they replicate late. They help protect the chromosome end from degradation and end-to-end fusion, and they are involved in completing replication. The subtelomeric repeats can rescue chromosome ends when telomerase fails, buffer subtelomerically located genes against transcriptional silencing, and protect the genome from deleterious rearrangements caused by ectopic recombination. They may also serve as filler that increases chromosome size to a minimum threshold needed for chromosome stability, provide a location for the adaptive amplification of genes, and take part in a secondary, recombination-based mechanism of telomere maintenance when telomerase activity is absent.
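The idea of a telomeric tract as a run of short repeats at the chromosome end can be illustrated with a minimal Python sketch. It assumes the vertebrate repeat unit TTAGGG as the default and a sequence written 5' to 3' toward the chromosome end; the function name `count_terminal_repeats` is hypothetical, and real telomeres contain variant repeats that this exact-match count would miss.

```python
def count_terminal_repeats(seq, unit="TTAGGG"):
    """Count exact, back-to-back copies of `unit` at the 3' end of `seq`.

    Assumption: `seq` is a DNA string read 5'->3' toward the chromosome end,
    and only perfect copies of the repeat unit are counted.
    """
    if not unit:
        raise ValueError("repeat unit must not be empty")
    seq = seq.upper()
    unit = unit.upper()
    count = 0
    end = len(seq)
    # Walk backwards from the end, one repeat unit at a time.
    while end >= len(unit) and seq[end - len(unit):end] == unit:
        count += 1
        end -= len(unit)
    return count


if __name__ == "__main__":
    toy_end = "ACGTACGT" + "TTAGGG" * 4  # made-up subtelomeric stub plus 4 repeats
    print(count_terminal_repeats(toy_end))
```

The sketch stops at the first mismatch, so it measures only the uninterrupted terminal tract, not the subtelomeric repeats behind it.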
Other sequences are used in replication, or contribute to the physical structure of the chromosome during interphase.
Regions of the genome with protein-coding genes include several elements:
Many regions of the DNA are transcribed with RNA as the functional form:
Other RNAs are transcribed and not translated, but have undiscovered functions.
Repeated sequences are of two basic types: unique sequences that are repeated in one area; and repeated sequences that are interspersed throughout the genome.
Satellites are unique sequences that are repeated in tandem in one area. Depending on the length of the repeat, they are classified as either:
Interspersed sequences are nonadjacent repeats, with sequences that are found dispersed across the genome. They can be classified based on the length of the repeat as:
Both of these types are classified as retrotransposons.
Retrotransposons are sequences in the DNA that are the result of retrotransposition of RNA. LINEs and SINEs are examples where the sequences are repeats, but there are non-repeated sequences that can also be retrotransposons.
Typical eukaryotic chromosomes contain much more DNA than is classified in the categories above. This DNA may serve as spacing or have some other as-yet-unknown function, or it may simply be random sequence of no consequence.
In the fields of molecular biology and genetics, a genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA. The nuclear genome includes protein-coding genes and non-coding genes, other functional regions of the genome such as regulatory sequences, and often a substantial fraction of junk DNA with no evident function. Almost all eukaryotes have mitochondria and a small mitochondrial genome. Algae and plants also contain chloroplasts with a chloroplast genome.
An intron is any nucleotide sequence within a gene that is not expressed or operative in the final RNA product. The word intron is derived from the term intragenic region, i.e., a region inside a gene. The term intron refers to both the DNA sequence within a gene and the corresponding RNA sequence in RNA transcripts. The non-intron sequences that become joined by this RNA processing to form the mature RNA are called exons.
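The relationship between introns and exons described here can be sketched as a string operation: remove the intron intervals from a transcript and join what remains. This is a toy model only. The function name `splice` and the 0-based, end-exclusive coordinate convention are assumptions made for illustration, and the sketch ignores splice-site signals and the splicing machinery entirely.

```python
def splice(pre_mrna, introns):
    """Return the mature sequence left after removing intron intervals.

    `introns` is a list of (start, end) pairs, 0-based and end-exclusive
    (a hypothetical convention chosen for this sketch).
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or end > len(pre_mrna) or start >= end:
            raise ValueError("introns must be non-overlapping and inside the transcript")
        exons.append(pre_mrna[position:start])  # exon segment before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon segment
    return "".join(exons)


if __name__ == "__main__":
    # Made-up transcript: exon AAA, intron CCC, exon GGGUUU
    print(splice("AAACCCGGGUUU", [(3, 6)]))
```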
A reverse transcriptase (RT) is an enzyme used to generate complementary DNA (cDNA) from an RNA template, a process termed reverse transcription. Reverse transcriptases are used by viruses such as HIV and hepatitis B to replicate their genomes, by retrotransposon mobile genetic elements to proliferate within the host genome, and by eukaryotic cells to extend the telomeres at the ends of their linear chromosomes. Contrary to a widely held belief, the process does not violate the flows of genetic information as described by the classical central dogma, as transfers of information from RNA to DNA are explicitly held possible.
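At the sequence level, the cDNA strand made by reverse transcription is the DNA reverse complement of the RNA template. The following minimal sketch shows only that base-pairing rule; the function name `reverse_transcribe` is hypothetical, and priming, strand switching, and second-strand synthesis are not modeled.

```python
# Watson-Crick pairing from an RNA template base to the cDNA base.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna):
    """Return the first-strand cDNA (written 5'->3') for an RNA given 5'->3'."""
    template = rna.upper()
    try:
        # The new strand is antiparallel to the template, so read it in reverse.
        return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(template))
    except KeyError as err:
        raise ValueError(f"not an RNA base: {err.args[0]!r}") from None


if __name__ == "__main__":
    print(reverse_transcribe("AUGC"))
```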
A transposable element is a nucleic acid sequence in DNA that can change its position within a genome, sometimes creating or reversing mutations and altering the cell's genetic identity and genome size. Transposition often results in duplication of the same genetic material. In the human genome, L1 and Alu elements are two examples. Barbara McClintock's discovery of transposable elements earned her a Nobel Prize in 1983. Transposable elements are becoming increasingly relevant to personalized medicine, and they are gaining attention in data analytics given the difficulty of analysis in very high-dimensional spaces.
The human genome is a complete set of nucleic acid sequences for humans, encoded as DNA within the 23 chromosome pairs in cell nuclei and in a small DNA molecule found within individual mitochondria. These are usually treated separately as the nuclear genome and the mitochondrial genome. Human genomes include both protein-coding DNA sequences and various types of DNA that does not encode proteins. The latter is a diverse category that includes DNA coding for non-translated RNA, such as that for ribosomal RNA, transfer RNA, ribozymes, small nuclear RNAs, and several types of regulatory RNAs. It also includes promoters and their associated gene-regulatory elements, DNA playing structural and replicatory roles, such as scaffolding regions, telomeres, centromeres, and origins of replication, plus large numbers of transposable elements, inserted viral DNA, non-functional pseudogenes and simple, highly repetitive sequences. Introns make up a large percentage of non-coding DNA. Some of this non-coding DNA is non-functional junk DNA, such as pseudogenes, but there is no firm consensus on the total amount of junk DNA.
Non-coding DNA (ncDNA) sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules. Other functional regions of the non-coding DNA fraction include regulatory sequences that control gene expression; scaffold attachment regions; origins of DNA replication; centromeres; and telomeres. Some non-coding regions appear to be mostly nonfunctional such as introns, pseudogenes, intergenic DNA, and fragments of transposons and viruses.
The coding region of a gene, also known as the coding sequence (CDS), is the portion of a gene's DNA or RNA that codes for protein. Comparing the length, composition, regulation, splicing, structure, and function of coding regions with those of non-coding regions, across species and over time, can provide important information about gene organization and the evolution of prokaryotes and eukaryotes. This can further assist in mapping the human genome and developing gene therapy.
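As a rough illustration of what "codes for protein" means at the sequence level, the sketch below locates a candidate coding sequence and translates it. It rests on several simplifying assumptions: the input is an intron-free sense-strand DNA string, the CDS begins at the first ATG, and the standard genetic code applies. The names `find_cds` and `translate` are hypothetical, and real gene annotation is considerably more involved.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def find_cds(dna):
    """Return the span from the first ATG through the first in-frame stop codon.

    Returns None if there is no ATG or no in-frame stop codon after it.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    for i in range(start, len(dna) - 2, 3):
        if CODON_TABLE.get(dna[i:i + 3]) == "*":
            return dna[start:i + 3]
    return None


def translate(cds):
    """Translate a CDS into one-letter amino acids, stopping at a stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3].upper(), "X")  # X marks an unknown codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    cds = find_cds("CCATGGCTTAAGG")  # made-up example sequence
    print(cds, translate(cds))
```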
Repeated sequences are short or long patterns of nucleic acids that occur in multiple copies throughout the genome. In many organisms, a significant fraction of the genomic DNA is repetitive, with over two-thirds of the sequence consisting of repetitive elements in humans. Some of these repeated sequences are necessary for maintaining important genome structures such as telomeres or centromeres.
Retrotransposons are transposable elements that copy and paste themselves into different genomic locations through an RNA intermediate: the element is transcribed into RNA, and the RNA is then converted back into DNA by reverse transcription.
A primary transcript is the single-stranded ribonucleic acid (RNA) product synthesized by transcription of DNA, and processed to yield various mature RNA products such as mRNAs, tRNAs, and rRNAs. The primary transcripts designated to be mRNAs are modified in preparation for translation. For example, a precursor mRNA (pre-mRNA) is a type of primary transcript that becomes a messenger RNA (mRNA) after processing.
Subtelomeres are segments of DNA between telomeric caps and chromatin.
In biology, the word gene can have several different meanings. The Mendelian gene is a basic unit of heredity and the molecular gene is a sequence of nucleotides in DNA that is transcribed to produce a functional RNA. There are two types of molecular genes: protein-coding genes and noncoding genes.
Exon shuffling is a molecular mechanism for the formation of new genes. It is a process through which two or more exons from different genes can be brought together ectopically, or the same exon can be duplicated, to create a new exon-intron structure. There are different mechanisms through which exon shuffling occurs: transposon mediated exon shuffling, crossover during sexual recombination of parental genomes and illegitimate recombination.
60S ribosomal protein L41 is a protein that is specific to humans and is encoded by the RPL41 gene, also known as HG12 and large eukaryotic ribosomal subunit protein eL41. In the HGNC classification, the gene belongs to the L ribosomal proteins family. The protein itself is also described as P62945-RL41_HUMAN on the GeneCards database. The RPL41 gene is located on chromosome 12.
Numerous key discoveries in biology have emerged from studies of RNA, including seminal work in the fields of biochemistry, genetics, microbiology, molecular biology, molecular evolution and structural biology. As of 2010, 30 scientists have been awarded Nobel Prizes for experimental work that includes studies of RNA. Specific discoveries of high biological significance are discussed in this article.
A conserved non-coding sequence (CNS) is a DNA sequence of noncoding DNA that is evolutionarily conserved. These sequences are of interest for their potential to regulate gene production.
Telomeric repeat–containing RNA (TERRA) is a long non-coding RNA transcribed from telomeres - repetitive nucleotide regions found on the ends of chromosomes that function to protect DNA from deterioration or fusion with neighboring chromosomes. TERRA has been shown to be ubiquitously expressed in almost all cell types containing linear chromosomes - including humans, mice, and yeasts. While the exact function of TERRA is still an active area of research, it is generally believed to play a role in regulating telomerase activity as well as maintaining the heterochromatic state at the ends of chromosomes. TERRA interaction with other associated telomeric proteins has also been shown to help regulate telomere integrity in a length-dependent manner.
Short interspersed nuclear elements (SINEs) are non-autonomous, non-coding transposable elements (TEs) that are about 100 to 700 base pairs in length. They are a class of retrotransposons, DNA elements that amplify themselves throughout eukaryotic genomes, often through RNA intermediates. SINEs compose about 13% of the mammalian genome.
This glossary of genetics is a list of definitions of terms and concepts commonly used in the study of genetics and related disciplines in biology, including molecular biology, cell biology, and evolutionary biology. It is intended as introductory material for novices; for more specific and technical detail, see the article corresponding to each term. For related terms, see Glossary of evolutionary biology.