Legend of nucleobases | |
---|---|
Code | Nucleotide represented |
A | Adenine (A) |
C | Cytosine (C) |
G | Guanine (G) |
T | Thymine (T) |
N | A, C, G or T |
M | A or C |
R | A or G |
W | A or T |
Y | C or T |
S | C or G |
K | G or T |
H | A, C or T |
B | C, G or T |
V | A, C or G |
D | A, G or T |
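The legend above can be read as a lookup table from each code to the set of bases it stands for. The following is a minimal Python sketch of that idea; the names `IUPAC` and `matches` are illustrative choices, not part of any particular library, and the concrete test sequence is an arbitrary example rather than a documented target site.

```python
# Codes transcribed from the legend table above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "ACGT",
    "M": "AC", "R": "AG", "W": "AT", "Y": "CT", "S": "CG", "K": "GT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if a concrete A/C/G/T sequence fits a degenerate pattern."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


# Top strand of the PI-TliI recognition sequence as listed in the table,
# checked against one arbitrary concrete sequence (hypothetical example).
pattern = "TAYGCNGAYACNGACGGYTTYT"
print(matches(pattern, "TACGCAGATACGGACGGCTTTT"))
```

Only the PI-TliI row of the table uses degenerate codes; for the other rows the pattern and the sequence would have to be identical for `matches` to succeed.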
The homing endonucleases are a special type of restriction enzyme encoded by introns or inteins. They act on the DNA of the cell that synthesizes them; to be precise, on the opposite allele of the gene that encodes them. [1]
The list includes some of the most studied examples. The SF (structural family) column uses the following codes:
H1: LAGLIDADG family
H2: GIY-YIG family
H3: H-N-H family
H4: His-Cys box family
H5: PD-(D/E)xK
H6: EDxHD
(Further reading: Homing endonuclease § Structural families.)

Enzyme | SF | PDB code | Source | D | SCL | Recognition sequence | Cut |
---|---|---|---|---|---|---|---|
I-AniI [2] | H1 | 1P8K | Aspergillus nidulans | E | mito | 5' TTGAGGAGGTTTCTCTGTAAATAA 3' AACTCCTCCAAAGAGACATTTATT | 5' ---TTGAGGAGGTTTC TCTGTAAATAA--- 3' 3' ---AACTCCTCC AAAGAGACATTTATT--- 5' |
I-CeuI [3] [4] [5] [6] | H1 | 2EX5 | Chlamydomonas eugametos | E | chloro | 5' TAACTATAACGGTCCTAAGGTAGCGA 3' ATTGATATTGCCAGGATTCCATCGCT | 5' ---TAACTATAACGGTCCTAA GGTAGCGA--- 3' 3' ---ATTGATATTGCCAG GATTCCATCGCT--- 5' |
I-ChuI [7] [8] | H1 | Q32001 | Chlamydomonas humicola | E | chloro | 5' GAAGGTTTGGCACCTCGATGTCGGCTCATC 3' CTTCCAAACCGTGGAGCTACAGCCGAGTAG | 5' ---GAAGGTTTGGCACCTCG ATGTCGGCTCATC--- 3' 3' ---CTTCCAAACCGTG GAGCTACAGCCGAGTAG--- 5' |
I-CpaI [8] [9] | H1 | Q39562 | Chlamydomonas pallidostigmata | E | chloro | 5' CGATCCTAAGGTAGCGAAATTCA 3' GCTAGGATTCCATCGCTTTAAGT | 5' ---CGATCCTAAGGTAGCGAA ATTCA--- 3' 3' ---GCTAGGATTCCATC GCTTTAAGT--- 5' |
I-CpaII [10] | H1 | Q39559 | Chlamydomonas pallidostigmata | E | chloro | 5' CCCGGCTAACTCTGTGCCAG 3' GGGCCGATTGAGACACGGTC | 5' ---CCCGGCTAACTC TGTGCCAG--- 3' 3' ---GGGCCGAT TGAGACACGGTC--- 5' |
I-CreI [11] | H1 | 1BP7 | Chlamydomonas reinhardtii | E | chloro | 5' CTGGGTTCAAAACGTCGTGAGACAGTTTGG 3' GACCCAAGTTTTGCAGCACTCTGTCAAACC | 5' ---CTGGGTTCAAAACGTCGTGA GACAGTTTGG--- 3' 3' ---GACCCAAGTTTTGCAG CACTCTGTCAAACC--- 5' |
I-DmoI | H1 | 1B24 | Desulfurococcus mobilis | A | chrm | 5' ATGCCTTGCCGGGTAAGTTCCGGCGCGCAT 3' TACGGAACGGCCCATTCAAGGCCGCGCGTA | 5' ---ATGCCTTGCCGGGTAA GTTCCGGCGCGCAT--- 3' 3' ---TACGGAACGGCC CATTCAAGGCCGCGCGTA--- 5' |
H-DreI [12] | H1 | 1MOW | Hybrid: I-DmoI and I-CreI | A E | | 5' CAAAACGTCGTAAGTTCCGGCGCG 3' GTTTTGCAGCATTCAAGGCCGCGC | 5' ---CAAAACGTCGTAA GTTCCGGCGCG--- 3' 3' ---GTTTTGCAG CATTCAAGGCCGCGC--- 5' |
I-HmuI [13] [14] | H3 | 1U3E | Bacillus subtilis phage SP01 | B | phage | 5' AGTAATGAGCCTAACGCTCAGCAA 3' TCATTACTCGGATTGCGAGTCGTT | Nicking endonuclease : * 3' ---TCATTACTCGGATTGC GAGTCGTT--- 5' |
I-HmuII [14] [15] | H3 | Q38137 | Bacillus subtilis phage SP82 | B | phage | 5' AGTAATGAGCCTAACGCTCAACAA 3' TCATTACTCGGATTGCGAGTTGTT | Nicking endonuclease : * 3' ---TCATTACTCGGATTGCGAGTTGTTN35 NNNN--- 5' |
I-LlaI [16] [17] | H3 | P0A3U1 | Lactococcus lactis | B | chrm | 5' CACATCCATAACCATATCATTTTT 3' GTGTAGGTATTGGTATAGTAAAAA | 5' ---CACATCCATAA CCATATCATTTTT--- 3' 3' ---GTGTAGGTATTGGTATAGTAA AAA--- 5' |
I-MsoI | H1 | 1M5X | Monomastix sp. | E | | 5' CTGGGTTCAAAACGTCGTGAGACAGTTTGG 3' GACCCAAGTTTTGCAGCACTCTGTCAAACC | 5' ---CTGGGTTCAAAACGTCGTGA GACAGTTTGG--- 3' 3' ---GACCCAAGTTTTGCAG CACTCTGTCAAACC--- 5' |
PI-PfuI | H1 | 1DQ3 | Pyrococcus furiosus Vc1 | A | | 5' GAAGATGGGAGGAGGGACCGGACTCAACTT 3' CTTCTACCCTCCTCCCTGGCCTGAGTTGAA | 5' ---GAAGATGGGAGGAGGG ACCGGACTCAACTT--- 3' 3' ---CTTCTACCCTCC TCCCTGGCCTGAGTTGAA--- 5' |
PI-PkoII | H1 | 2CW7 | Pyrococcus kodakarensis BAA-918 | A | | 5' CAGTACTACGGTTAC 3' GTCATGATGCCAATG | 5' ---CAGTACTACG GTTAC--- 3' 3' ---GTCATG ATGCCAATG--- 5' |
I-PorI [18] [19] | H3 | | Pyrobaculum organotrophum | A | chrm | 5' GCGAGCCCGTAAGGGTGTGTACGGG 3' CGCTCGGGCATTCCCACACATGCCC | 5' ---GCGAGCCCGTAAGGGT GTGTACGGG--- 3' 3' ---CGCTCGGGCATT CCCACACATGCCC--- 5' |
I-PpoI | H4 | 1EVX | Physarum polycephalum | E | plasmid | 5' TAACTATGACTCTCTTAAGGTAGCCAAAT 3' ATTGATACTGAGAGAATTCCATCGGTTTA | 5' ---TAACTATGACTCTCTTAA GGTAGCCAAAT--- 3' 3' ---ATTGATACTGAGAG AATTCCATCGGTTTA--- 5' |
PI-PspI | H1 | Q51334 | Pyrococcus sp. | A | chrm | 5' TGGCAAACAGCTATTATGGGTATTATGGGT 3' ACCGTTTGTCGATAATACCCATAATACCCA | 5' ---TGGCAAACAGCTATTAT GGGTATTATGGGT--- 3' 3' ---ACCGTTTGTCGAT AATACCCATAATACCCA--- 5' |
I-ScaI [20] [21] | H1 | P03873 | Saccharomyces capensis | E | mito | 5' TGTCACATTGAGGTGCACTAGTTATTAC 3' ACAGTGTAACTCCACGTGATCAATAATG | 5' ---TGTCACATTGAGGTGCACT AGTTATTAC--- 3' 3' ---ACAGTGTAACTCCAC GTGATCAATAATG--- 5' |
I-SceI [4] [5] | H1 | 1R7M | Saccharomyces cerevisiae | E | mito | 5' AGTTACGCTAGGGATAACAGGGTAATATAG 3' TCAATGCGATCCCTATTGTCCCATTATATC | 5' ---AGTTACGCTAGGGATAA CAGGGTAATATAG--- 3' 3' ---TCAATGCGATCCC TATTGTCCCATTATATC--- 5' |
PI-SceI [22] [23] | H1 | 1VDE | Saccharomyces cerevisiae | E | | 5' ATCTATGTCGGGTGCGGAGAAAGAGGTAATGAAATGGCA 3' TAGATACAGCCCACGCCTCTTTCTCCATTACTTTACCGT | 5' ---ATCTATGTCGGGTGC GGAGAAAGAGGTAATGAAATGGCA--- 3' 3' ---TAGATACAGCC CACGCCTCTTTCTCCATTACTTTACCGT--- 5' |
I-SceII [24] [25] [26] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TTTTGATTCTTTGGTCACCCTGAAGTATA 3' AAAACTAAGAAACCAGTGGGACTTCATAT | 5' ---TTTTGATTCTTTGGTCACCC TGAAGTATA--- 3' 3' ---AAAACTAAGAAACCAG TGGGACTTCATAT--- 5' |
I-SceIII [24] [27] [28] | H1 | | Saccharomyces cerevisiae | E | mito | 5' ATTGGAGGTTTTGGTAACTATTTATTACC 3' TAACCTCCAAAACCATTGATAAATAATGG | 5' ---ATTGGAGGTTTTGGTAAC TATTTATTACC--- 3' 3' ---TAACCTCCAAAACC ATTGATAAATAATGG--- 5' |
I-SceIV [24] [29] [30] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TCTTTTCTCTTGATTAGCCCTAATCTACG 3' AGAAAAGAGAACTAATCGGGATTAGATGC | 5' ---TCTTTTCTCTTGATTA GCCCTAATCTACG--- 3' 3' ---AGAAAAGAGAAC TAATCGGGATTAGATGC--- 5' |
I-SceV [24] [31] | H3 | | Saccharomyces cerevisiae | E | mito | 5' AATAATTTTCTTCTTAGTAATGCC 3' TTATTAAAAGAAGAATCATTACGG | 5' ---AATAATTTTCT TCTTAGTAATGCC--- 3' 3' ---TTATTAAAAGAAGAATCATTA CGG--- 5' |
I-SceVI [24] [32] | H3 | | Saccharomyces cerevisiae | E | mito | 5' GTTATTTAATGTTTTAGTAGTTGG 3' CAATAAATTACAAAATCATCAACC | 5' ---GTTATTTAATG TTTTAGTAGTTGG--- 3' 3' ---CAATAAATTACAAAATCATCA ACC--- 5' |
I-SceVII [20] | H1 | | Saccharomyces cerevisiae | E | mito | 5' TGTCACATTGAGGTGCACTAGTTATTAC 3' ACAGTGTAACTCCACGTGATCAATAATG | Unknown ** |
I-Ssp6803I | H5 | 2OST | Synechocystis sp. PCC 6803 | B | | 5' GTCGGGCTCATAACCCGAA 3' CAGCCCGAGTATTGGGCTT | 5' ---GTCGGGCT CATAACCCGAA--- 3' 3' ---CAGCCCGAGTA TTGGGCTT--- 5' |
I-TevI [33] [34] [35] | H2 | 1I3J | Escherichia coli phage T4 | B | phage | 5' AGTGGTATCAACGCTCAGTAGATG 3' TCACCATAGTTGCGAGTCATCTAC | 5' ---AGTGGTATCAAC GCTCAGTAGATG--- 3' 3' ---TCACCATAGT TGCGAGTCATCTAC--- 5' |
I-TevII [33] [36] | H2 | | Escherichia coli phage T4 | B | phage | 5' GCTTATGAGTATGAAGTGAACACGTTATTC 3' CGAATACTCATACTTCACTTGTGCAATAAG | 5' ---GCTTATGAGTATGAAGTGAACACGT TATTC--- 3' 3' ---CGAATACTCATACTTCACTTGTG CAATAAG--- 5' |
I-TevIII [37] | H3 | | Escherichia coli phage RB3 | B | phage | 5' TATGTATCTTTTGCGTGTACCTTTAACTTC 3' ATACATAGAAAACGCACATGGAAATTGAAG | 5' ---T ATGTATCTTTTGCGTGTACCTTTAACTTC--- 3' 3' ---AT ACATAGAAAACGCACATGGAAATTGAAG--- 5' |
PI-TliI [38] [39] | H1 | | Thermococcus litoralis | A | chrm | 5' TAYGCNGAYACNGACGGYTTYT 3' ATRCGNCTRTGNCTGCCTAARA | 5' ---TAYGCNGAYACNGACGG YTTYT--- 3' 3' ---ATRCGNCTRTGNC TGCCTAARA--- 5' |
PI-TliII [22] [39] [40] | H1 | | Thermococcus litoralis | A | chrm | 5' AAATTGCTTGCAAACAGCTATTACGGCTAT 3' TTTAACGAACGTTTGTCGATAATGCCGATA | Unknown ** |
I-Tsp061I | H1 | 2DCH | Thermoproteus sp. IC-061 | A | | 5' CTTCAGTATGCCCCGAAAC 3' GAAGTCATACGGGGCTTTG | 5' ---CTTCAGTAT GCCCCGAAAC--- 3' 3' ---GAAGT CATACGGGGCTTTG--- 5' |
I-Vdi141I | H1 | 3E54 | Vulcanisaeta distributa IC-141 | A | | 5' CCTGACTCTCTTAAGGTAGCCAAA 3' GGACTGAGAGAATTCCATCGGTTT | 5' ---CCTGACTCTCTTAA GGTAGCCAAA--- 3' 3' ---GGACTGAG AGAATTCCATCGGTTT--- 5' |
*: Nicking endonuclease: These enzymes cut only one DNA strand, leaving the other strand untouched.
**: Unknown cutting site: Researchers have not been able to determine the exact cutting site of these enzymes yet.
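The Cut column marks the cleavage position on each strand with a space, so the overhang left by a double-strand cut can be read off by comparing the two positions. The sketch below illustrates this in Python, assuming the top strand is written 5' to 3' and the bottom strand 3' to 5', as in the table; the function names are hypothetical, and rows marked as nicking endonucleases or "Unknown" are outside its scope.

```python
def complement(seq: str) -> str:
    """Base-by-base complement (not reversed) of an A/C/G/T string."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in seq)


def describe_cut(top: str, bottom: str) -> tuple[int, str]:
    """Return (overhang length, overhang type) for one table entry.

    top    : top strand, 5' to 3', with one space at the cut position
    bottom : bottom strand, 3' to 5', with one space at the cut position
    """
    top_left, top_right = top.split(" ")
    bottom_left, bottom_right = bottom.split(" ")
    # Sanity check: the two strands should be complementary.
    if complement(top_left + top_right) != bottom_left + bottom_right:
        raise ValueError("strands are not complementary")
    offset = len(top_left) - len(bottom_left)
    if offset > 0:
        kind = "3' overhang"   # top strand cut further to the right
    elif offset < 0:
        kind = "5' overhang"   # top strand cut further to the left
    else:
        kind = "blunt"
    return abs(offset), kind


# I-SceI entry from the table above.
print(describe_cut("AGTTACGCTAGGGATAA CAGGGTAATATAG",
                   "TCAATGCGATCCC TATTGTCCCATTATATC"))
```

For the I-SceI row, the top strand is cut after 17 bases and the bottom strand after 13, which this sketch reports as a 4-nucleotide 3' overhang.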
Databases and lists of restriction enzymes:
Restriction Enzyme Database.
The Intein Database and Registry. [41]
New England Biolabs enzyme finder.
Promega restriction enzymes webpage.
Databases of proteins:
RCSB Protein Data Bank.
Swiss-Prot is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. TrEMBL is a computer-annotated supplement of Swiss-Prot that contains all the translations of EMBL nucleotide sequence entries not yet integrated in Swiss-Prot.
A restriction enzyme, restriction endonuclease, REase, ENase or restrictase is an enzyme that cleaves DNA into fragments at or near specific recognition sites within molecules known as restriction sites. Restriction enzymes are one class of the broader endonuclease group of enzymes. Restriction enzymes are commonly classified into five types, which differ in their structure and in whether they cut their DNA substrate at their recognition site or whether the recognition and cleavage sites are separate from one another. To cut DNA, all restriction enzymes make two incisions, one through each sugar-phosphate backbone of the DNA double helix.
RNA splicing is a process in molecular biology where a newly-made precursor messenger RNA (pre-mRNA) transcript is transformed into a mature messenger RNA (mRNA). It works by removing all the introns and splicing back together exons. For nuclear-encoded genes, splicing occurs in the nucleus either during or immediately after transcription. For those eukaryotic genes that contain introns, splicing is usually needed to create an mRNA molecule that can be translated into protein. For many eukaryotic introns, splicing occurs in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs). There exist self-splicing introns, that is, ribozymes that can catalyze their own excision from their parent RNA molecule. The process of transcription, splicing and translation is called gene expression, the central dogma of molecular biology.
In molecular biology, RNA polymerase, or more specifically DNA-directed/dependent RNA polymerase (DdRP), is an enzyme that synthesizes RNA from a DNA template.
The restriction modification system is found in bacteria and other prokaryotic organisms, and provides a defense against foreign DNA, such as that borne by bacteriophages.
Ribonuclease H is a family of non-sequence-specific endonuclease enzymes that catalyze the cleavage of RNA in an RNA/DNA substrate via a hydrolytic mechanism. Members of the RNase H family can be found in nearly all organisms, from bacteria to archaea to eukaryotes.
Protein splicing is an intramolecular reaction of a particular protein in which an internal protein segment is removed from a precursor protein with a ligation of C-terminal and N-terminal external proteins on both sides. The splicing junction of the precursor protein is mainly a cysteine or a serine, which are amino acids containing a nucleophilic side chain. The protein splicing reactions which are known now do not require exogenous cofactors or energy sources such as adenosine triphosphate (ATP) or guanosine triphosphate (GTP). Normally, splicing is associated only with pre-mRNA splicing. This precursor protein contains three segments—an N-extein followed by the intein followed by a C-extein. After splicing has taken place, the resulting protein contains the N-extein linked to the C-extein; this splicing product is also termed an extein.
Marlene Belfort is an American biochemist known for her research on the factors that interrupt genes and proteins. She is a fellow of the American Academy of Arts and Sciences and has been admitted to the United States National Academy of Sciences.
In molecular biology, a twintron is an intron-within-intron excised by sequential splicing reactions. A twintron is presumably formed by the insertion of a mobile intron into an existing intron.
Group II introns are a large class of self-catalytic ribozymes and mobile genetic elements found within the genes of all three domains of life. Ribozyme activity can occur under high-salt conditions in vitro. However, assistance from proteins is required for in vivo splicing. In contrast to group I introns, intron excision occurs in the absence of GTP and involves the formation of a lariat, with an A-residue branchpoint strongly resembling that found in lariats formed during splicing of nuclear pre-mRNA. It is hypothesized that pre-mRNA splicing may have evolved from group II introns, due to the similar catalytic mechanism as well as the structural similarity of the Group II Domain V substructure to the U6/U2 extended snRNA. Finally, their ability to site-specifically insert into DNA sites has been exploited as a tool for biotechnology. For example, group II introns can be modified to make site-specific genome insertions and deliver cargo DNA such as reporter genes or lox sites.
I-CreI is a homing endonuclease whose gene was first discovered in the chloroplast genome of Chlamydomonas reinhardtii, a species of unicellular green algae. It is named for the facts that: it resides in an Intron; it was isolated from Chlamydomonas reinhardtii; it was the first (I) such gene isolated from C. reinhardtii. Its gene resides in a group I intron in the 23S ribosomal RNA gene of the C. reinhardtii chloroplast, and I-CreI is only expressed when its mRNA is spliced from the primary transcript of the 23S gene. I-CreI enzyme, which functions as a homodimer, recognizes a 22-nucleotide sequence of duplex DNA and cleaves one phosphodiester bond on each strand at specific positions. I-CreI is a member of the LAGLIDADG family of homing endonucleases, all of which have a conserved LAGLIDADG amino acid motif that contributes to their associative domains and active sites. When the I-CreI-containing intron encounters a 23S allele lacking the intron, I-CreI enzyme "homes" in on the "intron-minus" allele of 23S and effects its parent intron's insertion into the intron-minus allele. Introns with this behavior are called mobile introns. Because I-CreI provides for its own propagation while conferring no benefit on its host, it is an example of selfish DNA.
The homing endonucleases are a collection of endonucleases encoded either as freestanding genes within introns, as fusions with host proteins, or as self-splicing inteins. They catalyze the hydrolysis of genomic DNA within the cells that synthesize them, but do so at very few, or even singular, locations. Repair of the hydrolyzed DNA by the host cell frequently results in the gene encoding the homing endonuclease having been copied into the cleavage site, hence the term 'homing' to describe the movement of these genes. Homing endonucleases can thereby transmit their genes horizontally within a host population, increasing their allele frequency at greater than Mendelian rates.
Group I introns are large self-splicing ribozymes. They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of nine paired regions (P1-P9). These fold to essentially two domains – the P4-P6 domain and the P3-P9 domain. The secondary structure mark-up for this family represents only this conserved core. Group I introns often have long open reading frames inserted in loop regions.
60S ribosomal protein L7a is a protein that in humans is encoded by the RPL7A gene.
40S ribosomal protein S3 is a protein that in humans is encoded by the RPS3 gene.
PstI is a type II restriction endonuclease isolated from the Gram negative species, Providencia stuartii.
Meganucleases are endodeoxyribonucleases characterized by a large recognition site; as a result this site generally occurs only once in any given genome. For example, the 18-base pair sequence recognized by the I-SceI meganuclease would on average require a genome twenty times the size of the human genome to be found once by chance. Meganucleases are therefore considered to be the most specific naturally occurring restriction enzymes.
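The "twenty times" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes that all four bases are equally likely and independent, and takes the human genome as roughly 3.2 billion base pairs; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def genomes_needed(site_length: int, genome_size_bp: float) -> float:
    """Number of genomes of the given size expected to contain one random
    occurrence of a specific site of the given length.

    Assumes equal, independent base frequencies, so a specific site of
    length n is expected about once every 4**n base pairs.
    """
    return 4 ** site_length / genome_size_bp


HUMAN_GENOME_BP = 3.2e9  # assumption: approximate haploid human genome size

# 18-bp I-SceI site: 4**18 is about 6.9e10, i.e. roughly 21 human genomes.
print(round(genomes_needed(18, HUMAN_GENOME_BP), 1))
```

Under these assumptions the ratio comes out slightly above 21, in line with the rounded figure quoted in the text.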
Numerous key discoveries in biology have emerged from studies of RNA, including seminal work in the fields of biochemistry, genetics, microbiology, molecular biology, molecular evolution and structural biology. As of 2010, 30 scientists have been awarded Nobel Prizes for experimental work that includes studies of RNA. Specific discoveries of high biological significance are discussed in this article.
tRNA-intron lyase is an enzyme. As an endonuclease enzyme, tRNA-intron lyase is responsible for splicing phosphodiester bonds within non-coding ribonucleic acid chains. These non-coding RNA molecules form tRNA molecules after being processed, and this is dependent on tRNA-intron lyase to splice the pre-tRNA. tRNA processing is an important post-transcriptional modification necessary for tRNA maturation because it locates and removes introns in the pre-tRNA.