The homing endonucleases are a collection of endonucleases encoded either as freestanding genes within introns, as fusions with host proteins, or as self-splicing inteins. They catalyze the hydrolysis of genomic DNA within the cells that synthesize them, but do so at very few, or even singular, locations. Repair of the hydrolyzed DNA by the host cell frequently results in the gene encoding the homing endonuclease having been copied into the cleavage site, hence the term 'homing' to describe the movement of these genes. Homing endonucleases can thereby transmit their genes horizontally within a host population, increasing their allele frequency at greater than Mendelian rates.
Although the origin and function of homing endonucleases are still being researched, the most established hypothesis considers them selfish genetic elements, [1] similar to transposons, because they perpetuate the genetic elements that encode them regardless of whether they provide any functional attribute to the host organism.
Homing endonuclease recognition sequences are long enough to occur randomly only with a very low probability (approximately once every 7×10⁹ bp), [2] and are normally found in one or very few instances per genome. Generally, owing to the homing mechanism, the gene encoding the endonuclease (the HEG, "homing endonuclease gene") is located within the recognition sequence which the enzyme cuts, thus interrupting the homing endonuclease recognition sequence and limiting DNA cutting only to sites that do not (yet) carry the HEG.
Prior to transmission, one allele carries the gene (HEG+) while the other does not (HEG−), and is therefore susceptible to being cut by the enzyme. Once the enzyme is synthesized, it breaks the chromosome in the HEG− allele, initiating a response from the cellular DNA repair system. The damage is repaired using recombination, taking the pattern of the opposite, undamaged DNA allele, HEG+, that contains the gene for the endonuclease. Thus, the gene is copied to the allele that initially did not have it and it is propagated through successive generations. [3] This process is called "homing". [3]
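The effect of homing on allele frequency can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not taken from the cited sources: a diploid, randomly mating, effectively infinite population, no fitness cost to carrying the HEG, and a hypothetical parameter `conversion` giving the fraction of HEG− alleles in heterozygotes that are converted to HEG+ before transmission.

```python
def next_frequency(p: float, conversion: float) -> float:
    """Return the HEG+ allele frequency after one generation.

    p          -- current HEG+ allele frequency (0..1)
    conversion -- assumed fraction of HEG- alleles converted to HEG+
                  in heterozygotes (0 = Mendelian, 1 = complete homing)
    """
    homozygous_plus = p * p            # HEG+/HEG+ individuals
    heterozygous = 2 * p * (1 - p)     # HEG+/HEG- individuals
    # Heterozygotes transmit HEG+ in (1 + conversion) / 2 of their gametes
    # instead of the Mendelian 1/2.
    return homozygous_plus + heterozygous * (1 + conversion) / 2


def simulate(p0: float, conversion: float, generations: int) -> list[float]:
    """Track the HEG+ allele frequency over a number of generations."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], conversion))
    return freqs


if __name__ == "__main__":
    for gen, freq in enumerate(simulate(0.01, 0.9, 12)):
        print(f"generation {gen:2d}: HEG+ frequency = {freq:.4f}")
```

With `conversion = 0` the frequency stays constant, as expected for Mendelian inheritance without selection; any positive value makes the HEG+ allele spread, which is the super-Mendelian behaviour described above.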
Homing endonucleases are always indicated with a prefix that identifies their genomic origin, followed by a hyphen: "I-" for homing endonucleases encoded within an intron, "PI-" (for "protein insert") for those encoded within an intein. Some authors have proposed using the prefix "F-" ("freestanding") for viral enzymes and other natural enzymes encoded by neither introns nor inteins, [4] and "H-" ("hybrid") for enzymes synthesized in a laboratory. [5] Next, a three-letter name is derived from the binominal name of the organism, taking one uppercase letter from the genus name and two lowercase letters from the specific name. (Some mixing is usually done for hybrid enzymes.) Finally, a Roman numeral distinguishes different enzymes found in the same organism.
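The naming scheme described above can be sketched as a small helper. The function name and its parameters are hypothetical, and the sketch assumes that the two lowercase letters are the first two letters of the specific name, which holds for I-CreI (Chlamydomonas reinhardtii) but is not stated as a general rule in the text.

```python
ALLOWED_PREFIXES = {"I", "PI", "F", "H"}  # intron, intein, freestanding, hybrid


def homing_endonuclease_name(prefix: str, genus: str, species: str, numeral: str) -> str:
    """Build a name such as 'I-CreI' from its components.

    Assumption: the species part contributes its first two letters.
    """
    if prefix not in ALLOWED_PREFIXES:
        raise ValueError(f"unknown prefix: {prefix!r}")
    organism_code = genus[0].upper() + species[:2].lower()
    return f"{prefix}-{organism_code}{numeral}"


if __name__ == "__main__":
    print(homing_endonuclease_name("I", "Chlamydomonas", "reinhardtii", "I"))
    print(homing_endonuclease_name("PI", "Saccharomyces", "cerevisiae", "I"))
```

Hybrid ("H-") enzymes mix letters from more than one organism, so they would need their organism code supplied directly rather than derived this way.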
Homing endonucleases differ from Type II restriction enzymes in several respects: [4]
Identifier | LAGLIDADG endonuclease
---|---
Symbol | LAGLIDADG_1
Pfam | PF00961
Pfam clan | CL0324
InterPro | IPR001982
CATH | 1af5
SCOP2 | 1af5 / SCOPe / SUPFAM

See clan entry for related Pfam families.

Identifier | GIY-YIG endonuclease, catalytic
---|---
Symbol | GIY-YIG
Pfam | PF01541
InterPro | IPR000305
PROSITE | PS50164
CATH | 1mk0
SCOP2 | 1mk0 / SCOPe / SUPFAM

Currently there are six known structural families. Their conserved structural motifs are: [4]

Identifier | Hom_end-associated Hint
---|---
Symbol | Hom_end_hint
Pfam | PF05203
Pfam clan | CL0363
InterPro | IPR007868
SCOP2 | 1gpp / SCOPe / SUPFAM

Intein motif of the larger LAGLIDADG Hom_end domain.

The yeast homing endonuclease PI-SceI is a LAGLIDADG-type endonuclease encoded as an intein that splices itself out of another protein (P17255). The high-resolution structure reveals two domains: an endonucleolytic centre resembling the C-terminal domain of Hedgehog proteins, and a Hint domain (Hedgehog/Intein) containing the protein-splicing active site. [31]
A restriction enzyme, restriction endonuclease, REase, ENase or restrictase is an enzyme that cleaves DNA into fragments at or near specific recognition sites within molecules known as restriction sites. Restriction enzymes are one class of the broader endonuclease group of enzymes. Restriction enzymes are commonly classified into five types, which differ in their structure and in whether they cut their DNA substrate at the recognition site or at a separate cleavage site. To cut DNA, all restriction enzymes make two incisions, one through each sugar-phosphate backbone of the DNA double helix.
The central dogma of molecular biology deals with the flow of genetic information within a biological system. It is often stated as "DNA makes RNA, and RNA makes protein", although this is not its original meaning. It was first stated by Francis Crick in 1957, then published in 1958:
The Central Dogma. This states that once "information" has passed into protein it cannot get out again. In more detail, the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information here means the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein.
Protein splicing is an intramolecular reaction of a particular protein in which an internal protein segment is removed from a precursor protein with a ligation of C-terminal and N-terminal external proteins on both sides. The splicing junction of the precursor protein is mainly a cysteine or a serine, which are amino acids containing a nucleophilic side chain. The protein splicing reactions which are known now do not require exogenous cofactors or energy sources such as adenosine triphosphate (ATP) or guanosine triphosphate (GTP). Normally, splicing is associated only with pre-mRNA splicing. This precursor protein contains three segments—an N-extein followed by the intein followed by a C-extein. After splicing has taken place, the resulting protein contains the N-extein linked to the C-extein; this splicing product is also termed an extein.
In molecular biology, endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain. Some, such as deoxyribonuclease I, cut DNA relatively nonspecifically, while many, typically called restriction endonucleases or restriction enzymes, cleave only at very specific nucleotide sequences. Endonucleases differ from exonucleases, which cleave the ends of recognition sequences instead of the middle (endo) portion. Some enzymes known as "exo-endonucleases", however, are not limited to either nuclease function, displaying qualities that are both endo- and exo-like. Evidence suggests that endonuclease activity experiences a lag compared to exonuclease activity.
The restriction endonuclease FokI, naturally found in Flavobacterium okeanokoites, is a bacterial type IIS restriction endonuclease consisting of an N-terminal DNA-binding domain and a non-sequence-specific DNA cleavage domain at the C-terminus. Once the protein is bound to duplex DNA via its DNA-binding domain at the 5'-GGATG-3' recognition site, the DNA cleavage domain is activated and cleaves the DNA at two locations, regardless of the nucleotide sequence at the cut site. The DNA is cut 9 nucleotides downstream of the motif on the forward strand, and 13 nucleotides downstream of the motif on the reverse strand, producing two sticky ends with 4-nucleotide overhangs.
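The cut geometry described above can be sketched in a few lines. This is an illustrative helper, not an established tool: the function name is hypothetical, positions are 0-based indices on the forward strand, and only sites written 5'-GGATG-3' on the forward strand are searched (sites on the reverse strand are ignored for brevity).

```python
SITE = "GGATG"
FORWARD_OFFSET = 9    # nucleotides downstream of the site, forward strand
REVERSE_OFFSET = 13   # nucleotides downstream of the site, reverse strand


def foki_cuts(sequence: str) -> list[tuple[int, int, int, str]]:
    """Return (site_start, forward_cut, reverse_cut, overhang) per site.

    Cut positions are indices *between* bases on the forward strand:
    a cut at index k separates sequence[:k] from sequence[k:].
    """
    sequence = sequence.upper()
    results = []
    start = sequence.find(SITE)
    while start != -1:
        site_end = start + len(SITE)
        forward_cut = site_end + FORWARD_OFFSET
        reverse_cut = site_end + REVERSE_OFFSET
        # Skip sites too close to the end of the sequence to be cut.
        if reverse_cut <= len(sequence):
            overhang = sequence[forward_cut:reverse_cut]
            results.append((start, forward_cut, reverse_cut, overhang))
        start = sequence.find(SITE, start + 1)
    return results


if __name__ == "__main__":
    example = "AA" + "GGATG" + "ACGTACGTA" + "CCTT" + "GG"
    for site_start, fwd, rev, overhang in foki_cuts(example):
        print(site_start, fwd, rev, overhang)
```

The staggered cuts are 13 − 9 = 4 positions apart, which is why each end carries a 4-nucleotide single-stranded overhang whose sequence depends on the flanking DNA rather than on the recognition site.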
In molecular biology, a twintron is an intron-within-intron excised by sequential splicing reactions. A twintron is presumably formed by the insertion of a mobile intron into an existing intron.
I-CreI is a homing endonuclease whose gene was first discovered in the chloroplast genome of Chlamydomonas reinhardtii, a species of unicellular green algae. It is named for the facts that: it resides in an Intron; it was isolated from Chlamydomonas reinhardtii; it was the first (I) such gene isolated from C. reinhardtii. Its gene resides in a group I intron in the 23S ribosomal RNA gene of the C. reinhardtii chloroplast, and I-CreI is only expressed when its mRNA is spliced from the primary transcript of the 23S gene. I-CreI enzyme, which functions as a homodimer, recognizes a 22-nucleotide sequence of duplex DNA and cleaves one phosphodiester bond on each strand at specific positions. I-CreI is a member of the LAGLIDADG family of homing endonucleases, all of which have a conserved LAGLIDADG amino acid motif that contributes to their associative domains and active sites. When the I-CreI-containing intron encounters a 23S allele lacking the intron, I-CreI enzyme "homes" in on the "intron-minus" allele of 23S and effects its parent intron's insertion into the intron-minus allele. Introns with this behavior are called mobile introns. Because I-CreI provides for its own propagation while conferring no benefit on its host, it is an example of selfish DNA.
EcoRV is a type II restriction endonuclease isolated from certain strains of Escherichia coli. It has the alternative name Eco32I.
Nuclease S1 is an endonuclease enzyme that splits single-stranded DNA (ssDNA) and RNA into oligo- or mononucleotides. This enzyme catalyses the following chemical reaction
DNA-(apurinic or apyrimidinic site) lyase is an enzyme that in humans is encoded by the APEX1 gene.
Restriction endonuclease (REase) EcoRII is an enzyme of a restriction modification (RM) system naturally found in Escherichia coli, a Gram-negative bacterium. It is composed of 402 amino acids and has a molecular mass of 45.2 kDa.
The B3 DNA binding domain (DBD) is a highly conserved domain found exclusively in transcription factors combined with other domains. It consists of 100-120 residues, includes seven beta strands and two alpha helices that form a DNA-binding pseudobarrel protein fold; it interacts with the major groove of DNA.
Meganucleases are endodeoxyribonucleases characterized by a large recognition site; as a result this site generally occurs only once in any given genome. For example, the 18-base pair sequence recognized by the I-SceI meganuclease would on average require a genome twenty times the size of the human genome to be found once by chance. Meganucleases are therefore considered to be the most specific naturally occurring restriction enzymes.
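The "twenty times the human genome" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes a random sequence with equal base frequencies and independent positions, counts one strand only, and uses an assumed haploid human genome size of about 3.1×10⁹ bp; none of these assumptions comes from the text itself.

```python
HUMAN_GENOME_BP = 3.1e9  # assumed approximate haploid human genome size


def expected_spacing(site_length: int) -> int:
    """Average number of base pairs between chance occurrences of one
    specific site, assuming four equally likely, independent bases."""
    return 4 ** site_length


if __name__ == "__main__":
    spacing = expected_spacing(18)
    ratio = spacing / HUMAN_GENOME_BP
    print(f"expected spacing for an 18-bp site: {spacing:.3e} bp")
    print(f"human genome equivalents: {ratio:.1f}")
```

Under these assumptions an 18-bp site is expected roughly once per 6.9×10¹⁰ bp, on the order of twenty human genomes, which is consistent with the statement above.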
Genome editing, or genome engineering, or gene editing, is a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike early genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site-specific locations. The basic mechanism of genetic manipulation through programmable nucleases involves the recognition of target genomic loci and binding of an effector DNA-binding domain (DBD), the introduction of double-strand breaks (DSBs) in the target DNA by the restriction endonucleases, and the repair of DSBs through homology-directed recombination (HDR) or non-homologous end joining (NHEJ).
Cas9 is a 160 kilodalton protein which plays a vital role in the immunological defense of certain bacteria against DNA viruses and plasmids, and is heavily utilized in genetic engineering applications. Its main function is to cut DNA and thereby alter a cell's genome. The CRISPR-Cas9 genome editing technique was a significant contributor to the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier and Jennifer Doudna.
EcoRI is a restriction endonuclease enzyme isolated from species E. coli. It is a restriction enzyme that cleaves DNA double helices into fragments at specific sites, and is also a part of the restriction modification system. The Eco part of the enzyme's name originates from the species from which it was isolated - "E" denotes generic name which is "Escherichia" and "co" denotes species name, "coli" - while the R represents the particular strain, in this case RY13, and the I denotes that it was the first enzyme isolated from this strain.
The Intein Database and Registry (from New England Biolabs)