HUH endonucleases (HUH-tags) are sequence-specific single-stranded DNA (ssDNA) binding proteins originating from numerous species of bacteria and viruses. [1] Viral HUH endonucleases are involved in initiating rolling circle replication, while those of bacterial origin initiate bacterial conjugation. In biotechnology, they can be used to create protein-DNA linkages, [2] akin to other methods such as SNAP-tag. In doing so, they create a covalent bond between the protein and the 5' end of the ssDNA. HUH endonucleases can be fused with other proteins or used as protein tags.
The name HUH stands for "histidine-hydrophobic-histidine," referring to the three amino acids at the active site of the endonuclease. Some DNA viruses code for an HUH endonuclease which initiates rolling circle replication of the viral genome, and this process defines the realm Monodnaviria. [3]
HUH endonucleases are broadly split into two categories of enzymes: replication initiator proteins (Rep) and relaxase / mobilization proteins. Both contain small protein domains that recognize sequence-specific origins of replication or origins of transfer, at which they nick DNA. The nicking domains of Reps tend to be smaller, on the order of 10-20 kDa, while nicking domains from relaxases are larger, roughly 20-40 kDa in size. [2]
HUH endonucleases generally have two histidine (H) residues in the active site coordinating a metal cation (Mg2+ or Mn2+) that interacts with the phosphate backbone of DNA. These residues allow for a nucleophilic attack, most commonly by an activated tyrosine, on the scissile phosphate in the DNA backbone, generating a 5' covalent bond with the ssDNA. In contrast to other DNA-protein linkage approaches, this reaction occurs at ambient conditions and does not require any additional modifications. X-ray crystallography and NMR structures have provided insight into the sequence specificity of DNA binding. [4] [5]
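The outcome of the nicking reaction described above can be illustrated with a toy string model. This is only a sketch under stated assumptions: the function name, the recognition motif, and the nick position are all hypothetical and do not correspond to any real HUH endonuclease. It shows that cleavage happens only when the recognition sequence is present, and that the fragment downstream of the nick is the one that carries the protein on its 5' end.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NickResult:
    released: str  # fragment 5' of the nick; ends in a free 3'-OH
    linked: str    # fragment 3' of the nick; its 5' end is bonded to the protein


def nick_ssdna(ssdna: str, motif: str, nick_offset: int) -> Optional[NickResult]:
    """Nick `ssdna` inside the first occurrence of `motif`.

    `motif` and `nick_offset` are hypothetical inputs, not a real enzyme's
    recognition sequence. Returns None when the motif is absent, mirroring
    the sequence specificity described in the text.
    """
    if not 0 < nick_offset < len(motif):
        raise ValueError("nick_offset must fall inside the motif")
    start = ssdna.find(motif)
    if start == -1:
        return None  # no recognition site, so no cleavage and no linkage
    cut = start + nick_offset
    return NickResult(released=ssdna[:cut], linked=ssdna[cut:])


# Made-up target sequence carrying the made-up motif.
result = nick_ssdna("GGGACGTTGCATTT", motif="ACGTTGCA", nick_offset=5)
```

Here `result.linked` stands for the ssDNA that remains attached to the protein tag, and `result.released` for the upstream fragment that leaves with a free 3' end. The model ignores metal coordination, binding kinetics, and secondary structure.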
A DNA virus is a virus that has a genome made of deoxyribonucleic acid (DNA) that is replicated by a DNA polymerase. They can be divided between those that have two strands of DNA in their genome, called double-stranded DNA (dsDNA) viruses, and those that have one strand of DNA in their genome, called single-stranded DNA (ssDNA) viruses. dsDNA viruses primarily belong to two realms: Duplodnaviria and Varidnaviria, and ssDNA viruses are almost exclusively assigned to the realm Monodnaviria, which also includes some dsDNA viruses. Additionally, many DNA viruses are unassigned to higher taxa. Reverse transcribing viruses, which have a DNA genome that is replicated through an RNA intermediate by a reverse transcriptase, are classified into the kingdom Pararnavirae in the realm Riboviria.
Retroviral integrase (IN) is an enzyme produced by a retrovirus that integrates its genetic information into that of the host cell it infects. Retroviral INs are not to be confused with phage integrases (recombinases) used in biotechnology, such as λ phage integrase, as discussed in site-specific recombination.
In molecular biology, endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain. Some, such as deoxyribonuclease I, cut DNA relatively nonspecifically, while many, typically called restriction endonucleases or restriction enzymes, cleave only at very specific nucleotide sequences. Endonucleases differ from exonucleases, which cleave the ends of recognition sequences instead of the middle (endo) portion. Some enzymes known as "exo-endonucleases", however, are not limited to either nuclease function, displaying qualities that are both endo- and exo-like. Evidence suggests that endonuclease activity experiences a lag compared to exonuclease activity.
Geminiviridae is a family of plant viruses that encode their genetic information on a circular genome of single-stranded (ss) DNA. There are 520 species in this family, assigned to 14 genera. Diseases associated with this family include bright yellow mosaic, yellow mosaic, yellow mottle, leaf curling, stunting, streaks, and reduced yields. Geminiviruses have single-stranded circular DNA genomes encoding genes that diverge in both directions from a virion strand origin of replication. According to the Baltimore classification they are considered class II viruses. It is the largest known family of single-stranded DNA viruses.
Rolling circle replication (RCR) is a process of unidirectional nucleic acid replication that can rapidly synthesize multiple copies of circular molecules of DNA or RNA, such as plasmids, the genomes of bacteriophages, and the circular RNA genome of viroids. Some eukaryotic viruses also replicate their DNA or RNA via the rolling circle mechanism.
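As a rough illustration of how one circular template can yield multiple copies, the following toy model treats the genome as a string. It is a sketch only: the function names and the example sequence are hypothetical, and it ignores strand complementarity, the enzymes involved, and how the concatemer is actually resolved in any given system. Copying starts at the origin and continues around the circle several times, producing a linear concatemer that is then split into unit-length copies.

```python
from typing import List


def rolling_circle(template: str, origin: int, rounds: int) -> str:
    """Return the linear concatemer made by copying a circular template.

    `template` is the circular sequence written linearly, `origin` is the
    index where copying starts, and `rounds` is the number of full trips
    around the circle. All three are illustrative inputs.
    """
    if not 0 <= origin < len(template):
        raise ValueError("origin must be an index into the template")
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    # Rotate the circle so that the origin becomes position 0.
    rotated = template[origin:] + template[:origin]
    # Each trip around the circle appends one more unit-length copy.
    return rotated * rounds


def resolve_units(concatemer: str, unit_len: int) -> List[str]:
    """Split a concatemer into consecutive unit-length copies."""
    return [concatemer[i:i + unit_len] for i in range(0, len(concatemer), unit_len)]


# Made-up six-base circle, copied three times starting at index 2.
concatemer = rolling_circle("ATGCCG", origin=2, rounds=3)
units = resolve_units(concatemer, unit_len=6)
```

Every entry of `units` is the same rotated copy of the template, which reflects the unidirectional, repeated nature of the process described above.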
A relaxase is a single-strand DNA transesterase enzyme produced by some prokaryotes and viruses. Relaxases are responsible for site- and strand-specific nicks in unwound double-stranded DNA. Known relaxases belong to the rolling circle replication (RCR) initiator superfamily of enzymes and fall into two broad classes: replicative (Rep) and mobilization (Mob). The nicks produced by Rep relaxases initiate plasmid or virus RCR. Mob relaxases nick at the origin of transfer (oriT) to initiate the process of DNA mobilization and transfer known as bacterial conjugation. Relaxases are so named because the single-stranded DNA nick that they catalyze leads to relaxation of helical tension.
In molecular biology, an insert is a piece of DNA that is inserted into a larger DNA vector by a recombinant DNA technique, such as ligation or recombination. This allows it to be multiplied, selected, further manipulated or expressed in a host organism.
Flap endonuclease 1 is an enzyme that in humans is encoded by the FEN1 gene.
Helitrons are one of the three groups of eukaryotic class 2 transposable elements (TEs) so far described. They are the eukaryotic rolling-circle transposable elements which are hypothesized to transpose by a rolling circle replication mechanism via a single-stranded DNA intermediate. They were first discovered in plants and in the nematode Caenorhabditis elegans, and now they have been identified in a diverse range of species, from protists to mammals. Helitrons make up a substantial fraction of many genomes where non-autonomous elements frequently outnumber the putative autonomous partner. Helitrons seem to have a major role in the evolution of host genomes. They frequently capture diverse host genes, some of which can evolve into novel host genes or become essential for Helitron transposition.
Genome editing, or genome engineering, or gene editing, is a type of genetic engineering in which DNA is inserted, deleted, modified or replaced in the genome of a living organism. Unlike early genetic engineering techniques that randomly insert genetic material into a host genome, genome editing targets the insertions to site-specific locations. The basic mechanism involved in genetic manipulations through programmable nucleases is the recognition of target genomic loci and binding of effector DNA-binding domain (DBD), double-strand breaks (DSBs) in target DNA by the restriction endonucleases, and the repair of DSBs through homology-directed recombination (HDR) or non-homologous end joining (NHEJ).
Cas9 is a 160 kilodalton protein which plays a vital role in the immunological defense of certain bacteria against DNA viruses and plasmids, and is heavily utilized in genetic engineering applications. Its main function is to cut DNA and thereby alter a cell's genome. The CRISPR-Cas9 genome editing technique was a significant contributor to the Nobel Prize in Chemistry in 2020 being awarded to Emmanuelle Charpentier and Jennifer Doudna.
Cas12a is a subtype of Cas12 proteins and an RNA-guided endonuclease that forms part of the CRISPR system in some bacteria and archaea. It originates as part of a bacterial immune mechanism, where it serves to destroy the genetic material of viruses and thus protect the cell and colony from viral infection. Cas12a and other CRISPR-associated endonucleases use an RNA to target nucleic acid in a specific and programmable manner. In the organisms from which it originates, this guide RNA is a copy of a piece of foreign nucleic acid that previously infected the cell.
CRISPR activation (CRISPRa) is a type of CRISPR tool that uses modified versions of CRISPR effectors without endonuclease activity, with added transcriptional activators on dCas9 or the guide RNAs (gRNAs).
Prime editing is a 'search-and-replace' genome editing technology in molecular biology by which the genome of living organisms may be modified. The technology directly writes new genetic information into a targeted DNA site. It uses a fusion protein, consisting of a catalytically impaired Cas9 endonuclease fused to an engineered reverse transcriptase enzyme, and a prime editing guide RNA (pegRNA), capable of identifying the target site and providing the new genetic information to replace the target DNA nucleotides. It mediates targeted insertions, deletions, and base-to-base conversions without the need for double strand breaks (DSBs) or donor DNA templates.
Ground squirrel hepatitis virus, abbreviated GSHV, is a partially double-stranded DNA virus that is closely related to human Hepatitis B virus (HBV) and Woodchuck hepatitis virus (WHV). It is a member of the family of viruses Hepadnaviridae and the genus Orthohepadnavirus. Like the other members of its family, GSHV has a high degree of species and tissue specificity. It was discovered in Beechey ground squirrels, Spermophilus beecheyi, but also infects Arctic ground squirrels, Spermophilus parryi. Commonalities between GSHV and HBV include morphology, DNA polymerase activity in genome repair, cross-reacting viral antigens, and the resulting persistent infection with viral antigen in the blood (antigenemia). As a result, GSHV is used as an experimental model for HBV.
Monodnaviria is a realm of viruses that includes all single-stranded DNA viruses that encode an endonuclease of the HUH superfamily that initiates rolling circle replication of the circular viral genome. Viruses descended from such viruses are also included in the realm, including certain linear single-stranded DNA (ssDNA) viruses and circular double-stranded DNA (dsDNA) viruses. These atypical members typically replicate through means other than rolling circle replication.
Cressdnaviricota is a phylum of viruses with small, circular single-stranded DNA genomes and encoding rolling circle replication-initiation proteins with the N-terminal HUH endonuclease and C-terminal superfamily 3 helicase domains. While the replication-associated proteins are homologous among viruses within the phylum, the capsid proteins are very diverse and have presumably been acquired from RNA viruses on multiple independent occasions. Nevertheless, all cressdnaviruses for which structural information is available appear to contain the jelly-roll fold.
Nucleocytoviricota is a phylum of viruses. Members of the phylum are also known as the nucleocytoplasmic large DNA viruses (NCLDV), which serves as the basis of the name of the phylum with the suffix -viricota for virus phylum. These viruses are referred to as nucleocytoplasmic because they are often able to replicate in both the host's cell nucleus and cytoplasm.
Rolling hairpin replication (RHR) is a unidirectional, strand displacement form of DNA replication used by parvoviruses, a group of viruses that constitute the family Parvoviridae. Parvoviruses have linear, single-stranded DNA (ssDNA) genomes in which the coding portion of the genome is flanked by telomeres at each end that form hairpin loops. During RHR, these hairpin loops repeatedly unfold and refold to change the direction of DNA replication so that replication progresses in a continuous manner back and forth across the genome. RHR is initiated and terminated by an endonuclease encoded by parvoviruses that is variously called NS1 or Rep, and RHR is similar to rolling circle replication, which is used by ssDNA viruses that have circular genomes.
Faba bean necrotic yellows virus (FBNYV) is a virus of the genus Nanovirus that causes disease in legumes.