AT-hook | |
---|---|
Identifiers | |
Symbol | AT_hook |
Pfam | PF02178 |
InterPro | IPR017956 |
SMART | AT_hook |
SCOP2 | 2eze / SCOPe / SUPFAM |

The AT-hook is a DNA-binding motif present in many proteins, including the high mobility group (HMG) proteins, [1] DNA-binding proteins from plants [2] and hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. [3]
This motif consists of a conserved, palindromic core sequence of proline-arginine-glycine-arginine-proline, although some AT-hooks contain only a single proline in the core sequence. AT-hooks also include a variable number of positively charged lysine and arginine residues on either side of the core sequence. [4] The AT-hook binds to the minor groove of adenine-thymine (AT) rich DNA, hence the AT in the name. The rest of the name derives from a predicted asparagine/aspartate "hook" in the earliest AT-hooks, reported in 1990. [5] In 1997, structural studies using NMR determined that a DNA-bound AT-hook adopts a crescent or hook shape around the minor groove of its target DNA (pictured at right). [6] HMGA proteins contain three AT-hooks, although some proteins contain as many as 30. [5] The optimal binding sequences for AT-hook proteins are repeats of the form (ATAA)n or (TATT)n, although the optimal binding sequences for the core sequence of the AT-hook are AAAT and AATT. [7]
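The core sequence described above can be written in one-letter amino acid code as PRGRP. The following Python sketch scans a protein sequence for that core and counts lysine and arginine residues in a short window on either side. It is illustrative only: the function name `find_at_hook_cores`, the `flank` window of three residues, and the reading of "a single proline" as either flanking proline being absent are assumptions, not part of the source, and curated resources such as Pfam detect the motif with profile models rather than a simple pattern.

```python
import re

# Full core from the text: Pro-Arg-Gly-Arg-Pro (PRGRP).
STRICT_CORE = re.compile(r"(?=(PRGRP))")
# Assumed relaxed form: exactly one of the two flanking prolines is missing.
# The lookarounds stop a full PRGRP from also being reported as PRGR or RGRP.
RELAXED_CORE = re.compile(r"(?=(PRGRP|PRGR(?!P)|(?<!P)RGRP))")


def find_at_hook_cores(protein, allow_single_proline=True, flank=3):
    """Return (start, core, n_basic) for each candidate core.

    start   : 0-based index of the first residue of the core
    core    : the matched core string
    n_basic : number of K/R residues within `flank` residues on each side
              (the window size is an arbitrary, hypothetical choice)
    """
    seq = protein.upper()
    pattern = RELAXED_CORE if allow_single_proline else STRICT_CORE
    hits = []
    for m in pattern.finditer(seq):
        core = m.group(1)
        start = m.start()
        end = start + len(core)
        left = seq[max(0, start - flank):start]
        right = seq[end:end + flank]
        n_basic = sum(residue in "KR" for residue in left + right)
        hits.append((start, core, n_basic))
    return hits


# Made-up peptides, not taken from any real protein.
print(find_at_hook_cores("AAKRPRGRPKKAA"))
print(find_at_hook_cores("GGRGRPGG", allow_single_proline=False))
```

A hit from this scan is only a candidate: the pattern says nothing about whether the surrounding sequence actually binds AT-rich DNA.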
The DNA dodecamer has eight consecutive AT base pairs, so the AT-hook can bind at several positions; the preferred position is at one of the AATT regions, where it can fully occupy the minor groove. Van der Waals interactions between the AT-hook and the adenines play an important role in this positional specificity. [8]
The figure shows the main chain positioned to form hydrogen bonds with the thymine oxygen atoms in the minor groove. These interactions cause the DNA to bend, widening the minor groove. The distorted DNA in turn allows the complementary major groove to form interactions with side chains.
AT-hook proteins can form hydrogen bonds between the main-chain NH groups of Gly37 and Arg38 and thymine oxygen atoms in the minor groove, which bends the DNA and widens the minor groove. [8] Binding to the minor groove facilitates the binding of other proteins in the major groove. [9] This enables HMG proteins to regulate the expression of genes and influence biological processes.
The AT-hooks have also been proposed to anchor chromatin-modifying proteins to AT-rich DNA sequences through their association with chromatin remodeling, histone modifications, and chromatin insulator function. [9]
Alterations or abnormal expression of the HMG proteins have been linked to metabolic disorders, such as obesity and type 2 diabetes, and to cancer. [8]
Histone acetyltransferases (HATs) are enzymes that acetylate conserved lysine amino acids on histone proteins by transferring an acetyl group from acetyl-CoA to form ε-N-acetyllysine. DNA is wrapped around histones, and, by transferring an acetyl group to the histones, genes can be turned on and off. In general, histone acetylation increases gene expression.
In molecular biology, a histone octamer is the eight-protein complex found at the center of a nucleosome core particle. It consists of two copies of each of the four core histone proteins. The octamer assembles when a tetramer, containing two copies of H3 and two of H4, complexes with two H2A/H2B dimers. Each histone has both an N-terminal tail and a C-terminal histone-fold. Each of these key components interacts with DNA in its own way through a series of weak interactions, including hydrogen bonds and salt bridges. These interactions keep the DNA and the histone octamer loosely associated, and ultimately allow the two to re-position or to separate entirely.
DNA-binding proteins are proteins that have DNA-binding domains and thus have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA, because it exposes more functional groups that identify a base pair.
HMGN proteins are members of the broader class of high mobility group (HMG) chromosomal proteins that are involved in regulation of transcription, replication, recombination, and DNA repair.
A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence or have a general affinity to DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.
Methyltransferases are a large group of enzymes that all methylate their substrates but can be split into several subclasses based on their structural features. The most common class of methyltransferases is class I, all of which contain a Rossmann fold for binding S-Adenosyl methionine (SAM). Class II methyltransferases contain a SET domain and are exemplified by SET domain histone methyltransferases, while class III methyltransferases are membrane associated. Methyltransferases can also be grouped by the substrates they use in methyl transfer reactions: protein methyltransferases, DNA/RNA methyltransferases, natural product methyltransferases, and non-SAM-dependent methyltransferases. SAM is the classical methyl donor for methyltransferases; however, examples of other methyl donors are seen in nature. The general mechanism for methyl transfer is an SN2-like nucleophilic attack in which the methionine sulfur serves as the leaving group and the methyl group attached to it acts as the electrophile that is transferred to the enzyme substrate. SAM is converted to S-Adenosyl homocysteine (SAH) during this process. The breaking of the SAM-methyl bond and the formation of the substrate-methyl bond happen nearly simultaneously. These enzymatic reactions are found in many pathways and are implicated in genetic diseases, cancer, and metabolic diseases. Another type of methyl transfer is carried out by radical S-Adenosyl methionine (SAM) enzymes, which methylate unactivated carbon atoms in primary metabolites, proteins, lipids, and RNA.
In molecular biology and genetics, transcription coregulators are proteins that interact with transcription factors to either activate or repress the transcription of specific genes. Transcription coregulators that activate gene transcription are referred to as coactivators, while those that repress it are known as corepressors. The mechanism of action of transcription coregulators is to modify chromatin structure and thereby make the associated DNA more or less accessible to transcription. In humans, several dozen to several hundred coregulators are known, depending on the level of confidence with which a protein can be characterised as a coregulator. One class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second, ATP-dependent class modifies the conformation of chromatin.
High-Mobility Group or HMG is a group of chromosomal proteins that are involved in the regulation of DNA-dependent processes such as transcription, replication, recombination, and DNA repair.
High-mobility group AT-hook 2, also known as HMGA2, is a protein that, in humans, is encoded by the HMGA2 gene.
High-mobility group protein HMG-I/HMG-Y is a protein that in humans is encoded by the HMGA1 gene.
Uracil-DNA glycosylase is an enzyme. Its most important function is to prevent mutagenesis by eliminating uracil from DNA molecules by cleaving the N-glycosidic bond and initiating the base-excision repair (BER) pathway.
Non-histone chromosomal protein HMG-14 is a protein that in humans is encoded by the HMGN1 gene.
Chromobox protein homolog 5 is a protein that in humans is encoded by the CBX5 gene. It is a highly conserved, non-histone protein that is part of the heterochromatin protein family. The protein itself is more commonly called HP1α. Heterochromatin protein-1 (HP1) has an N-terminal domain that acts on methylated lysine residues, leading to epigenetic repression. The C-terminus of this protein has a chromo shadow domain (CSD) that is responsible for homodimerization, as well as for interacting with a variety of chromatin-associated, non-histone proteins.
In molecular biology, the HMG-box is a protein domain which is involved in DNA binding. The domain is composed of approximately 75 amino acid residues that collectively mediate the DNA-binding of chromatin-associated high-mobility group proteins. HMG-boxes are present in many transcription factors and chromatin-remodeling complexes, where they can mediate non-sequence or sequence-specific DNA binding.
Nucleic acid tertiary structure is the three-dimensional shape of a nucleic acid polymer. RNA and DNA molecules are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structural motifs that serve as molecular building blocks. Some of the most common motifs for RNA and DNA tertiary structure are described below, but this information is based on a limited number of solved structures. Many more tertiary structural motifs will be revealed as new RNA and DNA molecules are structurally characterized.
Nucleic acid structure refers to the structure of nucleic acids such as DNA and RNA. Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided into four different levels: primary, secondary, tertiary, and quaternary.
Nucleic acid secondary structure is the basepairing interactions within a single nucleic acid polymer or between two polymers. It can be represented as a list of bases which are paired in a nucleic acid molecule. The secondary structures of biological DNAs and RNAs tend to be different: biological DNA mostly exists as fully base paired double helices, while biological RNA is single stranded and often forms complex and intricate base-pairing interactions due to its increased ability to form hydrogen bonds stemming from the extra hydroxyl group in the ribose sugar.
Nucleic acid quaternary structure refers to the interactions between separate nucleic acid molecules, or between nucleic acid molecules and proteins. The concept is analogous to protein quaternary structure, but as the analogy is not perfect, the term is used to refer to a number of different concepts in nucleic acids and is less commonly encountered. Like other biomolecules such as proteins, nucleic acids have four levels of structural arrangement: primary, secondary, tertiary, and quaternary structure. Primary structure is the linear sequence of nucleotides, secondary structure involves small local folding motifs, and tertiary structure is the 3D folded shape of the nucleic acid molecule. In general, quaternary structure refers to 3D interactions between multiple subunits. In the case of nucleic acids, quaternary structure refers to interactions between multiple nucleic acid molecules or between nucleic acids and proteins. Nucleic acid quaternary structure is important for understanding DNA, RNA, and gene expression because quaternary structure can impact function. For example, when DNA is packed into heterochromatin, and therefore exhibits a type of quaternary structure, gene transcription is inhibited.
In molecular biology, GATA zinc fingers are zinc-containing domains found in a number of transcription factors. Some members of this class of zinc fingers specifically bind the DNA sequence (A/T)GATA(A/G) in the regulatory regions of genes, giving rise to the name of the domain. In these domains, a single zinc ion is coordinated by 4 cysteine residues. NMR studies have shown the core of the Znf to comprise 2 irregular anti-parallel beta-sheets and an alpha-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the 2 beta-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed. Two GATA zinc fingers are found in GATA-family transcription factors. However, there are several proteins that contain only a single copy of the domain. Many GATA-type Znfs have not been experimentally demonstrated to be DNA-binding domains, and several have been demonstrated to act as protein-recognition domains. For example, the N-terminal Znf of GATA1 binds specifically to a zinc finger from the transcriptional coregulator FOG1 (ZFPM1).
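The consensus (A/T)GATA(A/G) can be expressed as a simple pattern. The Python sketch below reports every occurrence on either strand of a DNA sequence, assuming the sequence is written 5' to 3' with the letters A, C, G and T. The function names and the tuple layout are hypothetical, and a pattern match only marks a candidate site; it does not show that a GATA factor binds there.

```python
import re

# (A/T)GATA(A/G) from the text; the lookahead lets overlapping sites be found.
GATA_SITE = re.compile(r"(?=([AT]GATA[AG]))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
SITE_LENGTH = 6


def reverse_complement(dna):
    """Return the reverse complement of an A/C/G/T string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_gata_sites(dna):
    """Return sorted (start, strand, site) tuples.

    start is the 0-based index of the leftmost base of the site on the
    given (forward) strand; site is always read 5' to 3' on its own strand.
    """
    seq = dna.upper()
    hits = [(m.start(), "+", m.group(1)) for m in GATA_SITE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in GATA_SITE.finditer(rc):
        # Map the position on the reverse strand back to forward coordinates.
        start = len(seq) - (m.start() + SITE_LENGTH)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


# Made-up sequences used only to exercise the function.
print(find_gata_sites("CCAGATAGCC"))
print(find_gata_sites("CTTATCT"))
```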
The WRKY domain is found in the WRKY transcription factor family, a class of transcription factors. The WRKY domain is found almost exclusively in plants although WRKY genes appear present in some diplomonads, social amoebae and other amoebozoa, and fungi incertae sedis. They appear absent in other non-plant species. WRKY transcription factors have been a significant area of plant research for the past 20 years. The WRKY DNA-binding domain recognizes the W-box (T)TGAC(C/T) cis-regulatory element.
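As a rough illustration, the W-box element (T)TGAC(C/T) can be located with a short pattern. This sketch assumes that the parenthesised leading T is optional and reports the longer form when it is present; it searches only the strand given, and the function name `find_w_boxes` is hypothetical.

```python
import re

# TTGAC[CT] when the leading T is present, otherwise TGAC[CT].
# The lookbehind stops a TTGAC[CT] site from being reported a second time
# as the shorter TGAC[CT] one base further along.
W_BOX = re.compile(r"(?=(TTGAC[CT]|(?<!T)TGAC[CT]))")


def find_w_boxes(dna):
    """Return (start, site) tuples for candidate W-boxes, 0-based."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in W_BOX.finditer(seq)]


# Made-up sequences used only to exercise the function.
print(find_w_boxes("ATTGACCA"))
print(find_w_boxes("GGTGACTGG"))
```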