AT-hook

Solution structure of a complex of the second DNA-binding domain of human HMG-I(Y) bound to a DNA dodecamer containing the PRDII site of the interferon-beta promoter (NMR, 35 structures; PDB 2EZE)
Identifiers
Symbol AT_hook
Pfam PF02178
InterPro IPR017956
SMART AT_hook
SCOP2 2eze / SCOPe / SUPFAM
The second AT-hook of HMGA1 (black ribbon) bound to the minor groove of AT-rich DNA. The amino-acid side chains and nucleotides have been hidden.

The AT-hook is a DNA-binding motif present in many proteins, including the high mobility group (HMG) proteins, [1] DNA-binding proteins from plants, [2] and the hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. [3]


Structure

The motif consists of a conserved, palindromic core sequence of proline-arginine-glycine-arginine-proline (Pro-Arg-Gly-Arg-Pro), although some AT-hooks contain only a single proline in the core. The core is flanked on either side by a variable number of positively charged lysine and arginine residues. [4] The AT-hook binds to the minor groove of adenine-thymine (AT)-rich DNA, hence the "AT" in the name. The rest of the name derives from a predicted asparagine/aspartate "hook" in the earliest AT-hooks, reported in 1990. [5] In 1997, structural studies using NMR showed that a DNA-bound AT-hook adopts a crescent or hook shape around the minor groove of its target DNA. [6] HMGA proteins contain three AT-hooks, although some proteins contain as many as 30. [5] The optimal binding sequences for AT-hook proteins are repeats of the form (ATAA)n or (TATT)n, whereas the optimal binding sequences for the core sequence of the AT-hook alone are AAAT and AATT. [7]
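As a rough illustration of the core-sequence description above, the following Python sketch scans a protein sequence (one-letter codes) for an Arg-Gly-Arg tripeptide flanked by at least one proline, and counts lysine and arginine residues in a short window on either side. The function name `find_at_hook_cores`, the five-residue window, and the toy peptide are assumptions made for this example; it is not a substitute for a curated profile such as Pfam PF02178, and a match does not by itself show that a region binds DNA.

```python
import re

def find_at_hook_cores(protein, flank=5):
    """Return candidate AT-hook cores: RGR with a proline on one or both sides.

    `flank` is a hypothetical window size (in residues) used to count
    basic residues (K/R) on either side of the core.
    """
    protein = protein.upper()
    hits = []
    # Lookahead so that overlapping RGR occurrences are all examined.
    for match in re.finditer(r"(?=RGR)", protein):
        i = match.start()
        has_n_pro = i > 0 and protein[i - 1] == "P"
        has_c_pro = protein[i + 3:i + 4] == "P"
        if not (has_n_pro or has_c_pro):
            continue  # no flanking proline: not treated as an AT-hook core
        start = i - 1 if has_n_pro else i
        end = i + 4 if has_c_pro else i + 3
        left = protein[max(0, start - flank):start]
        right = protein[end:end + flank]
        hits.append({
            "start": start,                           # 0-based index of the core
            "core": protein[start:end],               # e.g. PRGRP or RGRP
            "palindromic": has_n_pro and has_c_pro,   # full Pro-Arg-Gly-Arg-Pro
            "flanking_basic": sum(aa in "KR" for aa in left + right),
        })
    return hits

# Toy peptide for illustration only (not a verified database sequence).
for hit in find_at_hook_cores("GSKRPRGRPKGSKNK"):
    print(hit)
```

Treating the flanking prolines as optional reflects the statement that some AT-hooks contain only a single proline in the core sequence.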

The DNA dodecamer in the crystal structure of the complex has eight consecutive AT base pairs, so the AT-hook could bind at several positions; the preferred position is at one of the AATT regions, where it fully occupies the minor groove. Van der Waals interactions between the AT-hook and the adenines play an important role in the specificity of this position. [8]
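The idea of several candidate positions within one AT tract can be sketched in Python as follows. The dodecamer string used here is an illustrative sequence chosen to match the description (eight consecutive AT base pairs containing two AATT regions) and is an assumption of this example rather than a sequence quoted from the text; the helper names `at_tracts` and `motif_positions` are likewise hypothetical.

```python
import re

def at_tracts(dna, min_len=4):
    """Return (start, tract) for every run of A/T bases at least min_len long."""
    pattern = r"[AT]{%d,}" % min_len
    return [(m.start(), m.group()) for m in re.finditer(pattern, dna.upper())]

def motif_positions(dna, motif):
    """Return 0-based start positions of motif in dna, allowing overlaps."""
    dna, motif = dna.upper(), motif.upper()
    width = len(motif)
    return [i for i in range(len(dna) - width + 1) if dna[i:i + width] == motif]

# Assumed example dodecamer with eight consecutive AT base pairs.
dodecamer = "CGAATTAATTCG"
print(at_tracts(dodecamer))                # one tract spanning the central 8 bp
print(motif_positions(dodecamer, "AATT"))  # candidate AATT binding positions
```

A string search of this kind only lists where a core binding sequence occurs; which site is actually preferred depends on the structural contacts described above.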

The phosphate backbone of the DNA (orange) is faded to focus on the central region of the AT-hook. The side chains of Pro35, Arg36, Gly37, Arg38, and Pro39 are shown in magenta. Made with PyMOL. PDB code: 3UXW.
Multiple hydrogen bonds are shown in yellow. The interactions occur between Arg38 and Pro39 (3.8 Å), Pro35 and Arg36 (2.5 Å), and Gly37 and Arg38 (2.4 Å). The red sphere represents a water molecule that forms a hydrogen bond (2.7 Å, also shown in yellow) with Arg38. Made with PyMOL. PDB code: 3UXW.

The figure shows how the main chain is positioned to form hydrogen bonds with the thymine oxygen atoms in the minor groove. These interactions cause the DNA to bend, widening the minor groove. The distorted DNA causes the complementary major groove to form interactions between the side chains.

Function

AT-hook proteins can form hydrogen bonds between the main-chain NH groups of Gly37 and Arg38 and thymine oxygen atoms in the minor groove, which bends the DNA and widens the minor groove. [8] Binding in the minor groove facilitates the binding of other proteins in the major groove. [9] This enables HMG proteins to regulate the expression of genes and influence biological processes.

The AT-hooks have also been proposed to anchor chromatin-modifying proteins to AT-rich DNA sequences through their association with chromatin remodeling, histone modifications, and chromatin insulator function. [9]

Clinical significance

Alterations or abnormal expression of the HMG proteins have been linked to cancer and to metabolic disorders such as obesity and type 2 diabetes. [8]

References

  1. Reeves R, Beckerbauer L (May 2001). "HMGI/Y proteins: flexible regulators of transcription and chromatin structure". Biochimica et Biophysica Acta (BBA) - Gene Structure and Expression. 1519 (1–2): 13–29. doi:10.1016/S0167-4781(01)00215-9. PMID 11406267.
  2. Meijer AH, van Dijk EL, Hoge JH (June 1996). "Novel members of a family of AT hook-containing DNA-binding proteins from rice are identified through their in vitro interaction with consensus target sites of plant and animal homeodomain proteins". Plant Molecular Biology. 31 (3): 607–618. doi:10.1007/BF00042233. PMID 8790293. S2CID 24687309.
  3. Singh M, D'Silva L, Holak TA (2006). "DNA-binding properties of the recombinant high-mobility-group-like AT-hook-containing region from human BRG1 protein". Biological Chemistry. 387 (10–11): 1469–1478. doi:10.1515/BC.2006.184. PMID 17081121. S2CID 26580880.
  4. Reeves R (October 2001). "Molecular biology of HMGA proteins: hubs of nuclear function". Gene. 277 (1–2): 63–81. doi:10.1016/S0378-1119(01)00689-8. PMID 11602345.
  5. Reeves R, Nissen MS (May 1990). "The A.T-DNA-binding domain of mammalian high mobility group I chromosomal proteins. A novel peptide motif for recognizing DNA structure". The Journal of Biological Chemistry. 265 (15): 8573–8582. doi:10.1016/S0021-9258(19)38926-4. PMID 1692833.
  6. Huth JR, Bewley CA, Nissen MS, Evans JN, Reeves R, Gronenborn AM, Clore GM (August 1997). "The solution structure of an HMG-I(Y)-DNA complex defines a new architectural minor groove binding motif". Nature Structural Biology. 4 (8): 657–665. doi:10.1038/nsb0897-657. PMID 9253416. S2CID 2183841.
  7. Reeves R (October 2000). "Structure and function of the HMGI(Y) family of architectural transcription factors". Environmental Health Perspectives. 108 (Suppl 5): 803–809. doi:10.2307/3454310. JSTOR 3454310. PMID 11035986.
  8. Fonfría-Subirós E, Acosta-Reyes F, Saperas N, Pous J, Subirana JA, Campos JL (2012). "Crystal structure of a complex of DNA with one AT-hook of HMGA1". PLOS ONE. 7 (5): e37120. Bibcode:2012PLoSO...737120F. doi:10.1371/journal.pone.0037120. PMC 3353895. PMID 22615915.
  9. Filarsky M, Zillner K, Araya I, Villar-Garea A, Merkl R, Längst G, Németh A (2015). "The extended AT-hook is a novel RNA binding motif". RNA Biology. 12 (8): 864–876. doi:10.1080/15476286.2015.1060394. PMC 4615771. PMID 26156556.