AT-hook | |
---|---|
Identifiers | |
Symbol | AT_hook |
Pfam | PF02178 |
InterPro | IPR017956 |
SMART | AT_hook |
SCOP2 | 2eze / SCOPe / SUPFAM |

The AT-hook is a DNA-binding motif present in many proteins, including the high mobility group (HMG) proteins, [1] DNA-binding proteins from plants [2] and hBRG1 protein, a central ATPase of the human switching/sucrose non-fermenting (SWI/SNF) remodeling complex. [3]
This motif consists of a conserved, palindromic, core sequence of proline-arginine-glycine-arginine-proline, although some AT-hooks contain only a single proline in the core sequence. AT-hooks also include a variable number of positively charged lysine and arginine residues on either side of the core sequence. [4] The AT-hook binds to the minor groove of adenine-thymine (AT) rich DNA, hence the AT in the name. The rest of the name derives from a predicted asparagine/aspartate "hook" in the earliest AT-hooks reported in 1990. [5] In 1997 structural studies using NMR determined that a DNA-bound AT-hook adopted a crescent or hook shape around the minor groove of a target DNA strand (pictured at right). [6] HMGA proteins contain three AT-hooks, although some proteins contain as many as 30. [5] The optimal binding sequences for AT-hook proteins are repeats of the form (ATAA)n or (TATT)n, although the optimal binding sequences for the core sequence of the AT-hook are AAAT and AATT. [7]
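The sequence rules in the paragraph above can be sketched as a simple pattern search. This is a minimal illustration, not a validated motif model such as the Pfam profile: the function names are hypothetical, the single-proline variants `RGRP` and `PRGR` are an assumption about which proline is retained, and the flanking lysine and arginine residues are not scored.

```python
import re

# Core motif as described above: P-R-G-R-P. The single-proline alternatives
# (RGRP, PRGR) are an assumption about which proline is kept.
CORE = re.compile(r"PRGRP|RGRP|PRGR")


def find_at_hook_cores(protein):
    """Return (start, matched_core) for each non-overlapping core match."""
    return [(m.start(), m.group()) for m in CORE.finditer(protein.upper())]


def count_preferred_sites(dna):
    """Count occurrences (overlaps included) of the core's preferred sites AAAT and AATT."""
    dna = dna.upper()
    # A zero-width lookahead lets overlapping occurrences be counted.
    return {site: len(re.findall(f"(?={site})", dna)) for site in ("AAAT", "AATT")}
```

With made-up inputs, `find_at_hook_cores("MKKPRGRPKKGS")` reports a full core at index 3, and `count_preferred_sites("GGGAAATTCC")` finds one AAAT and one overlapping AATT. Only the given DNA strand is scanned.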
The DNA dodecamer has eight consecutive AT base pairs, allowing the AT-hook to bind at several positions; the preferred position is at one of the AATT regions, where it fully occupies the minor groove. Van der Waals interactions between the AT-hook and the adenines play an important role in determining the specific position. [8]
The figure shows the main chain positioned to allow hydrogen bonds with the thymine oxygen atoms in the minor groove. These interactions cause the DNA to bend, widening the minor groove. The distorted DNA causes the complementary major groove to form interactions between the side chains.
AT-hook proteins can form hydrogen bonds between NH groups of Gly37 and Arg38 on the main chain and thymine oxygen atoms in the minor groove, which bends the DNA and widens the minor groove. [8] The binding to the minor groove facilitates binding of other proteins in the major groove. [9] That enables HMG proteins to regulate the expression of genes and influence biological processes.
The AT-hooks have also been proposed to anchor chromatin-modifying proteins to AT-rich DNA sequences through their association with chromatin remodeling, histone modifications, and chromatin insulator function. [9]
Alterations or abnormal expression of the HMG proteins have been linked to metabolic disorders, such as obesity and type 2 diabetes, and to cancer. [8]
Histone acetyltransferases (HATs) are enzymes that acetylate conserved lysine amino acids on histone proteins by transferring an acetyl group from acetyl-CoA to form ε-N-acetyllysine. DNA is wrapped around histones, and, by transferring an acetyl group to the histones, genes can be turned on and off. In general, histone acetylation increases gene expression.
In molecular biology, a histone octamer is the eight-protein complex found at the center of a nucleosome core particle. It consists of two copies of each of the four core histone proteins. The octamer assembles when a tetramer, containing two copies of H3 and two of H4, complexes with two H2A/H2B dimers. Each histone has both an N-terminal tail and a C-terminal histone-fold. Each of these key components interacts with DNA in its own way through a series of weak interactions, including hydrogen bonds and salt bridges. These interactions keep the DNA and the histone octamer loosely associated, and ultimately allow the two to re-position or to separate entirely.
DNA-binding proteins are proteins that have DNA-binding domains and thus have a specific or general affinity for single- or double-stranded DNA. Sequence-specific DNA-binding proteins generally interact with the major groove of B-DNA, because it exposes more functional groups that identify a base pair.
A leucine zipper is a common three-dimensional structural motif in proteins. It was first described by Landschulz and collaborators in 1988, when they found that an enhancer binding protein had a very characteristic 30-amino acid segment; displaying this amino acid sequence on an idealized alpha helix revealed a periodic repetition of leucine residues at every seventh position over a distance covering eight helical turns. The polypeptide segments containing these periodic arrays of leucine residues were proposed to exist in an alpha-helical conformation, with the leucine side chains from one alpha helix interdigitating with those from the alpha helix of a second polypeptide, facilitating dimerization.
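The "leucine at every seventh position" pattern can be checked with a short scan. This is only a sketch: the function name is hypothetical, the default threshold of four consecutive heptad leucines is an assumed cutoff, and no helical propensity or other residue preference is evaluated.

```python
def heptad_leucine_runs(protein, min_repeats=4):
    """Return (start, repeats) for maximal runs of leucines spaced seven residues apart.

    min_repeats is an assumed threshold, not a value taken from the text.
    """
    seq = protein.upper()
    runs = []
    for i, aa in enumerate(seq):
        if aa != "L":
            continue
        # Skip leucines that merely continue a run started seven residues earlier.
        if i >= 7 and seq[i - 7] == "L":
            continue
        n = 1
        while i + 7 * n < len(seq) and seq[i + 7 * n] == "L":
            n += 1
        if n >= min_repeats:
            runs.append((i, n))
    return runs
```

For a toy sequence built as `"LAAAAAA" * 4 + "L"`, the function reports one run starting at index 0 with five leucines.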
A DNA-binding domain (DBD) is an independently folded protein domain that contains at least one structural motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence or have a general affinity to DNA. Some DNA-binding domains may also include nucleic acids in their folded structure.
Histone H4 is one of the five main histone proteins involved in the structure of chromatin in eukaryotic cells. Featuring a main globular domain and a long N-terminal tail, H4 is involved with the structure of the nucleosome of the 'beads on a string' organization. Histone proteins are highly post-translationally modified. Covalently bonded modifications include acetylation and methylation of the N-terminal tails. These modifications may alter expression of genes located on DNA associated with its parent histone octamer. Histone H4 is an important protein in the structure and function of chromatin, where its sequence variants and variable modification states are thought to play a role in the dynamic and long term regulation of genes.
Methyltransferases are a large group of enzymes that all methylate their substrates but can be split into several subclasses based on their structural features. The most common class of methyltransferases is class I, all of which contain a Rossmann fold for binding S-Adenosyl methionine (SAM). Class II methyltransferases contain a SET domain and are exemplified by SET domain histone methyltransferases, and class III methyltransferases are membrane associated. Methyltransferases can also be grouped into different types according to the substrates they use in methyl transfer reactions. These types include protein methyltransferases, DNA/RNA methyltransferases, natural product methyltransferases, and non-SAM dependent methyltransferases. SAM is the classical methyl donor for methyltransferases; however, examples of other methyl donors are seen in nature. The general mechanism for methyl transfer is an SN2-like nucleophilic attack where the methionine sulfur serves as the leaving group and the methyl group attached to it acts as the electrophile that transfers the methyl group to the enzyme substrate. SAM is converted to S-Adenosyl homocysteine (SAH) during this process. The breaking of the SAM-methyl bond and the formation of the substrate-methyl bond happen nearly simultaneously. These enzymatic reactions are found in many pathways and are implicated in genetic diseases, cancer, and metabolic diseases. Another type of methyl transfer is carried out by radical S-Adenosyl methionine (SAM) enzymes, which methylate unactivated carbon atoms in primary metabolites, proteins, lipids, and RNA.
In molecular biology and genetics, transcription coregulators are proteins that interact with transcription factors to either activate or repress the transcription of specific genes. Transcription coregulators that activate gene transcription are referred to as coactivators while those that repress are known as corepressors. The mechanism of action of transcription coregulators is to modify chromatin structure and thereby make the associated DNA more or less accessible to transcription. In humans several dozen to several hundred coregulators are known, depending on the level of confidence with which the characterisation of a protein as a coregulator can be made. One class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP dependent class modifies the conformation of chromatin.
High-Mobility Group or HMG is a group of chromosomal proteins that are involved in the regulation of DNA-dependent processes such as transcription, replication, recombination, and DNA repair.
DNA adenine methylase (Dam methylase) (also site-specific DNA-methyltransferase (adenine-specific), EC 2.1.1.72, modification methylase, restriction-modification system) is an enzyme that adds a methyl group to the adenine of the sequence 5'-GATC-3' in newly synthesized DNA. Immediately after DNA synthesis, the daughter strand remains unmethylated for a short time. It is an orphan methyltransferase that is not part of a restriction-modification system and regulates gene expression. This enzyme catalyses the following chemical reaction
Histone-modifying enzymes are enzymes involved in the modification of histone substrates after protein translation and affect cellular processes including gene expression. To safely store the eukaryotic genome, DNA is wrapped around four core histone proteins, which then join to form nucleosomes. These nucleosomes further fold together into highly condensed chromatin, which renders the organism's genetic material far less accessible to the factors required for gene transcription, DNA replication, recombination and repair. Consequently, eukaryotic organisms have developed intricate mechanisms to overcome this repressive barrier imposed by the chromatin through histone modification, a type of post-translational modification which typically involves covalently attaching certain groups to histone residues. Once added to the histone, these groups elicit either a loose and open histone conformation, euchromatin, or a tight and closed histone conformation, heterochromatin. Euchromatin marks active transcription and gene expression, as the light packing of histones in this way allows entry for proteins involved in the transcription process. Conversely, the tightly packed heterochromatin marks the absence of current gene expression.
High-mobility group AT-hook 2, also known as HMGA2, is a protein that, in humans, is encoded by the HMGA2 gene.
High-mobility group protein HMG-I/HMG-Y is a protein that in humans is encoded by the HMGA1 gene.
Non-histone chromosomal protein HMG-14 is a protein that in humans is encoded by the HMGN1 gene.
In molecular biology, the HMG-box is a protein domain which is involved in DNA binding. The domain is composed of approximately 75 amino acid residues that collectively mediate the DNA-binding of chromatin-associated high-mobility group proteins. HMG-boxes are present in many transcription factors and chromatin-remodeling complexes, where they can mediate non-sequence or sequence-specific DNA binding.
Nucleic acid tertiary structure is the three-dimensional shape of a nucleic acid polymer. RNA and DNA molecules are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structural motifs that serve as molecular building blocks. Some of the most common motifs for RNA and DNA tertiary structure are described below, but this information is based on a limited number of solved structures. Many more tertiary structural motifs will be revealed as new RNA and DNA molecules are structurally characterized.
Nucleic acid structure refers to the structure of nucleic acids such as DNA and RNA. Chemically speaking, DNA and RNA are very similar. Nucleic acid structure is often divided into four different levels: primary, secondary, tertiary, and quaternary.
Nucleic acid quaternary structure refers to the interactions between separate nucleic acid molecules, or between nucleic acid molecules and proteins. The concept is analogous to protein quaternary structure, but as the analogy is not perfect, the term is used to refer to a number of different concepts in nucleic acids and is less commonly encountered. Like other biomolecules such as proteins, nucleic acids have four levels of structural arrangement: primary, secondary, tertiary, and quaternary structure. Primary structure is the linear sequence of nucleotides, secondary structure involves small local folding motifs, and tertiary structure is the 3D folded shape of the nucleic acid molecule. In general, quaternary structure refers to 3D interactions between multiple subunits. In the case of nucleic acids, quaternary structure refers to interactions between multiple nucleic acid molecules or between nucleic acids and proteins. Nucleic acid quaternary structure is important for understanding DNA, RNA, and gene expression because quaternary structure can impact function. For example, when DNA is packed into heterochromatin, thereby exhibiting a type of quaternary structure, gene transcription will be inhibited.
In molecular biology, GATA zinc fingers are zinc-containing domains found in a number of transcription factors. Some members of this class of zinc fingers specifically bind the DNA sequence (A/T)GATA(A/G) in the regulatory regions of genes, giving rise to the name of the domain. In these domains, a single zinc ion is coordinated by 4 cysteine residues. NMR studies have shown the core of the Znf to comprise 2 irregular anti-parallel beta-sheets and an alpha-helix, followed by a long loop to the C-terminal end of the finger. The N-terminal part, which includes the helix, is similar in structure, but not sequence, to the N-terminal zinc module of the glucocorticoid receptor DNA-binding domain. The helix and the loop connecting the 2 beta-sheets interact with the major groove of the DNA, while the C-terminal tail wraps around into the minor groove. Interactions between the Znf and DNA are mainly hydrophobic, explaining the preponderance of thymines in the binding site; a large number of interactions with the phosphate backbone have also been observed. Two GATA zinc fingers are found in GATA-family transcription factors. However, there are several proteins that only contain a single copy of the domain. It is also worth noting that many GATA-type Znfs have not been experimentally demonstrated to be DNA-binding domains. Furthermore, several GATA-type Znfs have been demonstrated to act as protein-recognition domains. For example, the N-terminal Znf of GATA1 binds specifically to a zinc finger from the transcriptional coregulator FOG1 (ZFPM1).
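The consensus (A/T)GATA(A/G) translates directly into a regular expression. The sketch below uses a hypothetical helper name and scans only the strand it is given. A sequence match does not by itself indicate binding.

```python
import re

# (A/T)GATA(A/G) written as a character-class pattern over one DNA strand.
GATA_SITE = re.compile(r"[AT]GATA[AG]")


def find_gata_sites(dna):
    """Return (start, site) for each non-overlapping match on the given strand."""
    return [(m.start(), m.group()) for m in GATA_SITE.finditer(dna.upper())]
```

On the made-up input `"CCAGATAAGG"`, the helper returns a single site, `AGATAA`, starting at index 2.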
The WRKY domain is found in the WRKY transcription factor family, a class of transcription factors. The WRKY domain is found almost exclusively in plants although WRKY genes appear present in some diplomonads, social amoebae and other amoebozoa, and fungi incertae sedis. They appear absent in other non-plant species. WRKY transcription factors have been a significant area of plant research for the past 20 years. The WRKY DNA-binding domain recognizes the W-box (T)TGAC(C/T) cis-regulatory element.
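As a rough illustration, the W-box element (T)TGAC(C/T) can be searched for on both strands of a DNA sequence. The helper names are hypothetical, the parenthesized T is treated as optional, and reverse-strand coordinates refer to positions in the reverse complement rather than in the input.

```python
import re

# (T)TGAC(C/T): an optional leading T, then TGAC, then C or T.
W_BOX = re.compile(r"T?TGAC[CT]")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_w_boxes(dna):
    """Return (strand, start, match) tuples for W-box matches on both strands."""
    hits = []
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        hits.extend((strand, m.start(), m.group()) for m in W_BOX.finditer(seq))
    return hits
```

For example, the toy sequence `"GGTCAAGG"` has no match on the forward strand, but its reverse complement `CCTTGACC` contains `TTGACC`.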