WD domain, G-beta repeat | |
---|---|
Identifiers | |
Symbol | WD40 |
Pfam | PF00400 |
Pfam clan | CL0186 |
InterPro | IPR001680 |
PROSITE | PDOC00574 |
SCOP2 | 1gp2 / SCOPe / SUPFAM |
CDD | cd00200 |
The WD40 repeat (also known as the WD or beta-transducin repeat) is a short structural motif of approximately 40 amino acids, often terminating in a tryptophan-aspartic acid (W-D) dipeptide. [2] Tandem copies of these repeats typically fold together to form a type of circular solenoid protein domain called the WD40 domain.
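As a rough illustration of the roughly 40-residue spacing described above, the following Python sketch scans a sequence for W-D dipeptides and reports consecutive pairs whose spacing is repeat-sized. The function names, the 30 to 60 residue window, and the toy sequence are assumptions made for illustration. Because repeats only often, not always, end in W-D, real annotation relies on profile-based models such as Pfam PF00400 rather than on a scan like this.

```python
def find_wd_dipeptides(seq):
    """Return the 0-based start position of every 'WD' dipeptide in seq."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "WD"]


def candidate_repeats(seq, min_len=30, max_len=60):
    """Pair consecutive WD positions whose spacing is roughly repeat-sized.

    The 30-60 window is an assumed tolerance around the ~40-residue motif,
    not a published threshold.
    """
    positions = find_wd_dipeptides(seq)
    return [
        (start, end)
        for start, end in zip(positions, positions[1:])
        if min_len <= end - start <= max_len
    ]


# Synthetic toy sequence (not a real protein): three 40-residue units,
# each ending in W-D.
toy = ("A" * 38 + "WD") * 3
print(candidate_repeats(toy))  # [(38, 78), (78, 118)]
```

A hit from such a scan is only a candidate: W-D dipeptides also occur by chance, and divergent repeats are missed entirely.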
WD40 domain-containing proteins have 4 to 16 repeating units, all of which are thought to form a circularised beta-propeller structure. [3] [4] Each repeat comprises a variable region of around 20 residues at its beginning, followed by a more commonly repeated set of residues. The repeats typically form four-stranded anti-parallel beta sheets, or blades, which come together to form a propeller; the most common arrangement is a seven-bladed beta-propeller. The blades interlock: the last beta strand of one repeat combines with the first three strands of the next repeat to form a single three-dimensional blade.
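The interlocking arrangement can be made concrete with a short Python sketch that lists which strands make up each blade. It assumes four strands per repeat, numbered 0 to 3, and it assumes that the first blade is completed by the last strand of the final repeat, which is one way of modelling the circular closure. The function name and the indexing scheme are hypothetical.

```python
def blade_composition(n_repeats=7):
    """Map each blade to its four (repeat, strand) pairs.

    Blade k is built from the last strand (index 3) of the previous repeat
    and the first three strands (indices 0-2) of repeat k. The modulo
    wrap-around for blade 0 is an assumption that models ring closure.
    """
    blades = []
    for k in range(n_repeats):
        previous = (k - 1) % n_repeats
        blades.append([(previous, 3), (k, 0), (k, 1), (k, 2)])
    return blades


for index, strands in enumerate(blade_composition(7)):
    print(index, strands)
```

Under these assumptions every strand is used exactly once, so a seven-repeat domain yields seven complete four-stranded blades.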
WD40-repeat proteins are a large family found in all eukaryotes and are implicated in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control, autophagy and apoptosis. [5] The underlying common function of all WD40-repeat proteins is coordinating multi-protein complex assemblies, where the repeating units serve as a rigid scaffold for protein interactions. The specificity of each protein is determined by the sequences outside the repeats themselves. Examples of such complexes are G proteins (whose beta subunit is a beta-propeller), the TAFII transcription factor, and E3 ubiquitin ligase. [3] [4]
According to the initial analysis of the human genome, WD40 repeats are the eighth largest family of proteins. In all, 277 proteins were identified as containing them. [6] Human genes encoding proteins containing this domain include:
WDR gene | Other gene names | NCBI Entrez Gene ID | Human disease associated with mutations |
---|---|---|---|
WDR1 | AIP1; NORI-1; HEL-S-52 | 9948 | |
WDR2 | CORO2A; IR10; CLIPINB | 7464 | |
WDR3 | DIP2; UTP12 | 10885 | |
WDR4 | TRM82; TRMT82 | 10785 | |
WDR5 | SWD3; BIG-3; CFAP89 | 11091 | |
WDR6 | | 11180 | |
WDR7 | TRAG; KIAA0541; Rabconnectin 3 beta | 23335 | |
WDR8 | WRAP73 | 49856 | |
WDR9 | BRWD1; N143; C21orf107 | 54014 | |
WDR10 | IFT122; CED; SPG; CED1; WDR10p; WDR140 | 55764 | Sensenbrenner syndrome |
WDR11 | DR11; HH14; BRWD2; WDR15 | 55717 | Kallmann syndrome |
WDR12 | YTM1 | 55759 | |
WDR13 | MG21 | 64743 | |
WDR14 | GNB1L; GY2; FKSG1; WDVCF; DGCRK3 | 54584 | |
WDR15 | WDR11 | | |
WDR16 | CFAP52; WDRPUH | 146845 | |
WDR17 | | 116966 | |
WDR18 | Ipi3 | 57418 | |
WDR19 | ATD5; CED4; DYF-2; ORF26; Oseg6; PWDMP; SRTD5; IFT144; NPHP13 | 57728 | Sensenbrenner syndrome, Jeune syndrome |
WDR20 | DMR | 91833 | |
WDR21 | DCAF4; WDR21A | 26094 | |
WDR22 | DCAF5; BCRG2; BCRP2 | 8816 | |
WDR23 | DCAF11; GL014; PRO2389 | 80344 | |
WDR24 | JFP7; C16orf21 | 84219 | |
WDR25 | C14orf67 | 79446 | |
WDR26 | CDW2; GID7; MIP2 | 80232 | |
WDR27 | | 253769 | |
WDR28 | GRWD1; CDW4; GRWD; RRB1 | 83743 | |
WDR29 | SPAG16; PF20 | 79582 | |
WDR30 | ATG16L1; IBD10; APG16L; ATG16A; ATG16L | 55054 | Crohn’s disease |
WDR31 | | 114987 | |
WDR32 | DCAF10 | 79269 | |
WDR33 | NET14; WDC146 | 55339 | |
WDR34 | DIC5; FAP133; SRTD11 | 89891 | Jeune syndrome |
WDR35 | CED2; IFTA1; SRTD7; IFT121 | 57539 | Sensenbrenner syndrome |
WDR36 | GLC1G; UTP21; TAWDRP; TA-WDRP | 134430 | Primary Open Angle Glaucoma |
WDR37 | | 22884 | |
WDR38 | | 401551 | |
WDR39 | CIAO1; CIA1 | 9391 | |
WDR40A | DCAF12; CT102; TCC52; KIAA1892 | 25853 | |
WDR41 | MSTP048 | 55255 | |
WDR43 | UTP5; NET12 | 23160 | |
WDR44 | RPH11; RAB11BP | 54521 | |
WDR45 | JM5; NBIA4; NBIA5; WDRX1; WIPI4; WIPI-4 | 11152 | Beta-propeller protein-associated neurodegeneration (BPAN) |
WDR46 | UTP7; BING4; FP221; C6orf11 | 9277 | |
WDR47 | NEMITIN; KIAA0893 | 22911 | |
WDR48 | P80; UAF1; SPG60 | 57599 | |
WDR49 | | 151790 | |
WDR50 | UTP18; CGI-48 | 51096 | |
WDR52 | CFAP44 | 55779 | |
WDR53 | | 348793 | |
WDR54 | | 84058 | |
WDR55 | | 54853 | |
WDR56 | IFT80; ATD2; SRTD2 | 57560 | Jeune syndrome |
WDR57 | SNRNP40; SPF38; PRP8BP; HPRP8BP; PRPF8BP | 9410 | |
WDR58 | THOC6; BBIS; fSAP35 | 79228 | |
WDR59 | FP977 | 79726 | |
WDR60 | SRPS6; SRTD8; FAP163 | 55112 | Jeune syndrome |
WDR61 | SKI8; REC14 | 80349 | |
WDR62 | MCPH2; C19orf14 | 284403 | microcephaly |
WDR63 | DIC3; NYD-SP29 | 126820 | |
WDR64 | | 128025 | |
WDR65 | CFAP57; VWS2 | 149465 | Van der Woude syndrome |
WDR66 | CaM-IP4 | 144406 | |
WDR67 | TBC1D31; Gm85 | 93594 | |
WDR68 | DCAF7; AN11; HAN11; SWAN-1 | 10238 | |
WDR69 | DAW1; ODA16 | 164781 | |
WDR70 | | 55100 | |
WDR71 | PAAF1; PAAF; Rpn14 | 80227 | |
WDR72 | AI2A3 | 256764 | Amelogenesis imperfecta |
WDR73 | HSPC264 | 84942 | |
WDR74 | | 54663 | |
WDR75 | NET16; UTP17 | 84128 | |
WDR76 | CDW14 | 79968 | |
WDR77 | p44; MEP50; MEP-50; HKMT1069; Nbla10071; p44/Mep50 | 79084 | |
WDR78 | DIC4 | 79819 | |
WDR79 | WRAP53; DKCB3; TCAB1 | 55135 | |
WDR80 | ATG16L; ATG16B | 89849 | |
WDR81 | CAMRQ2; PPP1R166 | 124997 | cerebellar ataxia, mental retardation, and dysequilibrium syndrome-2 |
WDR82 | SWD2; MST107; WDR82A; MSTP107; PRO2730; TMEM113; PRO34047 | 80335 | |
WDR83 | MORG1 | 84292 | |
WDR84 | PAK1IP1; PIP1; MAK11 | 55003 | |
WDR85 | DPH7; RRT2; C9orf112 | 92715 | |
WDR86 | | 349136 | |
WDR87 | NYD-SP11 | 83889 | |
WDR88 | PQWD | 126248 | |
WDR89 | MSTP050; C14orf150 | 112840 | |
WDR90 | C16orf15; C16orf16; C16orf17; C16orf18; C16orf19 | 197335 | |
WDR91 | HSPC049 | 29062 | |
WDR92 | MONAD | 116143 | |
WDR93 | | 56964 | |
WDR94 | AMBRA1; DCAF3 | 55626 | |
WDR96 | CFAP43; C10orf79 | 80217 | |
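For readers who want to work with the table programmatically, the following Python sketch parses a few of its pipe-delimited rows and looks up genes by associated disease. The four rows are copied from the table above; the function names and the dictionary keys are assumptions made for illustration, and the parser expects all four columns to be present in every row.

```python
# Four rows copied from the table above, in its pipe-delimited layout.
ROWS = """\
WDR10 | IFT122; CED; SPG; CED1; WDR10p; WDR140 | 55764 | Sensenbrenner syndrome |
WDR12 | YTM1 | 55759 | |
WDR34 | DIC5; FAP133; SRTD11 | 89891 | Jeune syndrome |
WDR35 | CED2; IFTA1; SRTD7; IFT121 | 57539 | Sensenbrenner syndrome |
"""


def parse_row(line):
    """Split one table row into gene, alias list, Entrez ID and disease."""
    cells = [cell.strip() for cell in line.split("|")]
    gene, aliases, entrez, disease = cells[:4]
    return {
        "gene": gene,
        "aliases": [a.strip() for a in aliases.split(";") if a.strip()],
        "entrez_id": int(entrez) if entrez else None,
        "disease": disease,
    }


def genes_for(disease, records):
    """Return the genes whose disease cell mentions the given disease name."""
    return [r["gene"] for r in records if disease in r["disease"]]


records = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
print(genes_for("Sensenbrenner syndrome", records))  # ['WDR10', 'WDR35']
```

The disease cell is kept as a single string and matched by substring because some entries in the table list several syndromes, or contain commas within one disease name.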
The SRC Homology 3 domain (SH3 domain) is a small protein domain of about 60 amino acid residues. SH3 was initially described as a conserved sequence in the viral adaptor protein v-Crk. The domain is also present in phospholipase and in several cytoplasmic tyrosine kinases such as Abl and Src. It has also been identified in several other protein families, such as PI3 Kinase, Ras GTPase-activating protein, CDC24 and cdc25. SH3 domains are found in proteins of signaling pathways that regulate the cytoskeleton, the Ras protein, the Src kinase and many others. SH3 proteins interact with adaptor proteins and tyrosine kinases; when interacting with tyrosine kinases, they usually bind far away from the active site. Approximately 300 SH3 domains are found in proteins encoded in the human genome. In addition, SH3 domains control protein-protein interactions in signal transduction pathways and regulate the interactions of proteins involved in cytoplasmic signaling.
14-3-3 proteins are a family of conserved regulatory molecules that are expressed in all eukaryotic cells. 14-3-3 proteins have the ability to bind a multitude of functionally diverse signaling proteins, including kinases, phosphatases, and transmembrane receptors. More than 200 signaling proteins have been reported as 14-3-3 ligands.
A nuclear localization signal or sequence (NLS) is an amino acid sequence that 'tags' a protein for import into the cell nucleus by nuclear transport. Typically, this signal consists of one or more short sequences of positively charged lysines or arginines exposed on the protein surface. Different nuclear localized proteins may share the same NLS. An NLS has the opposite function of a nuclear export signal (NES), which targets proteins out of the nucleus.
The ankyrin repeat is a 33-residue motif in proteins consisting of two alpha helices separated by loops, first discovered in signaling proteins in yeast Cdc10 and Drosophila Notch. Domains consisting of ankyrin tandem repeats mediate protein–protein interactions and are among the most common structural motifs in known proteins. They appear in bacterial, archaeal, and eukaryotic proteins, but are far more common in eukaryotes. Ankyrin repeat proteins, though absent in most viruses, are common among poxviruses. Most proteins that contain the motif have four to six repeats, although its namesake ankyrin contains 24, and the largest known number of repeats is 34, predicted in a protein expressed by Giardia lamblia.
An armadillo repeat is the name of a characteristic, repetitive amino acid sequence of about 40 residues in length that is found in many proteins. Proteins that contain armadillo repeats typically contain several tandemly repeated copies. Each armadillo repeat is composed of a pair of alpha helices that form a hairpin structure. Multiple copies of the repeat form what is known as an alpha solenoid structure.
A leucine-rich repeat (LRR) is a protein structural motif that forms an α/β horseshoe fold. It is composed of repeating 20–30 amino acid stretches that are unusually rich in the hydrophobic amino acid leucine. These tandem repeats commonly fold together to form a solenoid protein domain, termed leucine-rich repeat domain. Typically, each repeat unit has beta strand-turn-alpha helix structure, and the assembled domain, composed of many such repeats, has a horseshoe shape with an interior parallel beta sheet and an exterior array of helices. One face of the beta sheet and one side of the helix array are exposed to solvent and are therefore dominated by hydrophilic residues. The region between the helices and sheets is the protein's hydrophobic core and is tightly sterically packed with leucine residues.
F-box/WD repeat-containing protein 7 is a protein that in humans is encoded by the FBXW7 gene.
βTrCP2 is a protein that in humans is encoded by the FBXW11 gene.
F-box/WD repeat-containing protein 2 is a protein that in humans is encoded by the FBXW2 gene.
F-box/WD repeat-containing protein 5 is a protein that in humans is encoded by the FBXW5 gene.
Coronin is an actin binding protein which also interacts with microtubules and in some cell types is associated with phagocytosis. Coronin proteins are expressed in a large number of eukaryotic organisms from yeast to humans.
The inhibitor of apoptosis domain (also known as the IAP repeat, Baculovirus Inhibitor of apoptosis protein Repeat, or BIR) is a structural motif found in proteins with roles in apoptosis, cytokine production, and chromosome segregation. Proteins containing BIR are known as inhibitor of apoptosis proteins (IAPs), or BIR-containing proteins, and include BIRC1 (NAIP), BIRC2 (cIAP1), BIRC3 (cIAP2), BIRC4 (xIAP), BIRC5 (survivin) and BIRC6.
The tetratricopeptide repeat (TPR) is a structural motif. It consists of a degenerate 34 amino acid tandem repeat identified in a wide variety of proteins. It is found in tandem arrays of 3–16 motifs, which form scaffolds to mediate protein–protein interactions and often the assembly of multiprotein complexes. These alpha-helix pair repeats usually fold together to produce a single, linear solenoid domain called a TPR domain. Proteins with such domains include the anaphase-promoting complex (APC) subunits cdc16, cdc23 and cdc27, the NADPH oxidase subunit p67-phox, hsp90-binding immunophilins, transcription factors, the protein kinase R (PKR), the major receptor for peroxisomal matrix protein import PEX5, protein arginine methyltransferase 9 (PRMT9), and mitochondrial import proteins.
WDR75 is a human protein encoded by the WDR75 gene and contains a WD40 superfamily domain. The WD40 domain is found in many eukaryotic cell types and is known to be involved in cellular regulatory functions such as pre-mRNA processing and cytoskeleton assembly. The function of the WDR75 protein has not yet been established.
The Kelch motif is a region of protein sequence found widely in proteins from bacteria and eukaryotes. This sequence motif is composed of about 50 amino acid residues which form a structure of a four stranded beta-sheet "blade". This sequence motif is found in between five and eight tandem copies per protein which fold together to form a larger circular solenoid structure called a beta-propeller domain.
BRCA1 C Terminus (BRCT) domain is a family of evolutionarily related proteins. It is named after the C-terminal domain of BRCA1, a DNA-repair protein that serves as a marker of breast cancer susceptibility.
The WH1 domain is an evolutionarily conserved protein domain, which suggests that it has an important function.
In molecular biology, the forkhead-associated domain is a phosphopeptide recognition domain found in many regulatory proteins. It displays specificity for phosphothreonine-containing epitopes but will also recognise phosphotyrosine with relatively high affinity. It spans approximately 80-100 amino acid residues folded into an 11-stranded beta sandwich, which sometimes contains small helical insertions between the loops connecting the strands.
In molecular biology, the eukaryotic translation initiation factor 4E family (eIF-4E) is a family of proteins that bind to the cap structure of eukaryotic cellular mRNAs. Members of this family recognise and bind the 7-methyl-guanosine-containing (m7Gppp) cap during an early step in the initiation of protein synthesis and facilitate ribosome binding to an mRNA by inducing the unwinding of its secondary structures. A tryptophan in the central part of the sequence of human eIF-4E seems to be implicated in cap-binding.
Clathrin adaptor proteins, also known as adaptins, are vesicular transport adaptor proteins associated with clathrin. These proteins are synthesized in the ribosomes, processed in the endoplasmic reticulum and transported from the Golgi apparatus to the trans-Golgi network, and from there via small carrier vesicles to their final destination compartment. The association between adaptins and clathrin is important for vesicular cargo selection and transport. Clathrin coats contain both clathrin and adaptor complexes that link clathrin to receptors in coated vesicles. Clathrin-associated protein complexes are believed to interact with the cytoplasmic tails of membrane proteins, leading to their selection and concentration. Adaptor proteins are therefore responsible for the recruitment of cargo molecules into growing clathrin-coated pits. The two major types of clathrin adaptor complexes are the heterotetrameric vesicular transport adaptor proteins (AP1-5) and the monomeric GGA adaptors. Adaptins are distantly related to the other main type of vesicular transport proteins, the coatomer subunits, sharing between 16% and 26% of their amino acid sequence.