WDR75 is a human protein that is encoded by the WDR75 gene [1] and contains a WD40 superfamily domain. [2] The WD40 domain is found across many eukaryotic cell types and is known to be involved in cellular regulatory functions such as pre-mRNA processing and cytoskeleton assembly. [3] The function of the WDR75 protein has not yet been determined.
Accession Numbers | Location | Identifiers | M.W. | pI |
---|---|---|---|---|
NM_032186, NP_115544.1 | 2q32.2 | FLJ12519, DKFZp781N2144 [1] | 94.5 kDa | 5.56 |
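As a rough plausibility check on the tabulated molecular weight, the mass of a protein can be estimated from its chain length. The sketch below uses the 830-residue length reported for this protein and assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from the cited sources. The function name `estimate_mass_kda` is hypothetical.

```python
# Rough molecular-weight estimate from chain length.
# ASSUMPTION: ~110 Da per residue is a rule-of-thumb average,
# not a figure from the article or its references.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate protein mass in kDa for a given chain length."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


residues = 830        # chain length reported for WDR75
reported_kda = 94.5   # molecular weight listed in the table

estimate_kda = estimate_mass_kda(residues)
relative_gap = abs(estimate_kda - reported_kda) / reported_kda

print(f"estimated {estimate_kda:.1f} kDa, reported {reported_kda} kDa, "
      f"relative difference {relative_gap:.1%}")
```

The estimate ignores the actual amino acid composition, so it only indicates whether the reported value is of the right order.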
The amino acid sequence is 830 residues long and contains an acidic tail. [4]
1 MVEEENIRVV RCGGSELNFR RAVFSADSKY IFCVSGDFVK VYSTVTEECV HILHGHRNLV 61 TGIQLNPNNH LQLYSCSLDG TIKLWDYIDG ILIKTFIVGC KLHALFTLAQ AEDSVFVIVN 121 KEKPDIFQLV SVKLPKSSSQ EVEAKELSFV LDYINQSPKC IAFGNEGVYV AAVREFYLSV 181 YFFKKKTTSR FTLSSSRNKK HAKNNFTCVA CHPTEDCIAS GHMDGKIRLW RNFYDDKKYT 241 YTCLHWHHDM VMDLAFSVTG TSLLSGGRES VLVEWRDATE KNKEFLPRLG ATIEHISVSP 301 AGDLFCTSHS DNKIIIIHRN LEASAVIQGL VKDRSIFTGL MIDPRTKALV LNGKPGHLQF 361 YSLQSDKQLY NLDIIQQEYI NDYGLIQIEL TKAAFGCFGN WLATVEQRQE KETELELQMK 421 LWMYNKKTQG FILNTKINMP HEDCITALCF CNAEKSEQPT LVTASKDGYF KVWILTDDSD 481 IYKKAVGWTC DFVGSYHKYQ ATNCCFSEDG SLLAVSFEEI VTIWDSVTWE LKCTFCQRAG 541 KIRHLCFGRL TCSKYLLGAT ENGILCCWNL LSCALEWNAK LNVRVMEPDP NSENIAAISQ 601 SSVGSDLFVF KPSEPRPLYI QKGISREKVQ WGVFVPRDVP ESFTSEAYQW LNRSQFYFLT 661 KSQSLLTFST KSPEEKLTPT SKQLLAEESL PTTPFYFILG KHRQQQDEKL NETLENELVQ 721 LPLTENIPAI SELLHTPAHV LPSAAFLCSM FVNSLLLSKE TKSAKEIPED VDMEEEKESE 781 DSDEENDFTE KVQDTSNTGL GEDIIHQLSK SEEKELRKFR KIDYSWIAAL
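The acidic character of the C-terminal tail can be illustrated by counting charged residues. The following sketch copies residues 761-830 from the sequence above and compares the number of acidic residues (D, E) with the number of basic residues (K, R). The choice of residue 761 as the start of the window is an assumption made for illustration, histidine is ignored, and the simple count is not the method used in the cited prediction [4].

```python
# Count acidic vs. basic residues in the C-terminal region of WDR75.
# ASSUMPTION: the window 761-830 is chosen for illustration only.
C_TERMINAL_761_830 = (
    "TKSAKEIPED" "VDMEEEKESE"                # residues 761-780
    "DSDEENDFTE" "KVQDTSNTGL" "GEDIIHQLSK"   # residues 781-810
    "SEEKELRKFR" "KIDYSWIAAL"                # residues 811-830
)

ACIDIC = set("DE")   # aspartate, glutamate
BASIC = set("KR")    # lysine, arginine (histidine deliberately ignored)


def charge_counts(seq: str) -> tuple[int, int]:
    """Return (number of acidic residues, number of basic residues)."""
    acidic = sum(1 for aa in seq if aa in ACIDIC)
    basic = sum(1 for aa in seq if aa in BASIC)
    return acidic, basic


acidic, basic = charge_counts(C_TERMINAL_761_830)
net_charge = basic - acidic  # crude net charge at neutral pH

print(f"length={len(C_TERMINAL_761_830)} acidic={acidic} "
      f"basic={basic} net={net_charge}")
```

A negative net value indicates an excess of acidic residues in the window, consistent with the description of an acidic tail.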
The secondary structure is predicted [4] to contain alternating sets of alpha helices and beta strands. No helix-turn-helix (HTH) regions were predicted.
No known paralogs exist for the human gene, and no paralogs were found for the cow, mouse, or rat orthologs of WDR75. [5] Conservation among mammalian orthologs of the human gene was strong, with a minimum identity of 76%. The following table summarizes some vertebrate and invertebrate orthologs. Conserved amino acids at positions 340-390, 430-450, and 515-530 all correlate with predicted alpha helices. [4]
Organism | Accession Number | % Identity to Human Gene |
---|---|---|
Bos taurus | NP_001095532.1 [6] | 91.7 |
Mus musculus | NP_082875.1 [7] | 84.9 |
Pan troglodytes | XP_001164827.1 [8] | 99.5 |
Rattus norvegicus | NP_001041354.1 [9] | 84.3 |
Canis familiaris | XP_545565 [10] | 90.7 |
Xenopus laevis | NP_001086564.1 [11] | 61.0 |
Gallus gallus | XP_001233408.1 [12] | 60.8 |
Schizosaccharomyces pombe | NP_594798 [13] | 19.9 |
Anopheles gambiae | XP_317524 [14] | 22.7 |
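Percent identity values like those in the table are derived from pairwise alignments. The sketch below shows one common way to compute the figure from two already-aligned sequences, counting matches over columns where neither sequence has a gap. The alignment tool and the exact identity definition used for the table are not stated in the source, so this is an assumption, and the second sequence in the example is a made-up variant of the first ten human residues, not a real ortholog.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    ASSUMPTION: gaps are written as '-' and gapped columns are excluded
    from the denominator; other definitions of identity exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


human_start = "MVEEENIRVV"    # first ten residues of human WDR75
hypothetical = "MVEDENIRVV"   # invented variant, for illustration only

print(f"{percent_identity(human_start, hypothetical):.1f}% identity")
```

Because the denominator differs between definitions (alignment length, shorter sequence length, or ungapped columns), values computed this way may not match the table exactly.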
MORN repeat containing 1, also known as Morn1, is a protein that in humans is encoded by the MORN1 gene.
FAM83H is a gene in humans that encodes a protein known as FAM83H. FAM83H is targeted to the nucleus and is predicted to play a role in the structural development and calcification of tooth enamel.
The uncharacterized LOC644249 gene, also known as RP11-195B21.3, is about 1058 base pairs long and is found in Homo sapiens on chromosome 9q12. More specifically, the sequence is located on Chromosome 9; NC_000009.11 (67977457..67987991 bp). This gene’s protein product is the “coiled-coil domain-containing protein 29”, which is 291 amino acids long and may contain a conserved domain of the pfam12001 superfamily. In particular, this conserved domain contains the domain of unknown function DUF3496, which is about 110 amino acids long, functionally uncharacterized, and found in eukaryotes. Other motifs are possible for the protein product, but DUF3496 remains the most likely. The protein may be a transmembrane protein.
Family with Sequence Similarity 203, Member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans. While FAM203B is only found in humans and possibly non-human primates, its paralog, FAM203A, is highly conserved. The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. This protein has no known function yet, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.
NBEAL1 is a protein that in humans is encoded by the NBEAL1 gene. It is found on chromosome 2q33.2 of Homo sapiens.
Coiled Coil Domain Containing protein 42B, also known as CCDC42B, is a protein encoded by the protein-coding gene CCDC42B.
Coiled-coil domain 47 (CCDC47) is a gene located on human chromosome 17, specifically at locus 17q23.3, which encodes the protein CCDC47. The gene has several aliases, including GK001 and MSTP041. The protein itself contains coiled-coil domains, the SEEEED superfamily, a domain of unknown function (DUF1682) and a transmembrane domain. The function of the protein is unknown, but it has been proposed that CCDC47 is involved in calcium ion homeostasis and the endoplasmic reticulum overload response.
KIAA1841 is a gene in humans that encodes a protein known as KIAA1841. KIAA1841 is targeted to the nucleus and is predicted to play a role in regulating transcription.
Family with sequence similarity 98, member A, or FAM98A, is a gene that in the human genome encodes the FAM98A protein. FAM98A has two paralogs in humans, FAM98B and FAM98C. All three are characterized by DUF2465, a conserved domain shown to bind to RNA. FAM98A is also characterized by a glycine-rich C-terminal domain. FAM98A also has homologs in vertebrates and invertebrates and has distant homologs in choanoflagellates and green algae.
Transmembrane protein 251, also known as C14orf109 or UPF0694, is a protein that in humans is encoded by the TMEM251 gene. One notable feature of this protein is the presence of proline residues in one of its predicted transmembrane domains, which is a determinant of the intramitochondrial sorting of inner membrane proteins.
TMEM249 is a protein that in humans is encoded by the C8orfk29 gene.
WD repeat-containing protein 90 is a protein that, in humans, is encoded by the WDR90 gene (16p13.3). This human protein is 1750 amino acids long and has a molecular weight of 187.7 kDa. It contains multiple WD40 repeat domains and one domain of unknown function. The protein is conserved as far back as invertebrates. Proteins containing WD transducin repeat domains have been found to play a role in a variety of functions ranging from signal transduction and transcription regulation to cell cycle control, autophagy and apoptosis.
C6orf222 is a protein that in humans is encoded by the C6orf222 gene (6p21.31). C6orf222 is conserved in mammals, birds and reptiles with the most distant ortholog being the green sea turtle, Chelonia mydas. The C6orf222 protein contains one mammalian conserved domain: DUF3293. The protein is also predicted to contain a BH3 domain, which has predicted conservation in distant orthologs from the clade Aves.
FAM76A is a protein that in Homo sapiens is encoded by the FAM76A gene. Notable structural characteristics of FAM76A include an 83 amino acid coiled coil domain as well as a four amino acid poly-serine compositional bias. FAM76A is conserved in most chordates but is not found in other deuterostome phyla such as Echinodermata, Hemichordata, or Xenacoelomorpha, suggesting that FAM76A arose after chordates diverged in the evolutionary lineage. Furthermore, FAM76A is not found in fungi, plants, archaea, or bacteria. FAM76A is predicted to localize to the nucleus and may play a role in regulating transcription.
PRR29 is a protein encoded by the PRR29 gene located in humans on chromosome 17 at 17q23.
OCC-1 is a protein that in humans is encoded by the gene C12orf75. The gene is approximately 40,882 bp long and encodes 63 amino acids. OCC-1 is ubiquitously expressed throughout the human body and has been shown to be overexpressed in various colon carcinomas. A novel splice variant of this gene has also been detected in various human cancer types; in addition to encoding a novel smaller protein, the OCC-1 gene produces a non-protein-coding lncRNA splice variant.
WD repeat containing protein 53 (WDR53) is a protein encoded by the WDR53 gene, which was identified in the human genome by the Human Genome Project but whose function has not yet been investigated experimentally. The gene is located on chromosome 3 at 3q29 in Homo sapiens. It has short upstream and downstream untranslated regions as well as WD40 repeat regions, which have been linked to various functions.
Major facilitator superfamily domain containing 6 like (MFSD6L) is a protein encoded by the MFSD6L gene in humans. The MFSD6L protein is a transmembrane protein that is part of the major facilitator superfamily (MFS) that uses chemiosmotic gradients to facilitate the transport of small solutes across cell membranes.
Transmembrane epididymal protein 1 is a transmembrane protein encoded by the TEDDM1 gene. TEDDM1 is also commonly known as TMEM45C and encodes a 273 amino acid protein that contains six alpha-helical transmembrane regions. The protein contains a 118 amino acid region belonging to a family of unknown function. While the exact function of TEDDM1 is not understood, it is predicted to be an integral component of the plasma membrane.
Leucine-rich repeat-containing protein 74A (LRRC74A) is a protein encoded by the LRRC74A gene. The protein is localized in the cytoplasm and has a calculated molecular weight of approximately 55 kDa. The LRRC74A protein is nominally expressed in the testis, salivary gland, and pancreas.