Coiled-coil domain containing 166

CCDC166

Identifiers
Aliases: CCDC166, coiled-coil domain containing 166
External IDs: MGI: 1925902; HomoloGene: 109421; GeneCards: CCDC166

Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| Entrez | | |
| Ensembl | | |
| UniProt | | |
| RefSeq (mRNA) | NM_001162914 | NM_001163518, NM_146059 |
| RefSeq (protein) | NP_001156386 | n/a |
| Location (UCSC) | Chr 8: 143.71 – 143.71 Mb | Chr 15: 75.85 – 75.85 Mb |
| PubMed search | [3] | [4] |

Coiled-coil domain containing 166 is a protein that in humans is encoded by the CCDC166 gene. Its function is currently unknown. It contains a coiled-coil domain, from which its name derives. It is primarily expressed in the testes. [5]

Gene

The gene is currently known to contain only two exons and to produce a single isoform. The primary transcript consists of 1320 base pairs. The gene is located on chromosome 8 at 8q24.3, spanning positions 143706694 to 143708109 on the + strand, near BREA2 and MAPK15. [6]

Transcripts

The gene has only a single transcript because it has only two exons, both of which are always transcribed. The coding portion of the mRNA is 1320 nucleotides. [6] In tissues that express the transcript, it is typically found at low levels. [7]
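The figures quoted for the locus and the coding region can be cross-checked with simple arithmetic. The sketch below assumes that the chromosome 8 coordinates are 1-based and inclusive, and that the 1320-nucleotide coding portion includes the stop codon; neither convention is stated in the sources, and the function names are illustrative only.

```python
def locus_span(start: int, end: int) -> int:
    """Length in base pairs of an inclusive, 1-based genomic interval."""
    return end - start + 1


def protein_length_from_cds(cds_nt: int) -> int:
    """Residues encoded by a coding sequence that includes its stop codon."""
    if cds_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # The final codon is a stop signal, so it does not add a residue.
    return cds_nt // 3 - 1


# Coordinates and coding length as quoted in the Gene and Transcripts sections.
span_bp = locus_span(143_706_694, 143_708_109)
residues = protein_length_from_cds(1320)
print(span_bp, residues)
```

Under these assumptions the coding length agrees with the 439-residue protein described in the Protein section.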

Protein

CCDC166 has only one isoform in humans, which has a molecular weight of 48.7 kDa and is composed of 439 amino acids. The pI of the protein is 10.537. [8] The protein has several amino acid repeat structures, including EREA, VQSL and (T)QLLH, all of which are conserved in mammals. [9] The composition of the protein shows that it is rich in arginine, alanine, leucine, and serine. [8] The protein contains three conserved domains: a coiled-coil domain between amino acids 27-74, a domain of unknown function between amino acids 72-260, and a serine-rich domain between amino acids 288-410. [10] The region spanning amino acids 26-115 is predicted to be an SH3 domain. [11] The predicted structure consists mainly of alpha-helices that form a larger coiled-coil, along with several other coiled-coils.

Figure: Proposed structure of CCDC166. [12]

Gene level regulation

The gene appears to be highly expressed in the testes, and this pattern may be evolutionarily conserved. [13] The promoter region contains several conserved transcription factor binding sites. Notable among them are sites for the CREB family, the KLFs, and, perhaps most tellingly, testis-determining factor (SRY). [14] These transcription factors are all important during development.

Transcript level regulation

In situ hybridization (ISH) data show that the gene's mRNA is found mostly in the nucleus of Sertoli cells, with low expression in Leydig cells. [13] The transcript has also been detected in germ cell tumors. [7] In addition, the gene's primary transcript contains several predicted miRNA binding sites, including hsa-miR-2278, hsa-miR-3178, and hsa-miR-4516. [15]

Protein level regulation

CCDC166 is predicted to be regulated by SUMO protein: it has a conserved IKAD sequence at amino acids 220-223. [16] This, combined with a conserved nuclear localization signal, PKKKR, starting at amino acid 3, supports the idea that the protein is imported into the nucleus. [17] The protein also contains several predicted phosphorylation sites, most of which cluster in the serine-rich domain. The highest-probability sites are serine 10, serine 308, and serine 391. [18]
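Short linear motifs such as these can be located with a regular-expression scan. The following is a minimal sketch, not the method used by the cited tools: the sequence is a hypothetical toy string (not the real CCDC166 protein), and the hydrophobic residue class used for the SUMOylation consensus (psi-K-x-D/E) is an assumption made for illustration.

```python
import re

# Hypothetical toy sequence, NOT the real CCDC166 protein. It only places a
# PKKKR stretch at residue 3 and an IKAD stretch further along the chain.
TOY_SEQUENCE = "MAPKKKRQLEAGSSLQIKADRLLH"

# Consensus SUMOylation motif psi-K-x-[D/E]. The hydrophobic class chosen for
# psi is an assumption. The lookahead allows overlapping hits to be reported.
SUMO_MOTIF = re.compile(r"(?=([AVILMFP]K.[DE]))")

# Literal nuclear localization signal mentioned in the text.
NLS_MOTIF = re.compile(r"PKKKR")


def find_motifs(pattern, sequence):
    """Return (1-based start, matched text) for every hit of the pattern."""
    hits = []
    for match in pattern.finditer(sequence):
        # Lookahead patterns carry the text in group 1, plain ones in group 0.
        text = match.group(1) if match.groups() else match.group(0)
        hits.append((match.start() + 1, text))
    return hits


sumo_hits = find_motifs(SUMO_MOTIF, TOY_SEQUENCE)
nls_hits = find_motifs(NLS_MOTIF, TOY_SEQUENCE)
print(sumo_hits, nls_hits)
```

Applied to the real protein sequence, the same scan would be expected to report the IKAD and PKKKR positions given above, though dedicated predictors also weigh sequence context and conservation.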

Homology / evolution

While the function of the gene is currently unknown, many mammals possess an ortholog of it. Several primate species have been found to possess an orthologous gene that shares 90% sequence identity with the human gene. [10] While the gene does not seem to have paralogs, it has homologs that have been conserved throughout its evolutionary history. Evidence that its function has been conserved comes from the promoter region, which has predicted SRY transcription factor binding sites conserved from zebrafish to humans. [14]

"Evolutionary History of CCDC166"

| Species | Gene name | Date of divergence [19] | Percent similarity [20] | Accession number |
| --- | --- | --- | --- | --- |
| Human | CCDC166 | 0 MYA | 100% | NP_001156386.1 |
| Chimpanzee | CCDC166 isoform 1 | 6.65 MYA | 98% | PNI46222.1 |
| Grey mouse lemur | CCDC166 | 74 MYA | 76% | XP_017516497.1 |
| Horse | CCDC166 | 96 MYA | 85% | XP_023504891.1 |
| Florida manatee | CCDC166 | 105 MYA | 79% | XP_004387488.1 |
| Japanese gecko | CCDC166-like protein | 312 MYA | 74% | XP_007444987.1 |
| Mallard duck | CCDC166-like protein | 312 MYA | 39% | ENSAPLG00000001712 |
| Mexican tetra | CCDC166 | 435 MYA | 26% | ENSAMXG00000003745.1 |
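Similarity to the human protein broadly falls as divergence time grows, but the trend is not strictly monotonic. The sketch below uses the values as read from the table to flag species that are more similar to human than some less diverged species; the helper name `trend_breaks` is illustrative, and the comparison makes no statistical claim.

```python
# (species, divergence from human in MYA, percent similarity) from the table.
ORTHOLOGS = [
    ("Human", 0.0, 100),
    ("Chimpanzee", 6.65, 98),
    ("Grey mouse lemur", 74.0, 76),
    ("Horse", 96.0, 85),
    ("Florida manatee", 105.0, 79),
    ("Japanese gecko", 312.0, 74),
    ("Mallard duck", 312.0, 39),
    ("Mexican tetra", 435.0, 26),
]


def trend_breaks(rows):
    """Species whose similarity exceeds that of a less diverged species."""
    ordered = sorted(rows, key=lambda row: row[1])
    breaks = []
    lowest_so_far = ordered[0][2]
    for name, _mya, similarity in ordered[1:]:
        if similarity > lowest_so_far:
            breaks.append(name)
        lowest_so_far = min(lowest_so_far, similarity)
    return breaks


exceptions = trend_breaks(ORTHOLOGS)
print(exceptions)
```

With these values, the horse and manatee proteins score higher than the more closely related grey mouse lemur, which may reflect sequence quality or lineage-specific change rather than a real reversal.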

Function / biochemistry

The function of the protein is currently unknown.

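Composition of CCDC166 [21]

| Amino acid | Number of occurrences | Percent composition |
| --- | --- | --- |
| Ala (A) | 57 | 13.0% |
| Arg (R) | 58 | 13.2% |
| Asn (N) | 5 | 1.1% |
| Asp (D) | 15 | 3.4% |
| Cys (C) | 2 | 0.5% |
| Gln (Q) | 32 | 7.3% |
| Glu (E) | 36 | 8.2% |
| Gly (G) | 21 | 4.8% |
| His (H) | 12 | 2.7% |
| Ile (I) | 6 | 1.4% |
| Leu (L) | 56 | 12.8% |
| Lys (K) | 11 | 2.5% |
| Met (M) | 6 | 1.4% |
| Phe (F) | 4 | 0.9% |
| Pro (P) | 26 | 5.9% |
| Ser (S) | 47 | 10.7% |
| Thr (T) | 12 | 2.7% |
| Trp (W) | 3 | 0.7% |
| Tyr (Y) | 5 | 1.1% |
| Val (V) | 25 | 5.7% |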
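The composition table can be checked against the protein length and molecular weight quoted in the Protein section. The sketch below sums the residue counts, recomputes the percentages, and estimates the mass from average residue masses. The mass values are standard textbook figures supplied here as an assumption (they do not come from the cited sources), and the estimate ignores any post-translational modification.

```python
# Residue counts taken from the composition table.
COUNTS = {
    "A": 57, "R": 58, "N": 5, "D": 15, "C": 2, "Q": 32, "E": 36, "G": 21,
    "H": 12, "I": 6, "L": 56, "K": 11, "M": 6, "F": 4, "P": 26, "S": 47,
    "T": 12, "W": 3, "Y": 5, "V": 25,
}

# Average residue masses in daltons (assumed standard values).
AVG_RESIDUE_MASS = {
    "A": 71.0788, "R": 156.1875, "N": 114.1038, "D": 115.0886,
    "C": 103.1388, "Q": 128.1307, "E": 129.1155, "G": 57.0519,
    "H": 137.1411, "I": 113.1594, "L": 113.1594, "K": 128.1741,
    "M": 131.1926, "F": 147.1766, "P": 97.1167, "S": 87.0782,
    "T": 101.1051, "W": 186.2132, "Y": 163.1760, "V": 99.1326,
}
WATER_MASS = 18.015  # one water is added back for the free chain termini


def percent_composition(counts):
    """Percentage of each residue, rounded to one decimal place."""
    total = sum(counts.values())
    return {aa: round(100 * n / total, 1) for aa, n in counts.items()}


def molecular_weight(counts):
    """Estimated average molecular weight in daltons."""
    return sum(AVG_RESIDUE_MASS[aa] * n for aa, n in counts.items()) + WATER_MASS


total_residues = sum(COUNTS.values())
percents = percent_composition(COUNTS)
mw_kda = molecular_weight(COUNTS) / 1000
print(total_residues, percents["R"], round(mw_kda, 1))
```

The counts sum to 439 residues and the estimate comes out close to 48.7 kDa, in line with the figures given for the human isoform.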

Interactions

The gene has been found to interact with FAT3, a tumor suppressor gene, as well as INTS2, a gene involved in snRNA processing and transcription. [22] Expression of CCDC166 has been shown to be affected by methylphenidate, but the mechanism of this interaction is not known. [10]

Clinical significance

CCDC166 has some single nucleotide variants that are associated with lung, liver, colon, thyroid, pancreatic, and testicular cancers. [23] However, the clinical significance of the protein has not yet been fully characterized.

References

  1. GRCh38: Ensembl release 89: ENSG00000255181, ENSG00000278749 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000098176 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Entrez Gene: Coiled-coil domain containing 166". Retrieved 2018-05-05.
  6. "CCDC166 coiled-coil domain containing protein 166 [Homo sapiens (human)]". Retrieved 2018-05-05.
  7. "Hs.730002 - CCDC166: Coiled-coil domain containing 166". Retrieved 2018-05-05.
  8. Rice P., Longden I. and Bleasby A. (2000) EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 16(6): 276-277. PubMed: 10827456. DOI: 10.1016/S0168-9525(00)02024-2
  9. Holger Dinkel, Kim Van Roey, Sushama Michael, Manjeet Kumar, Bora Uyar, Brigitte Altenberg, Vladislava Milchevskaya, Melanie Schneider, Helen Kühn, Annika Behrendt, Sophie Luise Dahl, Victoria Damerell, Sandra Diebel, Sara Kalman, Steffen Klein, Arne C. Knudsen, Christina Mäder, Sabina Merrill, Angelina Staudt, Vera Thiel, Lukas Welti, Norman E. Davey, Francesca Diella, Toby J. Gibson; ELM 2016—data update and new functionality of the eukaryotic linear motif resource, Nucleic Acids Research, Volume 44, Issue D1, 4 January 2016, Pages D294–D300, https://doi.org/10.1093/nar/gkv1291
  10. "CCDC166 coiled-coil domain containing protein 166". Retrieved 2018-05-05.
  11. "Conserved domains on [gi|347602472|sp|P0CW27.1]".
  12. Sunyaev S.R., Eisenhaber F., Rodchenkov I.V., Eisenhaber B., Tumanyan V.G., and Kuznetsov E.N. "PSIC: Profile extraction from sequence alignments with position-specific counts of independent observations" Protein Engineering (1999) 12, No.5, 387-394
  13. CCDC166. (n.d.). Retrieved April 02, 2018, from https://www.proteinatlas.org/ENSG00000255181-CCDC166/antibody
  14. Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T (2005) MatInspector and beyond: promoter analysis based on transcription factor binding sites. Bioinformatics 21, 2933-42
  15. MiRDB - MicroRNA Target Prediction And Functional Study Database. Retrieved April 02, 2018, from http://mirdb.org/
  16. Cheng TS, Chang LK, Howng SL, Lu PJ, Lee CI, Hong YR (February 2006). "SUMO-1 modification of centrosomal protein hNinein promotes hNinein nuclear localization". Life Sciences. 78 (10): 1114–20.
  17. Kalderon, D., Roberts, B. L., Richardson, W. D., & Smith, A. E. (1984). A short amino acid sequence able to specify nuclear location. Cell,39(3), 499-509. doi:10.1016/0092-8674(84)90457-4
  18. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Blom N, Sicheritz-Ponten T, Gupta R, Gammeltoft S, Brunak S. Proteomics: Jun;4(6):1633-49, review 2004.
  19. Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol Biol Evol 34 (7): 1812-1819
  20. Stephen F. Altschul, Thomas L. Madden, Alejandro A. Schäffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs", Nucleic Acids Res. 25:3389-3402.
  21. Gasteiger E., Hoogland C., Gattiker A., Duvaud S., Wilkins M.R., Appel R.D., Bairoch A.; Protein Identification and Analysis Tools on the ExPASy Server; (In) John M. Walker (ed): The Proteomics Protocols Handbook, Humana Press (2005). pp. 571-607
  22. "CCDC166". Retrieved 2018-05-05.
  23. "CCDC166". Retrieved 2018-05-05.