Coiled-coil domain-containing 37 (FLJ40083)

Coiled-coil domain-containing 37, also known as FLJ40083, is a protein that in humans is encoded by the CCDC37 gene (3q21.3). There is no confirmed function of CCDC37.

Gene

Locus

The human gene CCDC37 is found on chromosome 3 at the band 3q21.3. [1] It extends from base pairs 90,403,731 to 90,429,231, making the gene 25,500 base pairs long. It is located on the plus strand and contains 17 exons.[ citation needed ]
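The quoted gene length can be checked directly from the two coordinates. The following is a minimal sketch, assuming the coordinates are 1-based chromosome positions; the function name `span_length` is illustrative and does not come from any cited source. It shows that the stated 25,500 bp is the plain difference between the end and start positions, while counting both endpoints would give one base pair more.

```python
# Coordinates as quoted in the text (assumed to be 1-based positions).
GENE_START = 90_403_731
GENE_END = 90_429_231


def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Return the length of a genomic interval in base pairs.

    With inclusive=True both endpoints are counted, which adds one.
    """
    return end - start + (1 if inclusive else 0)


difference = span_length(GENE_START, GENE_END)
inclusive_length = span_length(GENE_START, GENE_END, inclusive=True)

print(difference)        # end minus start, the figure quoted in the text
print(inclusive_length)  # length if both endpoints are counted
```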

Homology

Paralogs

CCDC37 has a single known paralog in humans, CCDC38, which is located on chromosome 12. [2]

Orthologs

Orthologs of CCDC37 are found across a fairly broad range of taxa, including mammals, reptiles, birds, amphibians, fish, invertebrates, and fungi.[ citation needed ]

| Genus and species | Common name | Class | Accession | Percent identity |
|---|---|---|---|---|
| Pan troglodytes | Chimpanzee | Mammalia | XP_516716.3 | 99% |
| Otolemur crassicaudatus | Bushbaby | Mammalia | | 78% |
| Sus scrofa | Pig | Mammalia | XP_005666178.1 | 70% |
| Felis catus | Cat | Mammalia | XP_006929102.1 | 69% |
| Canis lupus | Dog | Mammalia | XP_005632343.1 | 68% |
| Mus musculus | Mouse | Mammalia | NP_776136.2 | 67% |
| Orcinus orca | Killer whale | Mammalia | XP_004285517.1 | 67% |
| Anolis carolinensis | Carolina anole | Reptilia | XP_003217822.1 | 54% |
| Python bivittatus | Burmese python | Reptilia | XP_007437868.1 | 53% |
| Chrysemys picta bellii | Painted turtle | Reptilia | XP_005288479 | 52% |
| Pseudopodoces humilis | Ground tit | Aves | XP_005522312.1 | 46% |
| Gallus gallus | Chicken | Aves | XP_425162.4 | 46% |
| Xenopus (Silurana) tropicalis | Western clawed frog | Amphibia | XP_002938271.2 | 43% |
| Astyanax mexicanus | Blind cave fish | Actinopterygii | XP_007253378.1 | 43% |
| Saccoglossus kowalevskii | Acorn worm | Enteropneusta | XP_002742365.1 | 43% |
| Ciona intestinalis | Vase tunicate | Ascidiacea | XP_002131495.1 | 43% |
| Aplysia californica | California sea slug | Gastropoda | XP_005108122.1 | 41% |
| Crassostrea gigas | Pacific oyster | Bivalvia | gbEKC37281.1 | 37% |
| Batrachochytrium dendrobatidis | Amphibian chytrid fungus | Chytridiomycetes | XP_006680088.1 | 37% |

Protein

Primary sequence

The gene encodes a protein called CCDC37. This protein is 611 amino acids in length and has a molecular weight of 71.1 kilodaltons and an isoelectric point (pI) of 6.7.[ citation needed ]
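As a rough plausibility check on the quoted molecular weight, protein mass is often approximated from length alone. The sketch below assumes an average residue mass of about 110 Da, which is a common rule of thumb and not a value taken from the cited sources; the function name `estimate_mass_kda` is illustrative. An exact figure would require the full amino acid sequence, which is not given here.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption, not a figure from the article's sources.
AVERAGE_RESIDUE_MASS_DA = 110.0

PROTEIN_LENGTH = 611  # residues, as quoted in the text


def estimate_mass_kda(n_residues: int,
                      avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kilodaltons from residue count alone."""
    return n_residues * avg_residue_mass_da / 1000.0


estimated_kda = estimate_mass_kda(PROTEIN_LENGTH)
print(f"Estimated mass: {estimated_kda:.2f} kDa")
```

The estimate falls within about 6% of the quoted 71.1 kDa, which is the kind of agreement a length-only approximation can be expected to give.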

Domains

CCDC37 contains a DUF4200 region located from amino acid 151 to 269. [1] There is no known function for DUF4200. CCDC37 also contains three coiled-coil domains at amino acids 164–203, 392–436, and 526–571. [3]

Post-translational modifications

The protein has several probable post-translational modifications. It contains four possible PEST sequences at amino acids 17–36, 293–304, 337–360, and 360–395. [4] It also contains a possible substrate site for N-acetyltransferase A at Ser2. [5]

Signal peptides

CCDC37 is predicted to localize to the nucleus by Reinhardt's method [6] (reliability 94.1%), using a bipartite nuclear localization signal starting at amino acid 155: KRQMFLLQYALDVKRRE. [7] CCDC37 also has a few predicted nuclear export signal residues at I232, L235, I239, and M551. [8]
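The quoted 17-residue sequence can be checked against the classic description of a bipartite nuclear localization signal. The sketch below assumes the commonly cited pattern of two basic residues (K or R), a 10-residue spacer, and a 5-residue window containing at least three basic residues. This pattern, the function name `find_bipartite_nls`, and the `offset` parameter are illustrative assumptions; they are not necessarily the rule applied by the prediction tool behind the cited result.

```python
import re

# Sequence and start position as quoted in the text.
NLS_CANDIDATE = "KRQMFLLQYALDVKRRE"
NLS_START = 155

# Assumed consensus: two basic residues, a 10-residue spacer, then five
# residues. The lookahead lets overlapping candidates be scanned.
BIPARTITE = re.compile(r"(?=([KR]{2}.{10})(.{5}))")


def find_bipartite_nls(seq: str, offset: int = 1, min_basic: int = 3):
    """Return (position, motif) pairs for candidate bipartite signals.

    `offset` is the protein position of the first residue in `seq`.
    A candidate is kept only if its last five residues contain at
    least `min_basic` lysines or arginines.
    """
    hits = []
    for match in BIPARTITE.finditer(seq):
        head, tail = match.group(1), match.group(2)
        basic_in_tail = sum(tail.count(residue) for residue in "KR")
        if basic_in_tail >= min_basic:
            hits.append((offset + match.start(), head + tail))
    return hits


hits = find_bipartite_nls(NLS_CANDIDATE, offset=NLS_START)
print(hits)
```

Under these assumptions the whole quoted sequence is the only candidate: KR, then the spacer QMFLLQYALD, then VKRRE with three basic residues.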

Expression

CCDC37 protein is widely expressed in Mus musculus, but mostly at low levels. Most areas that express CCDC37 have an expression level of 20–40%. Expression is highest in the trigeminal nerve, testis, medial olfactory epithelium, dorsal root ganglia, and trachea, at almost 75%. [9] In the brain of Mus musculus, CCDC37 is expressed in the cerebellum, medulla, and hippocampal formation. [10]

Between 20 and 30 days after birth in Mus musculus, CCDC37 expression increases from less than 50% to about 85%. [11]

In Rattus norvegicus, CCDC37 is highly expressed in oligodendrocyte progenitor cells (approximately 85%) but expressed at lower levels in oligodendrocytes themselves (approximately 40%). [12]

Interacting proteins

Transcription factors

There are many predicted transcription factor binding sites in the CCDC37 promoter.[ citation needed ] Below is a table of the best possibilities, which have high confidence values, evolutionary conservation, and/or multiple possible binding sites in the promoter.

| Transcription factor | Start | End | Strand | Sequence |
|---|---|---|---|---|
| Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | 115 | 125 | - | ggGAGGgatcg |
| Nuclear factor 1 | 185 | 205 | + | tgctTGGCacgtggcgaataa |
| E-box binding factors | 186 | 202 | - | ttcgccaCGTGccaagc |
| E-box binding factors | 187 | 203 | + | cttggCACGtggcgaat |
| Vertebrate homologues of enhancer of split complex | 187 | 201 | - | tcgccaCGTGccaag |
| Vertebrate homologues of enhancer of split complex | 188 | 202 | + | ttggCACGtggcgaa |
| CCAAT binding factors | 203 | 217 | + | taaaCCAAtcaggat |
| E-box binding factors | 288 | 304 | + | tgggccaGGCGctgtcc |
| E-box binding factors | 352 | 368 | + | ccggcccCGCGccctcc |
| Activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters | 353 | 363 | - | gcGCGGggccg |
| RNA polymerase II transcription factor II B | 358 | 364 | + | ccgCGCC |
| cAMP-responsive element binding proteins | 416 | 436 | - | tggctgTGACgccacaaaggc |
| SOX/SRY-sex/testis determining and related HMG box factors | 438 | 462 | + | cctgaAGAAtggttgccttggagac |
| X-box binding factors | 449 | 469 | - | ggtttacgtctccaAGGCaac |
| cAMP-responsive element binding proteins | 452 | 472 | - | ttgggtTTACgtctccaaggc |
| GLI-Kruppel family member GLI3 | 74 | 88 | - | atcgCCACccacact |
| "Negative" glucocorticoid response elements | 478 | 492 | - | cagctccaGGAGcag |
| Core promoter motif ten elements | 538 | 558 | - | gaccgcgAGCGcacccaccga |
| GC-Box factors SP1/GC | 564 | 580 | + | agagggGGCGcgcgggg |
| ZF5 POZ domain zinc finger | 566 | 580 | - | ccccgCGCGccccct |
| E2F-myc activator/cell cycle regulator | 566 | 582 | + | aggggGCGCgcggggtc |
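Entries in the table above that sit on opposite strands can be cross-checked against one another, because a minus-strand site read as its reverse complement must agree with any plus-strand site covering the same positions. The sketch below uses the two overlapping E-box entries (186–202 on the minus strand and 187–203 on the plus strand). It assumes that start and end are inclusive coordinates on the plus strand and that minus-strand sequences are written 5′ to 3′ on the minus strand. The helper names are illustrative.

```python
# Complement table that preserves the upper/lower case used in the table.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, keeping case."""
    return seq.translate(COMPLEMENT)[::-1]


def site_length_ok(start: int, end: int, seq: str) -> bool:
    """Check that inclusive coordinates match the sequence length."""
    return end - start + 1 == len(seq)


# Two overlapping E-box entries: (start, end, strand, sequence).
minus_site = (186, 202, "-", "ttcgccaCGTGccaagc")
plus_site = (187, 203, "+", "cttggCACGtggcgaat")

# The minus-strand site expressed on the plus strand (positions 186..202).
minus_on_plus = reverse_complement(minus_site[3])

# Both sites cover plus-strand positions 187..202.
overlap_from_minus = minus_on_plus[1:]   # drop position 186
overlap_from_plus = plus_site[3][:-1]    # drop position 203

print(minus_on_plus)
print(overlap_from_minus == overlap_from_plus)
```

Under these assumptions the two entries describe the same stretch of promoter sequence, with the CACGTG core read from either strand.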

Interactions

Three proteins have been found to interact with CCDC37 by physical association in a yeast two-hybrid screen: histone-lysine N-methyltransferase (SUV39H1), histone-lysine N-methyltransferase (SUV39H2), and lysine-specific histone demethylase 1A (KDM1A). [13]

Clinical significance

In a study of the genes expressed in lung squamous cell carcinomas, the promoter region of CCDC37 was found to be hypermethylated, causing downregulation of CCDC37 expression. [14] In a separate study, CCDC37 was found in mice in spatial and temporal regions associated with the hereditary congenital facial paresis (HCFP) gene. However, knockout experiments in mice indicated that CCDC37 was unlikely to be a causative agent for the HCFP phenotype. [15]

References

  1. "CCDC37 coiled-coil domain containing 37 [Homo sapiens (human)] - Gene". Ncbi.nlm.nih.gov. Retrieved 2015-03-07.
  2. "CCDC38 coiled-coil domain containing 38 [Homo sapiens (human)] - Gene". Ncbi.nlm.nih.gov. Retrieved 2015-03-07.
  3. Dinkel, H. The eukaryotic linear motif resource ELM: 10 years and counting. Nucleic Acids Res. 2014 Jan;42(Database issue):D259-66.
  4. "EMBOSS: epestfind". Emboss.bioinformatics.nl. Retrieved 2015-03-07.
  5. Kiemer, Lars, Kyrlov Bendtsen, and Blom, Nikolai. NetAcet: Prediction of N-terminal Acetylation Sites. Bioinformatics, 2004.
  6. A. Reinhardt and T. Hubbard, Nucleic Acids Res. 26, 2230, 1998
  7. Dingwall C, Robbins J, Dilworth SM, Roberts B, Richardson WD (Sep 1988). "The nucleoplasmin nuclear location sequence is larger and more complex than that of SV-40 large T antigen". J. Cell Biol. 107 (3): 841–9.
  8. Analysis and prediction of leucine-rich nuclear export signals Tanja la Cour, Lars Kiemer, Anne Mølgaard, Ramneek Gupta, Karen Skriver and Søren Brunak Protein Eng. Des. Sel., 17(6):527-36, 2004.
  9. "4632676 - GEO Profiles - NCBI". Ncbi.nlm.nih.gov. 2014-11-12. Retrieved 2015-03-07.
  10. Lein, E.S. et al. (2007) Genome-wide atlas of gene expression in the adult mouse brain, Nature 445: 168-176. doi: 10.1038/nature05453.
  11. "4786837 - GEO Profiles - NCBI". Ncbi.nlm.nih.gov. 2014-11-12. Retrieved 2015-03-07.
  12. "31253706 - GEO Profiles - NCBI". Ncbi.nlm.nih.gov. 2014-11-12. Retrieved 2015-03-07.
  13. Weimann, M. A Y2H-seq approach defines the human protein methyltransferase interactome. Nat Methods. 2013 Apr;10(4):339-42.
  14. Kwon, Yong-Jae PhD; Lee, Seog Joo MSc et al. Genome-Wide Analysis of DNA Methylation and the Gene Expression Change in Lung Cancer Journal of Thoracic Oncology: January 2012 - Volume 7 - Issue 1 - pp 20-33.
  15. "OMIM Entry - % 601471 - FACIAL PARESIS, HEREDITARY CONGENITAL, 1; HCFP1". Omim.org. Retrieved 2015-03-07.