CCDC92

Identifiers
Symbol: CCDC92
Alt. names: Limkain beta-2
UniProt: Q53HC0
Other data
Locus: Chr. 12 q24.31

CCDC92, or Limkain beta-2, is a protein which in humans is encoded by the CCDC92 gene. It is likely involved in DNA repair or reduction/oxidation reactions. The gene is ubiquitously expressed in humans and is highly conserved across animals. [1] [2]

Gene

The CCDC92 gene is located at cytogenetic location 12q24.31. It is 36,576 bases long, has nine exons, [3] and codes for a protein 331 amino acids long.

Protein

The protein CCDC92 (accession number NP_079416) is predicted to localize to the nucleus [4] in humans. It has one domain, coiled-coil domain 92, spanning amino acids 23-82, which has no known function. The protein is rich in histidine and glutamic acid and deficient in phenylalanine. It has a molecular weight of 37 kDa and an isoelectric point (pI) of 9.3, and it has no charged, hydrophobic, or transmembrane domains. CCDC92 has conserved predicted phosphorylation sites at S211, S325, T21, T52, T122, and Y130 [5] and conserved glycosylation sites at S183 and T244. [6]

Sequence

mtsphfssyd egpldvsmaa tnlenqlhsa qknllflqre hastlkglhs eirrlqqhct dltyeltvks seqtgdgtsk sselkkrcee leaqlkvken enaellkele qknamitvle ntikerekky leelkakshk ltllsseleq rastiaylts qlhaakkklm sssgtsdasp sgspvlasyk pappkdklpe tprrrmkksl saplhpefee vyrfgaesrk lllrepvdam pdptpfllar esaevhlike rplvippias drsgeqhspa rekphkahvg vahrihhatp  pqaqpevktl avdqvnggkv vrkhsgtdrt v
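The composition and mass figures quoted in the Protein section can be roughly cross-checked against the sequence above. The following is a minimal Python sketch, not the tool that produced the article's values. The helper names `composition` and `approx_mass_da` are hypothetical, and the mass uses standard average residue masses, so the result is only an estimate.

```python
from collections import Counter

# Human CCDC92 (NP_079416), copied from the Sequence section; whitespace is stripped.
RAW = """
mtsphfssyd egpldvsmaa tnlenqlhsa qknllflqre hastlkglhs eirrlqqhct
dltyeltvks seqtgdgtsk sselkkrcee leaqlkvken enaellkele qknamitvle
ntikerekky leelkakshk ltllsseleq rastiaylts qlhaakkklm sssgtsdasp
sgspvlasyk pappkdklpe tprrrmkksl saplhpefee vyrfgaesrk lllrepvdam
pdptpfllar esaevhlike rplvippias drsgeqhspa rekphkahvg vahrihhatp
pqaqpevktl avdqvnggkv vrkhsgtdrt v
"""
SEQ = "".join(RAW.split()).upper()

# Average residue masses in daltons (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.015  # one water is added back for the free chain termini


def composition(seq):
    """Return each residue's share of the sequence, in percent."""
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def approx_mass_da(seq):
    """Estimate the molecular mass in daltons from average residue masses."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


print(f"length: {len(SEQ)} aa")
for aa, pct in sorted(composition(SEQ).items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {pct:.1f}%")
print(f"approx. mass: {approx_mass_da(SEQ) / 1000:.1f} kDa")
```

Whether a residue counts as "rich" or "deficient" depends on the reference composition it is compared against, which this sketch does not attempt to model.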

Structure

A large alpha-helical section begins near the start of the protein and extends to about its midpoint, and two smaller helical sections lie near the end (see conceptual translation below).

Predicted tertiary structure (I-TASSER)

The tertiary structure of CCDC92 was predicted using I-TASSER, which reports moderate confidence in the reliability of this structure (C-score of -1.61). The predicted structure closely resembles an antiparallel domain of the protein PcsB in Streptococcus pneumoniae. PcsB is involved in cleaving the cell wall; however, the function of its antiparallel domain is unknown.

Expression

GEO profile of CCDC92 expression in humans. The blue dots indicate the percentile at which the protein was expressed in each tissue compared with all other proteins.

In humans, CCDC92 is expressed ubiquitously at a medium to high level. [2] In dogs and mice, it is also expressed ubiquitously, but at significantly lower levels. [7] [8]

Orthologs

Molecular clock of CCDC92: adjusted amino acid changes per 100 residues (m) plotted against millions of years since each species diverged from humans, which indicates how quickly the protein changes. Fibrinogen and cytochrome c are given as references.

CCDC92 has orthologs in species as distant as the acorn worm, [9] which diverged from humans about 750 million years ago. [10] The most highly conserved domain is coiled-coil domain 92, which spans amino acids 23-83 in humans. [11] This region has no known function and is not present in any other gene.

CCDC92 shares 54% similarity with the protein SPPG_05228 in the fungus Spizellomyces punctatus. [9] The Spizellomyces punctatus protein has an 8-amino-acid stretch (LKGLHSEI) that matches perfectly with part of the coiled-coil domain 92 of the human protein. This sequence is found primarily in proteins that are involved in reduction/oxidation reactions, some of which bind DNA. [12]
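The position of the LKGLHSEI stretch in the human protein can be checked directly against the listed sequence. The short Python sketch below does this; it assumes the domain boundaries 23-82 given in the Protein section, and the function name `find_motif` is hypothetical.

```python
# Domain boundaries (1-based, inclusive) as given in the Protein section.
DOMAIN_START, DOMAIN_END = 23, 82
MOTIF = "LKGLHSEI"

# First 90 residues of human CCDC92 (NP_079416), copied from the Sequence section.
N_TERMINUS = "".join("""
mtsphfssyd egpldvsmaa tnlenqlhsa qknllflqre hastlkglhs
eirrlqqhct dltyeltvks seqtgdgtsk sselkkrcee
""".split()).upper()


def find_motif(seq, motif):
    """Return the 1-based (start, end) of the first exact match, or None."""
    i = seq.find(motif)
    if i < 0:
        return None
    return i + 1, i + len(motif)


span = find_motif(N_TERMINUS, MOTIF)
print("motif span:", span)
if span is not None:
    inside = DOMAIN_START <= span[0] and span[1] <= DOMAIN_END
    print("inside coiled-coil domain 92:", inside)
```

An exact string match is enough here because the article describes a perfect match; finding the same stretch in other proteins would require a database search such as the BLAST query cited above.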

| Taxon | Common name | Divergence date | Accession # | Length | Identity | Similarity |
|---|---|---|---|---|---|---|
| Homo sapiens | Human | 0 mya | NP_079416 | 331 aa | 100% | 100% |
| Pan paniscus | Bonobo | 6.6 mya | XP_003812138 | 331 aa | 99% | 100% |
| Mus musculus | House mouse | 90.9 mya | NP_659068 | 314 aa | 87% | 92% |
| Ceratotherium simum simum | White rhino | 97.5 mya | XP_014642670 | 314 aa | 90% | 94% |
| Orycteropus afer | Aardvark | 105.0 mya | XP_007936428 | 276 aa | 69% | 76% |
| Meleagris gallopavo | Wild turkey | 320.5 mya | XP_010718371 | 315 aa | 86% | 92% |
| Alligator mississippiensis | American alligator | 320.5 mya | KQL90045 | 338 aa | 86% | 91% |
| Python bivittatus | Burmese python | 320.5 mya | XP_007424723 | 335 aa | 84% | 91% |
| Xenopus tropicalis | Western clawed frog | 355.7 mya | XP_012821424 | 357 aa | 76% | 87% |
| Salmo salar | Atlantic salmon | 429.6 mya | NP_001167260 | 332 aa | 71% | 83% |
| Callorhinchus milii | Australian ghostshark | 482.9 mya | XP_007905974 | 320 aa | 72% | 83% |
| Danio rerio | Zebrafish | 429.6 mya | NP_001032794 | 349 aa | 62% | 74% |
| Saccoglossus kowalevskii | Acorn worm | 747.8 mya | XP_002731228 | 316 aa | 34% | 55% |
| Spizellomyces punctatus | Spizellomycetales | 1302.5 mya | KNC99855 | 363 aa | 40% | 54% |
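The "adjusted amino acid changes per 100" plotted in the molecular clock figure can be illustrated with identity values from the table above. The sketch below assumes a common correction for multiple substitutions at the same site, m = -100 ln(1 - n/100), where n is the observed percent difference (100 minus identity). The article does not state which correction was used, so both the formula and the function name `corrected_changes` are assumptions.

```python
import math

# (taxon, divergence from humans in millions of years, percent identity),
# a subset of rows taken from the ortholog table.
ORTHOLOGS = [
    ("Homo sapiens", 0.0, 100),
    ("Pan paniscus", 6.6, 99),
    ("Mus musculus", 90.9, 87),
    ("Xenopus tropicalis", 355.7, 76),
    ("Danio rerio", 429.6, 62),
    ("Saccoglossus kowalevskii", 747.8, 34),
]


def corrected_changes(identity_pct):
    """Return m, the corrected number of changes per 100 residues.

    Assumed correction: m = -100 * ln(1 - n/100), with n = 100 - identity.
    """
    if not 0 < identity_pct <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    n = 100.0 - identity_pct
    if n == 0:
        return 0.0
    return -100.0 * math.log(1.0 - n / 100.0)


for taxon, mya, identity in ORTHOLOGS:
    m = corrected_changes(identity)
    print(f"{taxon:26s} {mya:7.1f} mya  m = {m:6.1f}")
```

Plotting m against divergence time for CCDC92 and for reference proteins would give the kind of comparison the figure describes; the slope of each line reflects how quickly that protein accumulates changes.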

Function

The precise function of CCDC92 is not known. However, its interacting proteins, conserved sequences, and nuclear localization suggest that it is likely to function in DNA repair.

Interacting Proteins

| Interacting protein | Protein function |
|---|---|
| CEP164 | DNA UV repair |
| CHGB | Unknown |
| UCH37 | DNA repair, recombination; breaks Lys-48 linked chains |
| RPN9 | Regulatory subunit for ATP-dependent degradation of ubiquitinated proteins |
| aspS | Attaches glutamate to tRNA |
| ppsA | Phosphorylates pyruvate to phosphoenolpyruvate |
| ASPP2 | Regulates apoptosis |
| TRIM27 | Mediates formation of Lys-48 linked polyubiquitin chains |
| ELAVL1 | RNA-binding protein that increases stability; involved in embryonic stem cell differentiation |

[13]

Clinical significance

GEO profile of CCDC92 for experiment GDS4006, showing the effect of a histone deacetylase inhibitor and a hypomethylating agent on CCDC92 levels.
GEO profile of CCDC92 for experiment GDS3049, showing expression levels in the presence and absence of a tyrosine kinase inhibitor (imatinib).

In large B-cell lymphoma cell lines, CCDC92 expression is increased in the presence of a histone deacetylase inhibitor (panobinostat) or a hypomethylating agent (decitabine). [14] Expression increases further when the two drugs are combined, by up to 10 percentile points. In a leukemia cell line, CCDC92 expression is also increased in the presence of a tyrosine kinase inhibitor, imatinib. [15]

These two changes could be significant if CCDC92 is involved in repairing damaged oncogenes. If that were the case, any of the drugs that increase CCDC92 expression might be used to raise its levels in the body so that more damaged DNA sequences are found and repaired.

References

  1. GeneCards page for CCDC92
  2. NCBI GEO Profile of CCDC92 for experiment GDS596 https://www.ncbi.nlm.nih.gov/geoprofiles/4697540
  3. NCBI Gene entry for CCDC92
  4. PSORT2 k-NN prediction for CCDC92
  5. Blom N, Gammeltoft S, Brunak S (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–62. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  6. Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, Lavrsen K, Dabelsteen S, Pedersen NB, Marcos-Silva L, Gupta R, Bennett EP, Mandel U, Brunak S, Wandall HH, Levery SB, Clausen H (May 2013). "Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology". The EMBO Journal. 32 (10): 1478–88. doi:10.1038/emboj.2013.79. PMC 3655468. PMID 23584533.
  7. NCBI GEO Profile for CCDC92 in experiment GDS3142
  8. NCBI GEO Profile of CCDC92 for experiment GDS4164
  9. NCBI BLAST query for Homo sapiens CCDC92 (Accession Number NP_079416)
  10. Time Tree query for Homo sapiens and Saccoglossus kowalevskii comparison
  11. NCBI accession number NP_079416
  12. NCBI BLAST query for sequence "LKGLHSEI"
  13. European Bioinformatics Institute PSICQUIC View Query for CCDC92
  14. NCBI GEO Profile of CCDC92 for experiment GDS4006 https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS4006:ILMN_1731107
  15. NCBI GEO Profile of CCDC92 for experiment GDS3049 https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS3049:218175_at
