CCDC92 | |||||||
---|---|---|---|---|---|---|---|
Identifiers | |||||||
Symbol | ? | ||||||
Alt. names | Limkain beta-2 | ||||||
UniProt | Q53HC0 | ||||||
Other data | |||||||
Locus | Chr. 12 q24.31 | ||||||
|
CCDC92, or Limkain beta-2, is a protein which in humans is encoded by the CCDC92 gene. It is likely involved in DNA repair or reduction/oxidation reactions. The gene is ubiquitously expressed in humans and is highly conserved across animals. [1] [2]
The CCDC92 gene is located at cytogenetic location 12q24.31. It is 36,576 bases long, has nine exons [3] and encodes a protein of 331 amino acids.
The protein CCDC92 (Accession Number: NP_079416) is found in the nucleus [4] in humans. It has one domain, coiled-coil domain 92, spanning amino acids 23-82, which has no known function. The protein is rich in histidine and glutamic acid and is deficient in phenylalanine. It has a molecular weight of 37 kDa and an isoelectric point (pI) of 9.3, and it has no charged, hydrophobic, or transmembrane domains. CCDC92 has conserved predicted phosphorylation sites at S211, S325, T21, T52, T122, and Y130 [5] and conserved glycosylation sites at S183 and T244. [6]
mtsphfssyd egpldvsmaa tnlenqlhsa qknllflqre hastlkglhs eirrlqqhct dltyeltvks seqtgdgtsk sselkkrcee leaqlkvken enaellkele qknamitvle ntikerekky leelkakshk ltllsseleq rastiaylts qlhaakkklm sssgtsdasp sgspvlasyk pappkdklpe tprrrmkksl saplhpefee vyrfgaesrk lllrepvdam pdptpfllar esaevhlike rplvippias drsgeqhspa rekphkahvg vahrihhatp pqaqpevktl avdqvnggkv vrkhsgtdrt v
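As a quick consistency check, the sequence above can be compared with the residue numbers quoted for the predicted modification sites. The sketch below is illustrative only: it assumes 1-based numbering from the initial methionine, and the helper name `check_sites` is hypothetical. Against the sequence as printed here, the length is 331 residues and every listed site matches its residue type except T52, where the printed sequence has isoleucine. This may reflect a different reference sequence or a transcription error in the site list, so it should be confirmed against the cited sources.

```python
# Human CCDC92 sequence exactly as printed in this article (NP_079416).
# The spaces between 10-residue blocks are removed before use.
SEQUENCE = (
    "mtsphfssyd egpldvsmaa tnlenqlhsa qknllflqre hastlkglhs eirrlqqhct "
    "dltyeltvks seqtgdgtsk sselkkrcee leaqlkvken enaellkele qknamitvle "
    "ntikerekky leelkakshk ltllsseleq rastiaylts qlhaakkklm sssgtsdasp "
    "sgspvlasyk pappkdklpe tprrrmkksl saplhpefee vyrfgaesrk lllrepvdam "
    "pdptpfllar esaevhlike rplvippias drsgeqhspa rekphkahvg vahrihhatp "
    "pqaqpevktl avdqvnggkv vrkhsgtdrt v"
).replace(" ", "").upper()

# Predicted phosphorylation and glycosylation sites quoted in the text.
SITES = ["S211", "S325", "T21", "T52", "T122", "Y130", "S183", "T244"]


def check_sites(sequence, sites):
    """Map each site label (e.g. 'S211') to the residue found at that
    1-based position in the sequence. Hypothetical helper for illustration."""
    return {site: sequence[int(site[1:]) - 1] for site in sites}


observed = check_sites(SEQUENCE, SITES)
# A site is a mismatch when the residue found differs from the label's letter.
mismatches = [site for site, residue in observed.items() if residue != site[0]]

print("length:", len(SEQUENCE))
print("mismatching sites:", mismatches)
```

The same pattern can be reused for any list of annotated positions, provided the numbering convention of the annotation source is known.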
A large alpha-helical region begins near the start of the protein and extends to roughly its midpoint; two smaller helical regions lie near the end (see conceptual translation below).
The tertiary structure of CCDC92 was predicted using I-TASSER and is shown to the right. I-TASSER has moderate confidence in the reliability of this structure (C-score of -1.61). The structure is remarkably similar to that of an antiparallel domain in the protein PcsB of Streptococcus pneumoniae. PcsB is involved in cleaving the cell wall; however, the function of its antiparallel domain is unknown.
In humans, CCDC92 is expressed ubiquitously at a medium to high level (shown right). [2] In dogs and mice, it is also expressed ubiquitously, but at significantly lower levels. [7] [8]
CCDC92 has orthologs in species as distant as the acorn worm, [9] which diverged from humans 750 million years ago. [10] The most highly conserved domain is the coiled-coil domain 92, which is amino acids 23-83 in humans. [11] This region has no known function and is not present in any other gene.
CCDC92 shares 54% similarity with the protein SPPG_05228 in the fungus Spizellomyces punctatus. [9] The Spizellomyces punctatus protein has an 8-amino-acid stretch (LKGLHSEI) which matches perfectly with part of the coiled-coil domain 92 of the human variant. This sequence is found primarily in proteins that are involved in reduction/oxidation reactions, some of which bind DNA. [12]
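The position of the LKGLHSEI stretch in the human protein can be located with a simple substring search. The following is a minimal sketch, assuming the first 90 residues of the human sequence as printed in this article and 1-based numbering; the variable names are illustrative. It reports where the motif starts and ends and whether that span falls inside the coiled-coil domain boundaries (23-82) quoted earlier.

```python
# First 90 residues of human CCDC92 as printed in this article,
# written in 10-residue blocks for readability.
N_TERMINUS = (
    "MTSPHFSSYD" "EGPLDVSMAA" "TNLENQLHSA" "QKNLLFLQRE" "HASTLKGLHS"
    "EIRRLQQHCT" "DLTYELTVKS" "SEQTGDGTSK" "SSELKKRCEE"
)

MOTIF = "LKGLHSEI"   # stretch shared with the S. punctatus protein
DOMAIN = (23, 82)    # coiled-coil domain 92 boundaries quoted in the text

index = N_TERMINUS.find(MOTIF)  # 0-based index, or -1 if the motif is absent
if index == -1:
    raise ValueError("motif not found in the supplied sequence")

start = index + 1               # convert to 1-based residue numbering
end = start + len(MOTIF) - 1
inside_domain = DOMAIN[0] <= start and end <= DOMAIN[1]

print(f"{MOTIF} spans residues {start}-{end}; inside domain: {inside_domain}")
```

An exact substring search is enough here because the match is described as perfect; comparing diverged orthologs would need an alignment tool instead.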
Taxon | Common name | Divergence date | Accession # | Length | Identity | Similarity |
Homo sapiens | Human | 0 mya | NP_079416 | 331 aa | 100% | 100% |
Pan paniscus | Bonobo | 6.6 mya | XP_003812138 | 331 aa | 99% | 100% |
Mus musculus | House mouse | 90.9 mya | NP_659068 | 314 aa | 87% | 92% |
Ceratotherium simum simum | White rhino | 97.5 mya | XP_014642670 | 314 aa | 90% | 94% |
Orycteropus afer | Aardvark | 105.0 mya | XP_007936428 | 276 aa | 69% | 76% |
Meleagris gallopavo | Wild turkey | 320.5 mya | XP_010718371 | 315 aa | 86% | 92% |
Alligator mississippiensis | American alligator | 320.5 mya | KQL90045 | 338 aa | 86% | 91% |
Python bivittatus | Burmese python | 320.5 mya | XP_007424723 | 335 aa | 84% | 91% |
Xenopus tropicalis | Western clawed frog | 355.7 mya | XP_012821424 | 357 aa | 76% | 87% |
Salmo salar | Atlantic salmon | 429.6 mya | NP_001167260 | 332 aa | 71% | 83% |
Danio rerio | Zebrafish | 429.6 mya | NP_001032794 | 349 aa | 62% | 74% |
Callorhinchus milii | Australian ghostshark | 482.9 mya | XP_007905974 | 320 aa | 72% | 83% |
Saccoglossus kowalevskii | Acorn worm | 747.8 mya | XP_002731228 | 316 aa | 34% | 55% |
Spizellomyces punctatus | Spizellomycetales | 1302.5 mya | KNC99855 | 363 aa | 40% | 54% |
The precise function of CCDC92 is not known. However, its interacting proteins, conserved sequences, and subcellular localization (nucleus) suggest that a likely function of CCDC92 is DNA repair.
Interacting Protein | Protein Function |
CEP164 | DNA UV repair |
CHGB | Unknown |
UCH37 | DNA repair, recombination; breaks Lys-48 linked chains |
RPN9 | Regulatory subunit for ATP-dependent degradation of ubiquitinated proteins |
aspS | Attaches aspartate to tRNA |
ppsA | Phosphorylates pyruvate to phosphoenolpyruvate |
ASPP2 | Regulates apoptosis |
TRIM27 | Mediates formation of Lys-48 linked polyubiquitin chains |
ELAVL1 | RNA-binding protein that increases stability; involved in embryonic stem cell differentiation |
In large B-cell lymphoma cell lines, CCDC92 expression is increased in the presence of a histone deacetylase inhibitor (panobinostat) or a hypomethylating agent (decitabine). [14] Expression increases further, by up to 10 percentiles, when the two drugs are combined. In a leukemia cell line, CCDC92 expression is also increased in the presence of the tyrosine-kinase inhibitor imatinib. [15]
These expression changes could be significant if CCDC92 is involved in repairing damaged oncogenes. If that were the case, any of the pharmaceuticals that increase CCDC92 expression could potentially be used to raise its levels in the body so that damaged DNA sequences are found and repaired.