C18orf63

Identifiers
Aliases: C18orf63, DKFZP781G0119, chromosome 18 open reading frame 63
External IDs: MGI: 4936900; HomoloGene: 124404; GeneCards: C18orf63
Orthologs

| Species | Human | Mouse |
| --- | --- | --- |
| Ensembl | ENSG00000206043 [1] | ENSMUSG00000117781 [2] |
| RefSeq (mRNA) | NM_001174123 | NM_001370919 |
| RefSeq (protein) | NP_001167594 | NP_001357848 |
| Location (UCSC) | Chr 18: 74.32 – 74.36 Mb | Chr 18: 84.82 – 84.85 Mb |
| PubMed search | [3] | [4] |

Chromosome 18 open reading frame 63 is a protein which in humans is encoded by the C18orf63 gene. [5] The function of this protein is not yet well understood. Research suggests that C18orf63 could be a potential biomarker for early-stage pancreatic cancer and breast cancer. [6] [7]

Gene

This gene is located at band 22, sub-band 3, on the long arm of chromosome 18 (18q22.3). It is composed of 5065 base pairs spanning from 74,315,875 to 74,359,187 bp on chromosome 18. [5] The gene has a total of 14 exons. [5] C18orf63 is also known by the alias DKFZP781G0119. [8] No isoforms exist for this gene. [5]

NCBI GEO Expression Profile for C18orf63

Expression

C18orf63 has high expression in the testis. [5] The gene shows low expression in the kidneys, liver, lung, and pelvis. [9] There is no phenotype associated with this gene. [5] [10]

Promoter

The promoter region for C18orf63 is 1163 bp long, starting at 74,314,813 bp and ending at 74,315,975 bp. [11] The promoter ID is GXP_4417391. The presence of multiple Y-box binding transcription factor and SRY transcription factor binding sites suggests that C18orf63 may be involved in male sex determination. [12]
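The coordinates quoted for the gene and its promoter can be checked with simple interval arithmetic. The sketch below is only an illustration: it assumes the positions are 1-based and inclusive, and the `Interval` class is a hypothetical helper, not part of any genome browser API. Under that assumption the promoter length comes out to the stated 1163 bp, the promoter overlaps the first 101 bp of the gene, and the gene coordinates span 43,313 bp of genomic sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed, 1-based interval on a single chromosome (assumed convention)."""
    start: int
    end: int

    def length(self) -> int:
        # Both endpoints are counted, hence the +1.
        return self.end - self.start + 1

    def overlap(self, other: "Interval") -> int:
        # Number of positions shared by the two intervals (0 if disjoint).
        lo = max(self.start, other.start)
        hi = min(self.end, other.end)
        return max(0, hi - lo + 1)


# Coordinates as quoted in the Gene and Promoter sections.
gene = Interval(74_315_875, 74_359_187)
promoter = Interval(74_314_813, 74_315_975)

print("promoter length:", promoter.length())             # 1163
print("gene span:", gene.length())                       # 43313
print("promoter/gene overlap:", promoter.overlap(gene))  # 101
```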

Protein

Amino acid composition of the average protein (left) and amino acid composition of C18orf63 (right)

The C18orf63 protein is composed of 685 amino acids and has a molecular weight of 77230.50 Da, with a predicted isoelectric point of 9.83. [5] [13] No isoforms exist for this protein. [14] Compared to the average protein, it is rich in glutamine, isoleucine, lysine, and serine but low in aspartic acid and glycine. [15] [16]
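A comparison like the one above rests on the percentage of each amino acid in the sequence, set against a background frequency table. The following is a minimal sketch of that calculation. The toy sequence and the uniform 5% background are placeholders chosen for illustration only; they are not the C18orf63 sequence or real average-protein frequencies, and the function names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq: str) -> dict[str, float]:
    """Percentage of each of the 20 standard amino acids in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


def fold_vs_background(seq: str, background: dict[str, float]) -> dict[str, float]:
    """Ratio of observed percentage to a background percentage per residue."""
    comp = composition(seq)
    return {aa: comp[aa] / background[aa] for aa in AMINO_ACIDS}


# Hypothetical 10-residue toy sequence, NOT the real protein.
toy_seq = "MKQISKSQIK"
# Placeholder background: every residue at 5%. Real tables differ.
uniform_background = {aa: 5.0 for aa in AMINO_ACIDS}

comp = composition(toy_seq)
ratios = fold_vs_background(toy_seq, uniform_background)
enriched = sorted(aa for aa, r in ratios.items() if r > 1.0)
depleted = sorted(aa for aa, r in ratios.items() if r < 1.0)
print("enriched:", enriched)
print("depleted:", depleted)
```

Values above 1.0 mark residues that are over-represented relative to the chosen background, and values below 1.0 mark under-represented ones.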

Structure

Partial 3D structure for C18orf63

The predicted secondary structure of this protein includes beta turns, beta strands, and alpha helices. About 48.6% of the protein is expected to form alpha helices and 28.6% is expected to form beta strands. [17] [18]

Domains and Motifs

Motifs and Domains for C18orf63

The protein contains one domain of unknown function, DUF 4708, spanning from the 7th amino acid to the 280th amino acid. [19] Predicted motifs include an N-terminal motif, an RxxL motif, and a KEN motif, all of which signal for protein degradation. [20] Also predicted are a Wxxx motif, which facilitates entrance of PTS1 cargo proteins into the organellar lumen, and an RVxPx motif, which allows protein transport from the trans-Golgi network to the plasma membrane of the cilia. [21] [22] There is also a bipartite nuclear localization signal at the end of the protein sequence. [23] No transmembrane domain is present, indicating that C18orf63 is not a transmembrane protein. [24]
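Short linear motifs such as RxxL, KEN, and RVxPx are written with "x" standing for any residue, so they translate directly into regular expressions. The sketch below shows one way such a scan could be done. It is not the method used by the cited prediction tools, and the 20-residue sequence is a made-up example rather than the C18orf63 sequence. A lookahead is used so that overlapping occurrences are also reported.

```python
import re

# Motif name -> regular expression ("." plays the role of "x").
MOTIFS = {
    "RxxL": r"R..L",
    "KEN": r"KEN",
    "RVxPx": r"RV.P.",
}


def find_motifs(seq: str) -> dict[str, list[tuple[int, str]]]:
    """Return 1-based start positions and matched text for each motif."""
    seq = seq.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # (?=(...)) matches at every position, including overlapping ones.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]
    return hits


# Hypothetical toy sequence for illustration only.
toy_seq = "MARSTLKENQRVAPGRKKLW"
hits = find_motifs(toy_seq)
for name, found in hits.items():
    print(name, found)
```

A pattern match only shows that the residues are present; whether a motif is functional depends on context that a regular expression cannot capture.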

Post-Translational Modifications

Post-translational modifications that the protein is predicted to undergo include SUMOylation, PKC and CK2 phosphorylation, N-glycosylation, amidation, and cleavage. [25] [26] [27] [28] There are six PKC phosphorylation sites, two CK2 phosphorylation sites, two SUMOylation sites, and two N-glycosylation sites. There are no signal peptides present in this sequence. [28]
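Site counts of this kind are commonly derived from consensus patterns. As a rough sketch, the code below counts candidate sites using the widely quoted PROSITE-style consensus patterns for PKC phosphorylation ([ST]-x-[RK]), CK2 phosphorylation ([ST]-x(2)-[DE]), and N-glycosylation (N-{P}-[ST]-{P}). Whether the cited tools used exactly these patterns is an assumption, and the toy sequence is hypothetical, so the counts printed here say nothing about C18orf63 itself.

```python
import re

# PROSITE-style consensus patterns rewritten as regular expressions.
# Assumed for illustration; the cited servers may apply other rules.
SITE_PATTERNS = {
    "PKC phosphorylation": r"[ST].[RK]",
    "CK2 phosphorylation": r"[ST]..[DE]",
    "N-glycosylation": r"N[^P][ST][^P]",
}


def count_sites(seq: str) -> dict[str, int]:
    """Count candidate sites per pattern, allowing overlapping matches."""
    seq = seq.upper()
    return {
        name: len(re.findall(f"(?={pattern})", seq))
        for name, pattern in SITE_PATTERNS.items()
    }


# Hypothetical toy sequence, not the real protein.
toy_seq = "MSARNGTASEEDKTLK"
site_counts = count_sites(toy_seq)
print(site_counts)
```

Such short patterns occur frequently by chance, so consensus matches are best read as candidates rather than confirmed modification sites.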

Subcellular Location

Because of the nuclear localization signal at the end of the protein sequence, C18orf63 is predicted to be nuclear. It has also been predicted to be targeted to the mitochondria. [29] [30] [31]

Homology

Orthologs

Orthologs have been found in most eukaryotes, with the exception of the class Amphibia. [14] No human paralogs exist for C18orf63. [14] [32] The most distant detectable homolog is found in Mizuhopecten yessoensis, sharing 37% identity with the human protein sequence. The domain of unknown function was the only homologous domain present in the protein sequence; it was found to be highly conserved in all orthologs. The table below shows some examples of orthologs for this protein.

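Table of Orthologs for C18orf63

| Group | Genus | Species | Common Name | Accession Number | Sequence Length | Sequence Identity | Sequence Similarity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Mammalia | Galeopterus | variegatus | Flying lemur | XP_008582575.1 | 677 | 78% | 87% |
| Mammalia | Fukomys | damarensis | Damara mole-rat | XP_019061329.1 | 654 | 70% | 81% |
| Mammalia | Equus | przewalskii | Przewalski's horse | XP_008534756.1 | 751 | 76% | 83% |
| Mammalia | Loxodonta | africana | African bush elephant | XP_023399495.1 | 676 | 73% | 83% |
| Mammalia | Chinchilla | lanigera | Long-tailed chinchilla | XP_005373135.1 | 679 | 74% | 83% |
| Aves | Corvus | cornix | Hooded crow | XP_019138065.2 | 743 | 52% | 69% |
| Aves | Sturnus | vulgaris | Common starling | XP_014726419.1 | 742 | 51% | 68% |
| Aves | Struthio | camelus | Southern ostrich | XP_009668441.1 | 741 | 44% | 62% |
| Aves | Phaethon | lepturus | White-tailed tropicbird | XP_010287785.1 | 740 | 44% | 60% |
| Aves | Nestor | notabilis | Kea | XP_010018784.1 | 741 | 43% | 60% |
| Reptilia | Ophiophagus | hannah | King cobra | ETE73844.1 | 671 | 55% | 69% |
| Reptilia | Anolis | carolinensis | Carolina anole | XP_008106943.1 | 719 | 48% | 66% |
| Reptilia | Pogona | vitticeps | Central bearded dragon | XP_020657479.1 | 676 | 52% | 70% |
| Reptilia | Chrysemys | picta | Painted turtle | XP_008162704.1 | 770 | 45% | 60% |
| Fish | Callorhinchus | milii | Australian ghostshark | XP_007901438.1 | 738 | 57% | 74% |
| Fish | Rhincodon | typus | Whale shark | XP_020370482.1 | 712 | 41% | 55% |
| Fish | Salmo | salar | Atlantic salmon | XP_0140366110.1 | 626 | 43% | 60% |
| Invertebrates | Stylophora | pistillata | Coral | XP_022802513.1 | 721 | 33% | 57% |
| Invertebrates | Acanthaster | planci | Crown of thorns starfish | XP_022082271.1 | 750 | 37% | 56% |
| Invertebrates | Mizuhopecten | yessoensis | Scallop | OWF48219.1 | 260 | 37% | 57% |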
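The sequence identity values in the table come from pairwise alignments against the human protein. As a rough illustration of what the percentage means, the sketch below computes identity from two already-aligned sequences, dividing identical columns by the alignment length. Other denominators are also in use, so this convention is an assumption, and the two short aligned strings are invented for the example. Sequence similarity additionally needs a substitution matrix to decide which mismatches count as conservative, which is not attempted here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts only if both rows carry the same non-gap residue.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical aligned fragments, 8 columns, 5 identical.
row_a = "MKT-LLSA"
row_b = "MKSALL-A"
identity = percent_identity(row_a, row_b)
print(f"{identity:.1f}% identity")  # 62.5% identity
```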
Rate of evolution for C18orf63 when compared to beta-globin, fibrinogen alpha, and cytochrome c

Rate of Evolution

C18orf63 is a moderately slowly evolving protein. It evolves faster than cytochrome c but slower than beta-globin. [14]

Interacting proteins

Transcription factors of interest predicted to bind to the regulatory sequence include p53 tumor suppressors, SRY testis determining factors, Y-box binding transcription factors, and glucocorticoid responsive elements. [11] The JUN protein was found to interact with C18orf63 through anti-bait co-immunoprecipitation. [33] The JUN protein binds to the USP28 promoter in colorectal cancer cells and is involved in the activation of these cancer cells. [34] [35]

Clinical significance

Mutations

A variety of missense mutations occur in the human population for this protein. In the regulatory sequence, mutations occur at two transcription factor binding sites. [32] The transcription factors affected are glucocorticoid responsive elements and E2F-myc cell cycle regulators. Eleven common mutations affect the protein sequence itself. [32] None of these mutations affect the predicted post-translational modifications of the protein sequence.

Disease association

C18orf63 has been associated with personality disorders, obesity, and type 2 diabetes through genome-wide association studies. [36] [37] [38] Research has not yet shown whether C18orf63 plays a direct role in any of these diseases.

References

  1. GRCh38: Ensembl release 89: ENSG00000206043 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000117781 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C18orf63 chromosome 18 open reading frame 63 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  6. Zheng H, Zhao C, Qian M, Roy S, Soherwardy A, Roy D, Kuruc M (30 September 2015). New Proteomic Workflows Combine Albumin Depletion and On-Bead Digestion, for Quantitative Cancer Serum (PDF). Biotech Support Group (Report). Application Report. Rutgers Center for Integrative Proteomics.
  7. Kuruc M (April 2016). The Commonality of the Cancer Serum Proteome Phenotype as analyzed by LC-MS/MS, and Its Application to Monitor Dysregulated Wellness. American Association for Cancer Research Annual Meeting 2016. New Orleans LA, USA. doi:10.13140/rg.2.2.23237.65765.
  8. "C18orf63 Gene". GeneCards. Retrieved 2018-02-19.
  9. EMBL-EBI Expression Atlas development team. "Search results < Expression Atlas < EMBL-EBI". www.ebi.ac.uk. Retrieved 2018-04-26.
  10. Cosmic. "C18orf63 Gene - COSMIC". cancer.sanger.ac.uk. Retrieved 2018-04-27.
  11. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Archived from the original on 2001-02-24. Retrieved 2018-04-27.
  12. "SRY gene". Genetics Home Reference. Retrieved 2018-05-05.
  13. "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2018-04-26.
  14. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2018-04-26.
  15. EMBL-EBI. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2018-05-01.
  16. "Amino Acid Frequency". www.tiem.utk.edu. Archived from the original on 2017-04-29. Retrieved 2018-05-01.
  17. Kumar TA. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2018-05-01.
  18. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2018-05-01.
  19. "uncharacterized protein C18orf63 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-04-26.
  20. Morgan DO (June 2013). "The D box meets its match". Molecular Cell. 50 (5): 609–10. doi:10.1016/j.molcel.2013.05.023. PMC 3702177. PMID 23746347.
  21. Neuhaus A, Kooshapur H, Wolf J, Meyer NH, Madl T, Saidowsky J, Hambruch E, Lazam A, Jung M, Sattler M, Schliebs W, Erdmann R (January 2014). "A novel Pex14 protein-interacting site of human Pex5 is critical for matrix protein import into peroxisomes". The Journal of Biological Chemistry. 289 (1): 437–48. doi:10.1074/jbc.M113.499707. PMC 3879566. PMID 24235149.
  22. Ou Y, Zhang Y, Cheng M, Rattner JB, Dobrinski I, van der Hoorn FA (2012). "Targeting of CRMP-2 to the primary cilium is modulated by GSK-3β". PLOS ONE. 7 (11): e48773. Bibcode:2012PLoSO...748773O. doi:10.1371/journal.pone.0048773. PMC 3504062. PMID 23185275.
  23. Nakai K, Horton P (January 1999). "PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization". Trends in Biochemical Sciences. 24 (1): 34–6. doi:10.1016/S0968-0004(98)01336-X. PMID 10087920.
  24. Möller S, Croning MD, Apweiler R (July 2001). "Evaluation of methods for the prediction of membrane spanning regions". Bioinformatics. 17 (7): 646–53. doi:10.1093/bioinformatics/17.7.646. PMID 11448883.
  25. "Motif Scan". myhits.isb-sib.ch. Retrieved 2018-04-27.
  26. "NetAcet 1.0 Server". www.cbs.dtu.dk. Retrieved 2018-04-27.
  27. "NetNGlyc 1.0 Server". www.cbs.dtu.dk. Retrieved 2018-04-27.
  28. Petersen TN, Brunak S, von Heijne G, Nielsen H (September 2011). "SignalP 4.0: discriminating signal peptides from transmembrane regions". Nature Methods. 8 (10): 785–6. doi:10.1038/nmeth.1701. PMID 21959131. S2CID 16509924.
  29. "Cell atlas - C18orf63 - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2018-05-01.
  30. "PSORT: Protein Subcellular Localization Prediction Tool". www.genscript.com. Retrieved 2018-05-01.
  31. "TargetP 1.1 Server". www.cbs.dtu.dk. Retrieved 2018-05-01.
  32. "Human BLAT Search". genome.ucsc.edu. Retrieved 2018-04-27.
  33. Li X, Wang W, Wang J, Malovannaya A, Xi Y, Li W, Guerra R, Hawke DH, Qin J, Chen J (January 2015). "Proteomic analyses reveal distinct chromatin-associated and soluble transcription factor complexes". Molecular Systems Biology. 11 (1): 775. doi:10.15252/msb.20145504. PMC 4332150. PMID 25609649.
  34. "JUN - Transcription factor AP-1 - Homo sapiens (Human) - JUN gene & protein". www.uniprot.org. Retrieved 2018-05-01.
  35. Serra RW, Fang M, Park SM, Hutchinson L, Green MR (March 2014). "A KRAS-directed transcriptional silencing pathway that mediates the CpG island methylator phenotype". eLife. 3: e02313. doi:10.7554/eLife.02313. PMC 3949416. PMID 24623306.
  36. Terracciano A, Sanna S, Uda M, Deiana B, Usala G, Busonero F, Maschio A, Scally M, Patriciu N, Chen WM, Distel MA, Slagboom EP, Boomsma DI, Villafuerte S, Sliwerska E, Burmeister M, Amin N, Janssens AC, van Duijn CM, Schlessinger D, Abecasis GR, Costa PT (June 2010). "Genome-wide association scan for five major dimensions of personality". Molecular Psychiatry. 15 (6): 647–56. doi:10.1038/mp.2008.113. PMC 2874623. PMID 18957941.
  37. Comuzzie AG, Cole SA, Laston SL, Voruganti VS, Haack K, Gibbs RA, Butte NF (2012). "Novel genetic loci identified for the pathophysiology of childhood obesity in the Hispanic population". PLOS ONE. 7 (12): e51954. Bibcode:2012PLoSO...751954C. doi:10.1371/journal.pone.0051954. PMC 3522587. PMID 23251661.
  38. Saxena R, Voight BF, Lyssenko V, Burtt NP, de Bakker PI, Chen H, Roix JJ, Kathiresan S, Hirschhorn JN, Daly MJ, Hughes TE, Groop L, Altshuler D, Almgren P, Florez JC, Meyer J, Ardlie K, Bengtsson Boström K, Isomaa B, Lettre G, Lindblad U, Lyon HN, Melander O, Newton-Cheh C, Nilsson P, Orho-Melander M, Råstam L, Speliotes EK, Taskinen MR, Tuomi T, Guiducci C, Berglund A, Carlson J, Gianniny L, Hackett R, Hall L, Holmkvist J, Laurila E, Sjögren M, Sterner M, Surti A, Svensson M, Svensson M, Tewhey R, Blumenstiel B, Parkin M, Defelice M, Barry R, Brodeur W, Camarata J, Chia N, Fava M, Gibbons J, Handsaker B, Healy C, Nguyen K, Gates C, Sougnez C, Gage D, Nizzari M, Gabriel SB, Chirn GW, Ma Q, Parikh H, Richardson D, Ricke D, Purcell S (June 2007). "Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels". Science. 316 (5829): 1331–6. Bibcode:2007Sci...316.1331.. doi:10.1126/science.1142358. PMID 17463246. S2CID 26332244.