C6orf136

Identifiers
Aliases: C6orf136, chromosome 6 open reading frame 136
External IDs: MGI: 1916912; HomoloGene: 17027; GeneCards: C6orf136
Orthologs (human and mouse)
RefSeq (mRNA), human: NM_001109938, NM_001161376, NM_145029
RefSeq (mRNA), mouse: NM_001033630
RefSeq (protein), human: NP_001103408, NP_001154848, NP_659466
RefSeq (protein), mouse: n/a
Location (UCSC), human: Chr 6: 30.65 – 30.65 Mb
Location (UCSC), mouse: Chr 17: 36.2 – 36.21 Mb
PubMed search: [3] (human), [4] (mouse)

C6orf136 (chromosome 6 open reading frame 136) is a protein that in humans (Homo sapiens) is encoded by the C6orf136 gene. The gene is conserved in mammals, mollusks, and some porifera. [5] While the function of the gene is currently unknown, C6orf136 has been shown to be hypermethylated in response to FOXM1 expression in head and neck squamous cell carcinoma (HNSCC) tissue. [6] Additionally, elevated expression of C6orf136 has been associated with improved survival rates in patients with bladder cancer. [7] C6orf136 has three known isoforms.


Gene

Background

C6orf136 is also known as DADB-129D20.1, MGC15854, LOC221545, and OTTHUMP00000214979. It is a poorly characterized protein-coding gene in need of further research. The C6orf136 mRNA sequence is available from NCBI under accession number NM_001109938.3.

Location

C6orf136 is located on the short arm of chromosome 6 (6p21.33), starting at base pair (bp) 30,647,133 and ending at bp 30,653,207. The gene spans 6,074 bp on the plus (+) strand and contains a total of 6 exons. [8]
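
The reported span can be reproduced from the two coordinates. The following is a minimal sketch of that arithmetic, assuming the positions quoted above are the start and end coordinates of the gene; the variable names are illustrative and not taken from NCBI.

```python
# Coordinates quoted above for C6orf136 on chromosome 6 (plus strand).
GENE_START = 30_647_133
GENE_END = 30_653_207

# Simple difference between the end and start coordinates.
span_difference = GENE_END - GENE_START

# Length when both endpoint bases are counted (inclusive convention).
span_inclusive = GENE_END - GENE_START + 1

print(f"end - start      = {span_difference:,} bp")
print(f"inclusive length = {span_inclusive:,} bp")
```

The 6,074 bp figure corresponds to the simple difference; counting both endpoint bases gives one base more.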

Gene Neighborhood

Genes in the neighborhood of C6orf136 are the following: ATAT1, PPP1R10, DHX16, PPP1R18, MDC1, MRPS18B, TUBB, and FLOT1. [8]

mRNA

C6orf136 has a total of 3 isoforms. Isoform 1 is the base version of C6orf136 and encodes the 315 amino acid protein. Isoform 3 uses an alternate in-frame splice site in the 5' coding region compared to isoform 1, which makes isoform 3 longer than isoform 1. In contrast, isoform 2 lacks an alternate in-frame exon in the 5' coding region compared to isoform 1, which makes isoform 2 shorter than isoform 1.

Protein

General Properties

The protein sequence of C6orf136 isoform 1 per NCBI is as follows: [9]

MYQPSRGAARRLGPCLRAYQARPQDQLYPGTLPFPPLWPHSTTTTSPSSPLFWSPLPPRLPTQRLPQVPP  70
LPLPQIQALSSAWVVLPPGKGEEGPGPELHSGCLDGLRSLFEGPPCPYPGAWIPFQVPGTAHPSPATPSG 140
DPSMEEHLSVMYERLRQELPKLFLQSHDYSLYSLDVEFINEILNIRTKGRTWYILSLTLCRFLAWNYFAH 210
LRLEVLQLTRHPENWTLQARWRLVGLPVHLLFLRFYKRDKDEHYRTYDAYSTFYLNSSGLICRHRLDKLM 280
PSHSPPTPVKKLLVGALVALGLSEPEPDLNLCSKP                                    315

This sequence contains a domain of unknown function (DUF2358), spanning aa149 to aa274, that is found in all three isoforms of C6orf136.

The C6orf136 protein has a molecular weight of 35.8 kDa and an isoelectric point of 8.99, making the protein slightly basic at physiological pH.

Domains

DUF2358 is a domain of unknown function found within the C6orf136 protein from aa149 to aa274. [10] This domain lies in the C-terminal region, is highly conserved, and is evolutionarily conserved from plants to humans. [11] Additionally, a proline-rich domain was predicted from aa29 to aa142 of the human C6orf136 protein. [10]
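
The domain boundaries above can be applied directly to the isoform 1 sequence. The sketch below slices out both regions and compares proline content, assuming the boundaries are 1-based and inclusive. The helper names `region` and `fraction` are illustrative, and this is not the method used by Motif Scan or Pfam.

```python
# Isoform 1 protein sequence as listed under General Properties.
SEQ = (
    "MYQPSRGAARRLGPCLRAYQARPQDQLYPGTLPFPPLWPHSTTTTSPSSPLFWSPLPPRLPTQRLPQVPP"
    "LPLPQIQALSSAWVVLPPGKGEEGPGPELHSGCLDGLRSLFEGPPCPYPGAWIPFQVPGTAHPSPATPSG"
    "DPSMEEHLSVMYERLRQELPKLFLQSHDYSLYSLDVEFINEILNIRTKGRTWYILSLTLCRFLAWNYFAH"
    "LRLEVLQLTRHPENWTLQARWRLVGLPVHLLFLRFYKRDKDEHYRTYDAYSTFYLNSSGLICRHRLDKLM"
    "PSHSPPTPVKKLLVGALVALGLSEPEPDLNLCSKP"
)


def region(seq, start, end):
    """Return residues start..end using 1-based, inclusive positions."""
    return seq[start - 1:end]


def fraction(seq, residue):
    """Fraction of positions in seq occupied by the given residue."""
    return seq.count(residue) / len(seq)


proline_rich = region(SEQ, 29, 142)   # predicted proline-rich domain
duf2358 = region(SEQ, 149, 274)       # DUF2358 domain

print(f"full length: {len(SEQ)} aa")
print(f"proline-rich region: {len(proline_rich)} aa, "
      f"proline fraction {fraction(proline_rich, 'P'):.2f}")
print(f"DUF2358: {len(duf2358)} aa, "
      f"proline fraction {fraction(duf2358, 'P'):.2f}")
print(f"whole protein proline fraction {fraction(SEQ, 'P'):.2f}")
```

A simple composition count like this only shows that the aa29 to aa142 region is enriched in proline relative to the whole protein; it says nothing about function.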

Schematic illustration of the C6orf136 protein with proline-rich domain and DUF2358 domain. The gray markers indicate predicted phosphorylation sites, and the red marker indicates a predicted SUMOylation site. Image made with Prosite MyDomains tool.

Structure

Secondary Structure

The conserved DUF2358 domain of C6orf136 is predicted to contain an equal mix of interspersed alpha helices and beta sheets. [12] [13] [14] The N-terminus of the protein is predicted to contain primarily alpha helices, but is poorly conserved across species.

Tertiary Structure

The tertiary structure predicted by I-TASSER shows loosely wound, primarily alpha-helical structure in the N-terminus of the protein, followed by a densely packed and folded region corresponding to the DUF2358 domain that contains a mix of alpha helices and beta sheets. [15] [16] [17]

Regulation

Gene Regulation

Promoter

C6orf136 has 5 predicted promoter regions. The GXP_6051617 promoter had the largest number of transcripts and CAGE tags. It is located on the plus (+) strand, starts at position 30646644, ends at position 30647460, and is 817 bp in length. It also has 12 coding transcripts in total. [18]

Schematic diagram of the C6orf136 mRNA transcript with the ElDorado suggested promoter sites and exons labelled. Regions are not drawn to scale.
Promoter Regions of C6orf136

| Promoter ID | Start Position | End Position | Length (bp) | # of Coding Transcripts |
|---|---|---|---|---|
| GXP_6051617 (+) | 30646644 | 30647460 | 817 | 12 |
| GXP_2563514 (+) | 30648906 | 30649945 | 1040 | 1 |
| GXP_6051618 (+) | 30650054 | 30651093 | 1040 | 1 |
| GXP_6051619 (+) | 30650266 | 30651423 | 1158 | 2 |
| GXP_3204858 (+) | 30651611 | 30652650 | 1040 | 0 |
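
The lengths in the promoter table can be checked against the start and end positions. The sketch below assumes the table uses inclusive coordinates (length = end - start + 1); that convention is inferred from the numbers themselves and is not stated by the source. The `inclusive_length` helper is illustrative.

```python
# (promoter ID, start, end, reported length in bp) from the promoter table.
PROMOTERS = [
    ("GXP_6051617", 30646644, 30647460, 817),
    ("GXP_2563514", 30648906, 30649945, 1040),
    ("GXP_6051618", 30650054, 30651093, 1040),
    ("GXP_6051619", 30650266, 30651423, 1158),
    ("GXP_3204858", 30651611, 30652650, 1040),
]


def inclusive_length(start, end):
    """Length of a region when both endpoint positions are counted."""
    return end - start + 1


for promoter_id, start, end, reported in PROMOTERS:
    computed = inclusive_length(start, end)
    status = "matches" if computed == reported else "differs"
    print(f"{promoter_id}: computed {computed} bp, "
          f"reported {reported} bp ({status})")
```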

Transcription Factor Binding Sites

The following table lists the transcription factor families most likely to bind the GXP_6051617 promoter of C6orf136. [18]

| Matrix Family | Detailed Family Information |
|---|---|
| V$ZF15 | C2H2 zinc finger transcription factors 15 |
| V$NRF1 | Nuclear respiratory factor 1 |
| V$MYBL | Cellular and viral myb-like transcriptional regulators |
| V$CALM | Calmodulin-binding transcription factors |
| V$ZF07 | C2H2 zinc finger transcription factors 7 |
| V$ZF5F | ZF5 POZ domain zinc finger |
| V$HAND | Twist subfamily of class B bHLH transcription factors |
| V$KLFS | Krueppel like transcription factors |
| V$SP1F | GC-Box factors SP1/GC |
| V$EGRF | EGR/nerve growth factor induced protein C & related factors |
| V$PLAG | Pleomorphic adenoma gene |
| V$EBOX | E-box binding factors |
| V$RXRF | RXR heterodimer binding sites |
| V$RREB | Ras-responsive element binding protein |
| V$NKXH | NKX homeodomain factors |
| V$ETSF | Human and murine ETS1 factors |
| V$CEBP | Ccaat/Enhancer Binding Protein |

Expression Pattern

C6orf136 is highly expressed in heart, intestine, brain, and kidney tissue. [8] According to AceView, it is well expressed, at 1.3 times the average gene expression level. [19]

Transcription Regulation

Stem Loop Prediction

The 3' UTR sequence is predicted to have a total of 7 stem loops and a single potential miRNA binding site. In contrast, the 5' UTR has only 2 stem loops and contains no other notable regions. [20]

miRNA Targeting

TargetScan indicated a single hsa-miR-585-3p binding site in the 3' UTR. This miRNA has been associated with tumor-suppressing properties in gastric cancer. [21] [22]

Protein Regulation

Subcellular Localization

C6orf136 is predicted to localize primarily to the nucleus in Homo sapiens, but primarily to the mitochondria in other species. [23]

Post-Translational Modification

The C6orf136 protein has 8 predicted kinase-specific phosphorylation sites at positions 5, 28, 137, 139, 191, 256, 261, and 303; 4 of these sites are serines, 3 are threonines, and 1 is a tyrosine. [24] The protein also has a single predicted SUMOylation site on a lysine at position 247, with a p-value of 0.063. [25]
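
The residue type at each predicted site can be read straight from the isoform 1 sequence given under General Properties. The sketch below does that lookup, assuming the positions are 1-based. The `residue_at` helper is illustrative, and nothing here reproduces the GPS or GPS-SUMO predictions themselves.

```python
from collections import Counter

# Isoform 1 protein sequence as listed under General Properties.
SEQ = (
    "MYQPSRGAARRLGPCLRAYQARPQDQLYPGTLPFPPLWPHSTTTTSPSSPLFWSPLPPRLPTQRLPQVPP"
    "LPLPQIQALSSAWVVLPPGKGEEGPGPELHSGCLDGLRSLFEGPPCPYPGAWIPFQVPGTAHPSPATPSG"
    "DPSMEEHLSVMYERLRQELPKLFLQSHDYSLYSLDVEFINEILNIRTKGRTWYILSLTLCRFLAWNYFAH"
    "LRLEVLQLTRHPENWTLQARWRLVGLPVHLLFLRFYKRDKDEHYRTYDAYSTFYLNSSGLICRHRLDKLM"
    "PSHSPPTPVKKLLVGALVALGLSEPEPDLNLCSKP"
)

# Predicted site positions quoted in the text (assumed 1-based).
PHOSPHO_SITES = [5, 28, 137, 139, 191, 256, 261, 303]
SUMO_SITE = 247


def residue_at(seq, position):
    """Return the one-letter residue at a 1-based position."""
    return seq[position - 1]


site_residues = {pos: residue_at(SEQ, pos) for pos in PHOSPHO_SITES}
counts = Counter(site_residues.values())

for pos, aa in site_residues.items():
    print(f"phosphorylation site {pos}: {aa}")
print("residue counts:", dict(counts))
print(f"SUMOylation site {SUMO_SITE}: {residue_at(SEQ, SUMO_SITE)}")
```

A lookup of this kind is a quick consistency check between a list of predicted positions and the residue types reported for them.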

Homology

Paralogs

Relative mutation rate of C6orf136 (blue) compared to fibrinogen alpha (grey) and cytochrome C (orange).

No paralogs of C6orf136 have been detected in the human genome.

Orthologs

Below is a table of selected orthologs of the C6orf136 gene, including closely and distantly related orthologs. [26] C6orf136 has evolved at a moderate and even rate over time, faster than cytochrome c but slower than fibrinogen alpha.

Selected Orthologs of C6orf136

| Genus and Species | Common Name | Taxon Class | Date of Divergence (MYA) | Accession # | Length (AA) | % Identity with Human | % Similarity with Human |
|---|---|---|---|---|---|---|---|
| Homo sapiens | Humans | Primates | 0 | NP_001103408.1 | 315 | 100% | 100% |
| Pan troglodytes | Chimpanzee | Primates | 6.4 | PNI76372.1 | 315 | 100% | 100% |
| Mus musculus | Mouse | Rodentia | 89 | EDL23245.1 | 315 | 80% | 87% |
| Chiroxiphia lanceolata | Lance-tailed manakin | Passerine | 318 | XP_032533412.1 | 384 | 60% | 76% |
| Chelonia mydas | Sea Turtle | Testudines | 318 | XP_007068287.2 | 386 | 63% | 74% |
| Gopherus evgoodei | Gopher tortoise | Testudines | 318 | XP_030399707.1 | 320 | 60% | 72% |
| Melopsittacus undulatus | Parakeet | Psittaciformes | 318 | XP_033929477.1 | 288 | 61% | 76% |
| Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 351.7 | XP_033771275.1 | 416 | 56% | 70% |
| Danio rerio | Zebrafish | Cypriniformes | 433 | NP_001076315.1 | 423 | 49% | 70% |
| Apostichopus japonicus | Sea cucumber | Synallactida | 627 | PIK49576.1 | 376 | 41% | 59% |
| Strongylocentrotus purpuratus | Sea Urchin | Echinoida | 627 | XP_030853574.1 | 518 | 38% | 56% |
| Branchiostoma floridae | Lancelet | Lancelet | 637 | XP_035683876.1 | 460 | 45% | 64% |
| Aplysia californica | Sea hare | Aplysiidae | 736 | XP_005104721.2 | 409 | 25% | 50% |
| Anopheles darlingi | Malaria mosquito | Diptera | 736 | ETN63757.1 | 303 | 36% | 53% |
| Crassostrea virginica | Oyster | Ostreoida | 736 | XP_022320078.1 | 359 | 27% | 44% |
| Ixodes scapularis | Ticks | Ixodida | 736 | XP_029848376.1 | 352 | 35% | 51% |
| Mytilus coruscus | Hard-shelled mussel | Mytilida | 736 | CAC5413351.1 | 363 | 33% | 59% |
| Pomacea canaliculata | Channeled applesnail | Mollusca | 736 | XP_025112199.1 | 286 | 24% | 39% |
| Wasmannia auropunctata | Electric ant | Hymenoptera | 736 | XP_011701036.1 | 387 | 36% | 56% |
| Trichoplax adhaerens | Trichoplax | Tricoplaciformes | 747 | XP_002109420.1 | 415 | 34% | 57% |
| Amphimedon queenslandica | Porifera | Porifera | 777 | XP_019852039.1 | 303 | 33% | 7% |
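
One rough way to look at how identity with the human protein falls off with divergence time is to correlate the two columns of the ortholog table. The sketch below computes a Pearson correlation over the (divergence date, % identity) pairs from that table. It is an illustrative check only, not the method behind the mutation-rate comparison with cytochrome c and fibrinogen alpha, and the `pearson` helper is written here for the example.

```python
from math import sqrt

# (divergence in MYA, % identity with human) pairs from the ortholog table,
# in table order.
ORTHOLOGS = [
    (0, 100), (6.4, 100), (89, 80),
    (318, 60), (318, 63), (318, 60), (318, 61),
    (351.7, 56), (433, 49),
    (627, 41), (627, 38), (637, 45),
    (736, 25), (736, 36), (736, 27), (736, 35),
    (736, 33), (736, 24), (736, 36),
    (747, 34), (777, 33),
]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


r = pearson(ORTHOLOGS)
print(f"Pearson r between divergence date and % identity: {r:.2f}")
```

A strongly negative coefficient is what a steady decline in identity with divergence time would produce, though a correlation over 21 hand-picked orthologs is only a coarse indication.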

Function

Proteins Interacting with C6orf136

| Protein | Function | Method | Databases Present in | Total # of appearances |
|---|---|---|---|---|
| CSNK2B | Localized to ER and Golgi, and involved with regulating metabolic pathways, signal transduction, transcription, translation, and replication. [27] | Y2H | iRefIndex; MINT; IMEx; mentha | 13 |
| PLK1 | Regulates the cell cycle, specifically the G2/M transition. Loss of PLK1 expression can induce pro-apoptotic pathways. This oncogene is being studied as a target for cancer drugs, specifically in colon and lung cancers that are dependent on PLK1. Possible leukemia involvement. [28] | Y2H | iRefIndex; MINT; InnateDB-ALL; IMEx; mentha | 11 |
| RBM8A | Found predominantly in the nucleus, but also in the cytoplasm. Associated with the mRNAs produced after splicing, and thought to act as a tag to indicate where introns were present, thus coupling pre- and post-mRNA binding events. [29] | Y2H; Affinity Chromatography; Anti-Tag Coimmunoprecipitation | iRefIndex; InnateDB-All; MatrixDB; IntAct; IMEx; mentha | 6 |
| KIF21A | Kinesin-like protein (motor protein). Could be involved in microtubule-dependent transport. Mutation of this gene results in fibrosis of extraocular muscles. Not much else is currently known about this gene. [30] | Affinity Chromatography; Anti-Tag Coimmunoprecipitation | MatrixDB; IntAct; IMEx; mentha | 4 |
| FBXW7 | Encodes a member of the F-box protein family. Mutations in this gene are associated with a variety of cancers (cholangiocarcinoma, endometrial carcinoma, colorectal carcinoma, bladder cancer, gastric carcinoma, lung squamous cell carcinoma, etc.), so it is likely that this gene plays a role in the pathogenesis of human cancers. [31] | Genetic Interference | InnateDB | 1 |

References

  1. GRCh38: Ensembl release 89: ENSG00000233641, ENSG00000233164, ENSG00000237012, ENSG00000237100, ENSG00000204564, ENSG00000224120, ENSG00000206487 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000050705 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C6orf136 orthologs". NCBI. Retrieved 2020-09-30.
  6. Hwang S, Mahadevan S, Qadir F, Hutchison IL, Costea DE, Neppelberg E, et al. (December 2013). "Identification of FOXM1-induced epigenetic markers for head and neck squamous cell carcinomas". Cancer. 119 (24): 4249–58. doi:10.1002/cncr.28354. PMID 24114764.
  7. Tao T, Yuan S, Liu J, Shi D, Peng M, Li C, Wu S (February 2020). "Cancer stem cell-specific expression profiles reveal emerging bladder cancer biomarkers and identify circRNA_103809 as an important regulator in bladder cancer". Aging. 12 (4): 3354–3370. doi:10.18632/aging.102816. PMC 7066924. PMID 32065779.
  8. "C6orf136 chromosome 6 open reading frame 136 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-10-23.
  9. "uncharacterized protein C6orf136 isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-10-24.
  10. "Motif Scan". myhits.sib.swiss. Retrieved 2020-12-14.
  11. "Pfam: Family: DUF2358 (PF10184)". pfam.xfam.org. Retrieved 2020-12-14.
  12. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-12-14.
  13. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2020-12-14.
  14. "Bioinformatics Toolkit". toolkit.tuebingen.mpg.de. Retrieved 2020-12-14.
  15. Roy A, Kucukural A, Zhang Y (April 2010). "I-TASSER: a unified platform for automated protein structure and function prediction". Nature Protocols. 5 (4): 725–38. doi:10.1038/nprot.2010.5. PMC 2849174. PMID 20360767.
  16. Yang J, Zhang Y (July 2015). "I-TASSER server: new development for protein structure and function predictions". Nucleic Acids Research. 43 (W1): W174-81. doi:10.1093/nar/gkv342. PMC 4489253. PMID 25883148.
  17. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (January 2015). "The I-TASSER Suite: protein structure and function prediction". Nature Methods. 12 (1): 7–8. doi:10.1038/nmeth.3213. PMC 4428668. PMID 25549265.
  18. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-12-14.
  19. "AceView: Gene:C6orf136, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2020-12-15.
  20. "miRDB - MicroRNA Target Prediction Database". www.mirdb.org. Retrieved 2020-12-15.
  21. "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2020-12-15.
  22. Cummins JM, He Y, Leary RJ, Pagliarini R, Diaz LA, Sjoblom T, et al. (March 2006). "The colorectal microRNAome". Proceedings of the National Academy of Sciences of the United States of America. 103 (10): 3687–92. Bibcode:2006PNAS..103.3687C. doi:10.1073/pnas.0511155103. PMC 1450142. PMID 16505370.
  23. "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-12-15.
  24. "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.cn. Retrieved 2020-12-15.
  25. "GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs". sumosp.biocuckoo.org. Retrieved 2020-12-15.
  26. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2020-10-23.
  27. "CSNK2B Gene - GeneCards | CSK2B Protein | CSK2B Antibody". www.genecards.org. Retrieved 2020-12-15.
  28. "PLK1 Gene - GeneCards | PLK1 Protein | PLK1 Antibody". www.genecards.org. Retrieved 2020-12-15.
  29. "RBM8A Gene - GeneCards | RBM8A Protein | RBM8A Antibody". www.genecards.org. Retrieved 2020-12-15.
  30. "KIF21A Gene - GeneCards | KI21A Protein | KI21A Antibody". www.genecards.org. Retrieved 2020-12-15.
  31. "FBXW7 Gene - GeneCards | FBXW7 Protein | FBXW7 Antibody". www.genecards.org. Retrieved 2020-12-15.