C6orf58

Identifiers
Aliases: C6orf58, LEG1, chromosome 6 open reading frame 58
External IDs: HomoloGene: 134042; GeneCards: C6orf58

Orthologs (human / mouse)
RefSeq (mRNA): NM_001010905 / n/a
RefSeq (protein): NP_001010905 / n/a
Location (UCSC): Chr 6: 127.52 – 127.59 Mb / n/a
PubMed search: [2] / n/a

C6orf58 is a human gene located at locus 6q22.33 of chromosome 6. It encodes UPF0762, a protein that is secreted after cleavage of a signal peptide. [3] DUF781, the only identifiable domain in UPF0762, has been tied to liver development through an orthologous protein in zebrafish. [4] The function of human UPF0762 is not yet well characterized. [5]


Gene and mRNA

| Property | Value |
|---|---|
| Genomic DNA length (base pairs) | 14644 [3] |
| Exons | 6 [3] |
| Mature mRNA length (base pairs) | 1200 [3] |
| Splice variants | 3 [5] |
| Signal peptide CDS (base pairs) | 13-72 [3] |
| Mature peptide CDS (base pairs) | 73-1002 [3] |
| 5'-UTR (base pairs) | 1-12 [3] |
| 3'-UTR (base pairs) | 1003-1200 [3] |
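
As a quick consistency check on the coordinates in the table above, the following minimal Python sketch computes the length of each mRNA feature and confirms that the features tile the 1200 bp mature mRNA without gaps. The coordinates are assumed to be 1-based and inclusive, which the table does not state explicitly; the variable and function names are illustrative only.

```python
# Feature coordinates on the mature mRNA, copied from the table above.
# Assumption: positions are 1-based and inclusive.
FEATURES = {
    "5'-UTR": (1, 12),
    "signal peptide CDS": (13, 72),
    "mature peptide CDS": (73, 1002),
    "3'-UTR": (1003, 1200),
}
MRNA_LENGTH = 1200  # mature mRNA length in base pairs, from the table


def feature_length(start, end):
    """Length in base pairs of a 1-based, inclusive interval."""
    return end - start + 1


# Length of every feature, keyed by feature name.
lengths = {name: feature_length(s, e) for name, (s, e) in FEATURES.items()}

# The four features should account for every base of the mature mRNA.
total = sum(lengths.values())

# Three nucleotides per codon gives the signal peptide length in residues.
signal_peptide_codons = lengths["signal peptide CDS"] // 3

for name, n in lengths.items():
    print(f"{name}: {n} bp")
print(f"sum of features: {total} bp (mRNA length: {MRNA_LENGTH} bp)")
print(f"signal peptide: {signal_peptide_codons} codons")
```

Under these assumptions the 60 bp signal peptide CDS corresponds to 20 codons, in agreement with the 20-residue signal peptide reported in the protein properties.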

Expression

Although C6orf58 has 3 splice variants, only one is annotated by AceView as encoding a good-quality protein. [5] In humans, C6orf58 expressed sequence tags were primarily detected in the larynx and trachea. [6] Transcripts were detected only in the adult stage of development. [6] Experimental microarray data, however, reveal additional sites of C6orf58 expression, namely the salivary gland, thyroid, and small intestine. [7] Arsenic exposure may also regulate expression, as it has been reported to increase methylation of the C6orf58 promoter. [8]

A microarray experiment of various tissues shows C6orf58 expression to be limited.

Gene neighborhood

Genes within 500 kilobases of C6orf58 include RSPO3, C6orf174, KIAA0408, RPL17P23, ECHDC1, RPL5P18, YWHAZP4, LOC100420743, LOC100421513, MRPS17P5, and THEMIS.

Homology

A selected set of homologous sequences is listed below, with sequence identity calculated relative to the human reference sequence.

| Species | Common name | Accession number | Sequence length (base pairs) | Sequence identity |
|---|---|---|---|---|
| Nomascus leucogenys | Northern white-cheeked gibbon | XM_003255689.1 | 1190 | 0.97 |
| Macaca mulatta | Rhesus monkey | NM_001194318.1 | 1190 | 0.95 |
| Oryctolagus cuniculus | European rabbit | XM_002714721.1 | 1014 | 0.79 |
| Loxodonta africana | African bush elephant | XM_003404026.1 | 1020 | 0.78 |
| Cavia porcellus | Guinea pig | XM_003468475.1 | 1017 | 0.76 |
| Equus caballus | Horse | XM_001917090.1 | 990 | 0.77 |

Protein

Properties

| Property | Value |
|---|---|
| Length (amino acids) | 330 [3] |
| Signal peptide length (amino acids) | 20 [3] |
| Molecular weight of precursor protein (predicted) | 37.9 kDa [9] |
| Molecular weight of signal peptide (predicted) | 2.1 kDa [9] |
| Molecular weight of mature peptide (predicted) | 35.8 kDa [9] |
| Molecular weight (observed) | 32 kDa [10] |
| Isoelectric point (predicted) | 5.78 [9] |
| N-linked glycosylation site | Amino acid 69 |

Mass spectrometry has shown that the observed molecular weight of UPF0762 is 32 kDa. [10] It remains unclear why the observed molecular weight is lower than predicted, even after accounting for cleavage of the signal peptide. Attachment of a sugar at the N-linked glycosylation site would be expected to increase the molecular weight rather than decrease it, so glycosylation does not explain the discrepancy.
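
The size of the discrepancy can be made explicit with a little arithmetic. The minimal Python sketch below uses only the values from the properties table; the variable names are illustrative, and the calculation deliberately ignores glycosylation, which would add mass rather than remove it.

```python
# Molecular weights in kDa, taken from the properties table above.
PRECURSOR_KDA = 37.9       # predicted, full-length precursor
SIGNAL_PEPTIDE_KDA = 2.1   # predicted, 20-residue signal peptide
OBSERVED_KDA = 32.0        # observed by mass spectrometry

# Removing the signal peptide gives the predicted mature peptide mass.
predicted_mature = round(PRECURSOR_KDA - SIGNAL_PEPTIDE_KDA, 1)

# Mass that signal peptide cleavage alone does not account for.
unexplained = round(predicted_mature - OBSERVED_KDA, 1)

print(f"predicted mature peptide: {predicted_mature} kDa")
print(f"observed: {OBSERVED_KDA} kDa")
print(f"unexplained difference: {unexplained} kDa")
```

The predicted mature mass matches the 35.8 kDa given in the table, leaving roughly 3.8 kDa unaccounted for.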

Homology

UPF0762 is highly conserved in primates, and orthologous proteins can be traced back as far as Trichoplax adhaerens. The proteins listed below are a selection, not a comprehensive list, of UPF0762 orthologs. Sequence identity and similarity were determined using BLAST [11] with the reference human sequence as the query.

| Species | Common name | Accession number | Sequence length (amino acids) | Sequence identity (%) | Sequence similarity (%) |
|---|---|---|---|---|---|
| Pan troglodytes | Chimpanzee | XP_518733.2 | 330 | 100 | 100 |
| Pongo abelii | Sumatran orangutan | XP_002817388.1 | 330 | 98 | 99 |
| Callithrix jacchus | Marmoset | XP_002746989.1 | 330 | 87 | 93 |
| Canis lupus | Gray wolf | XP_851589.1 | 310 | 70 | 82 |
| Taeniopygia guttata | Zebra finch | XP_002190886.1 | 364 | 43 | 63 |
| Gallus gallus | Red junglefowl | XP_419749.3 | 371 | 42 | 60 |
| Xenopus tropicalis | Western clawed frog | XP_002940437.1 | 178 | 29 | 51 |
| Trichoplax adhaerens | N/A | XP_002111384 | 381 | 34 | 49 |
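
To illustrate what a sequence identity figure measures, the sketch below computes percent identity from two sequences that have already been aligned, counting identical residues over the number of alignment columns. This is a simplified illustration, not the BLAST procedure used for the values above: the function name and the two short aligned strings are hypothetical and are not fragments of UPF0762 or its orthologs.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue.

    Both inputs must be the same length, with '-' marking gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not real UPF0762 sequence).
query = "MKTAYLL-GA"
subject = "MKSAYLLWGA"
identity = percent_identity(query, subject)
print(f"identity: {identity:.1f}%")
```

Sequence similarity is computed in the same way, except that conservative substitutions scored positively by a substitution matrix also count as matches, which is why the similarity values are never lower than the identity values.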

Conserved domains

DUF781 is the only domain of the protein and spans 318 of its 330 amino acids. DUF781 has been linked to liver development in zebrafish. [4]

Post-translational modifications

Observed post-translational modifications include N-linked glycosylation at amino acid 69. [12] A signal peptide comprising the first 20 amino acids of the peptide sequence is cleaved from the protein; [3] it is predicted to direct the protein to the endoplasmic reticulum for secretion. [13] The missense mutation S18F, detected in hepatocellular carcinoma, [14] significantly reduces the predicted cleavage score of the signal peptide. [15]
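
The notation S18F means that the serine at position 18 is replaced by phenylalanine, which places the change inside the 20-residue signal peptide. The minimal Python sketch below shows how such a substitution can be applied to a protein sequence before it is submitted to a predictor such as SignalP. The function name and the 25-residue toy sequence are hypothetical; the toy sequence is not the real UPF0762 sequence, and the code does not itself predict cleavage.

```python
import re


def apply_missense(sequence, mutation):
    """Apply a missense mutation such as 'S18F' (1-based position) to a protein sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    ref, pos, alt = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= pos <= len(sequence):
        raise ValueError(f"position {pos} is outside the sequence")
    if sequence[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {sequence[pos - 1]}")
    return sequence[:pos - 1] + alt + sequence[pos:]


# Hypothetical toy precursor, NOT the real UPF0762 sequence:
# 20 signal-peptide-like residues with a serine at position 18, then 5 more residues.
TOY_PRECURSOR = "MK" + "L" * 12 + "AAA" + "S" + "AA" + "QDEEK"
SIGNAL_PEPTIDE_LENGTH = 20  # from the properties table

mutant = apply_missense(TOY_PRECURSOR, "S18F")

# The substitution falls within the first 20 residues, i.e. the signal peptide.
print("reference signal peptide:", TOY_PRECURSOR[:SIGNAL_PEPTIDE_LENGTH])
print("mutant signal peptide:   ", mutant[:SIGNAL_PEPTIDE_LENGTH])
```

Checking the reference residue before substituting guards against off-by-one errors in position numbering, a common source of mistakes when mutations are reported against different reference sequences.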

A graphical representation of UPF0762 showing various post-translational modifications.

Interactions

In a yeast two-hybrid screen, human C6orf58 was reported to interact with the ribonucleotide reductase encoded by vaccinia virus. [16]

Pathology

A lasso-penalized Cox regression analysis of transcriptome data found C6orf58 to be associated with pancreatic cancer survival time. [17] In addition, a missense mutation at amino acid 18, in which serine becomes phenylalanine, has been observed in liver cancer cells. [14] Signal peptide analysis of the mutated protein sequence predicts that cleavage at the usual site at amino acid 20 is lost. [15] Whether DUF781's association with liver development is related to the missense mutation's association with liver cancer remains to be investigated.

A SignalP analysis comparing the reference sequence with a sequence carrying the S18F mutation showed a significant drop in predicted cleavage of the signal peptide.


References

  1. GRCh38: Ensembl release 89: ENSG00000184530. Ensembl, May 2017.
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "Homo sapiens chromosome 6 open reading frame 58 (C6orf58), mRNA". National Center for Biotechnology Information. Retrieved 26 April 2012.
  4. Chang C, Hu M, Zhu Z, Lo LJ, Chen J, Peng J (2011). "liver-enriched gene 1a and 1b encode novel secretory proteins essential for normal liver development in zebrafish". PLOS ONE. 6 (8): e22910. Bibcode:2011PLoSO...622910C. doi:10.1371/journal.pone.0022910. PMC 3153479. PMID 21857963.
  5. Thierry-Mieg, Danielle. "AceView: integrative annotation of cDNA-supported genes in human, mouse, rat, worm and Arabidopsis". NCBI. Retrieved 30 April 2012.
  6. "EST Profile Hs.226268". NCBI. Retrieved 30 April 2012.
  7. Dezso Z, Nikolsky Y, Sviridov E, Shi W, Serebriyskaya T, Dosymbekov D, Bugrim A, Rakhmatulin E, Brennan RJ, Guryanov A, Li K, Blake J, Samaha RR, Nikolskaya T (2008). "A comprehensive functional analysis of tissue specificity of human gene expression". BMC Biol. 6: 49. doi:10.1186/1741-7007-6-49. PMC 2645369. PMID 19014478.
  8. Smeester L, Rager JE, Bailey KA, Guan X, Smith N, García-Vargas G, Del Razo LM, Drobná Z, Kelkar H, Stýblo M, Fry RC (2011). "Epigenetic changes in individuals with arsenicosis". Chem. Res. Toxicol. 24 (2): 165–7. doi:10.1021/tx1004419. PMC 3042796. PMID 21291286.
  9. Wilkins MR, Gasteiger E, Bairoch A, Sanchez JC, Williams KL, Appel RD, Hochstrasser DF (1999). "Protein identification and analysis tools in the ExPASy server". 2-D Proteome Analysis Protocols. Methods Mol. Biol. Vol. 112. pp. 531–52. doi:10.1385/1-59259-584-7:531. ISBN 1-59259-584-7. PMID 10027275. Retrieved 30 April 2012.
  10. Mangum JE, Crombie FA, Kilpatrick N, Manton DJ, Hubbard MJ (October 2010). "Surface integrity governs the proteome of hypomineralized enamel". J. Dent. Res. 89 (10): 1160–5. doi:10.1177/0022034510375824. PMID 20651090. S2CID 21703818.
  11. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990). "Basic local alignment search tool". J. Mol. Biol. 215 (3): 403–10. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712. S2CID 14441902.
  12. Ramachandran P, Boontheung P, Xie Y, Sondej M, Wong DT, Loo JA (June 2006). "Identification of N-linked glycoproteins in human saliva by glycoprotein capture and mass spectrometry". J. Proteome Res. 5 (6): 1493–503. doi:10.1021/pr050492k. PMID 16740002.
  13. Caboche, Michel. "Predotar". Archived from the original on 28 February 2009. Retrieved 7 May 2012.
  14. Li M, Zhao H, Zhang X, Wood LD, Anders RA, Choti MA, Pawlik TM, Daniel HD, Kannangai R, Offerhaus GJ, Velculescu VE, Wang L, Zhou S, Vogelstein B, Hruban RH, Papadopoulos N, Cai J, Torbenson MS, Kinzler KW (2011). "Inactivating mutations of the chromatin remodeling gene ARID2 in hepatocellular carcinoma". Nat. Genet. 43 (9): 828–9. doi:10.1038/ng.903. PMC 3163746. PMID 21822264.
  15. Petersen TN, Brunak S, von Heijne G, Nielsen H (2011). "SignalP 4.0: discriminating signal peptides from transmembrane regions". Nat. Methods. 8 (10): 785–6. doi:10.1038/nmeth.1701. PMID 21959131. S2CID 16509924.
  16. Zhang L, Villa NY, Rahman MM, Smallwood S, Shattuck D, Neff C, Dufford M, Lanchbury JS, Labaer J, McFadden G (2009). "Analysis of vaccinia virus-host protein-protein interactions: validations of yeast two-hybrid screenings". J. Proteome Res. 8 (9): 4311–8. doi:10.1021/pr900491n. PMC 2738428. PMID 19637933.
  17. Wu TT, Gong H, Clarke EM (2011). "A transcriptome analysis by lasso penalized Cox regression for pancreatic cancer survival". J Bioinform Comput Biol. 9 (Suppl 1): 63–73. doi:10.1142/s0219720011005744. PMID 22144254.