C5orf46

C5orf46
Identifiers
Aliases: C5orf46, SSSP1, chromosome 5 open reading frame 46
External IDs: MGI: 2684940; HomoloGene: 19192; GeneCards: C5orf46
Gene location (Human)
Chromosome: 5 (human) [1]
Band: 5q32
Start: 147,880,726 bp [1]
End: 147,906,538 bp [1]
Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_206966 | NM_001033280
RefSeq (protein) | NP_996849 | NP_001028452
Location (UCSC) | Chr 5: 147.88 – 147.91 Mb | Chr 18: 43.78 – 43.79 Mb
PubMed search | [3] | [4]

C5orf46 is a protein-coding gene located on chromosome 5 in humans. It is also known as SSSP1, or skin and saliva secreted protein 1. Two isoforms are known in humans; isoform 2, the longer of the two, is the one analyzed throughout this page. The encoded protein is predicted to have one transmembrane domain, a molecular weight of 9,692 Da, and a basal isoelectric point of 4.67. [5]

Gene

Found on the minus strand of chromosome 5, C5orf46 isoform X2 is 4,679 nucleotides in length and has 4 exons.

Evolution and orthologs

C5orf46 orthologs are found only in Chordata. The most distantly related ortholog identified is in the platypus (Ornithorhynchus anatinus), whose lineage diverged from that of humans around 177 million years ago. [6] Highly conserved regions include the signal peptide sequence near the N-terminus of the protein. No paralogs are found in humans.

Table 1: C5orf46 Orthologs
Genus/Species | Organism Common Name | Order | Accession Number | Length (amino acids) | Sequence Identity with Human (%) | Sequence Similarity with Human (%)
--- | --- | --- | --- | --- | --- | ---
Homo sapiens | Human | Primates | XP_005268503.2 | 102 | 100 | 100
Pan paniscus | Bonobo | Primates | XP_003829110.1 | 87 | 98 | 98
Octodon degus | Common degu | Rodentia | XP_004631400.1 | 73 | 68 | 82
Mus musculus | House mouse | Rodentia | NP_001028452.1 | 93 | 68 | 77
Urocitellus parryii | Arctic ground squirrel | Rodentia | XP_026240295.1 | 92 | 59 | 72
Leptonychotes weddellii | Weddell seal | Carnivora | XP_006732700.1 | 124 | 78 | 94
Acinonyx jubatus | Cheetah | Carnivora | XP_026897868.1 | 147 | 76 | 87
Zalophus californianus | California sea lion | Carnivora | XP_027462230.1 | 84 | 75 | 86
Sorex araneus | Common shrew | Eulipotyphla | XP_004618219.1 | 73 | 74 | 79
Vicugna pacos | Alpaca | Artiodactyla | XP_006204536.1 | 88 | 73 | 81
Delphinapterus leucas | Beluga whale | Artiodactyla | XP_030618631.1 | 88 | 73 | 81
Camelus bactrianus | Bactrian camel | Artiodactyla | XP_010954370.1 | 83 | 73 | 79
Orcinus orca | Killer whale | Artiodactyla | XP_004280428.1 | 88 | 71 | 83
Sus scrofa | Wild boar | Artiodactyla | XP_003354397.2 | 90 | 67 | 82
Manis javanica | Sunda pangolin | Pholidota | XP_017496222.1 | 155 | 62 | 79
Myotis davidii | Bat | Chiroptera | XP_015428095.1 | 104 | 41 | 50
Elephantulus edwardii | Cape elephant shrew | Macroscelidea | XP_006893786.1 | 76 | 78 | 80
Dasypus novemcinctus | Nine-banded armadillo | Cingulata | XP_004447160.1 | 88 | 72 | 82
Vombatus ursinus | Common wombat | Diprotodontia | XP_027697780.1 | 78 | 45 | 62
Phascolarctos cinereus | Koala | Diprotodontia | XP_020854530.1 | 78 | 44 | 62
Ornithorhynchus anatinus | Platypus | Monotremata | XP_028912384.1 | 80 | 43 | 61
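
The page does not state exactly how the identity and similarity values in Table 1 were computed. As a rough illustration of the two measures, the Python sketch below scores a pair of already-aligned sequences. The residue groupings, the choice to skip gap columns, and the toy sequences are all assumptions made for this example; alignment tools normally derive similarity from a substitution matrix, and the sequences shown are not C5orf46.

```python
# Residue groups treated as "similar" in this sketch. This grouping is an
# assumption for illustration; real tools use a substitution matrix instead.
SIMILAR_GROUPS = ["AVLIM", "FYW", "KRH", "DE", "STNQ"]


def _is_similar(a, b):
    """True if both residues fall in the same illustrative group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns containing a gap ('-') are skipped, so percentages are taken over
    the compared columns only (one of several possible conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif _is_similar(a, b):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100 * identical / compared, 100 * similar / compared


# Toy aligned pair (hypothetical, not taken from any C5orf46 ortholog).
toy_identity, toy_similarity = identity_and_similarity("MKDE-LLSA", "MRDEALISG")
print(toy_identity, toy_similarity)
```

Similarity is always at least as high as identity under this definition, which matches the pattern seen in every row of Table 1.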

Promoters

A Genomatix ElDorado promoter database search predicted one promoter for C5orf46, with promoter ID GXP_123762 and transcript ID GXT_22785522. The promoter is located on the minus strand of chromosome 5 and is predicted to span nucleotides 147,906,451 to 147,908,007, making it 1,557 nucleotides in length.
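
The promoter length follows from treating both coordinates as inclusive. A minimal Python sketch of that arithmetic, using only coordinates quoted on this page, is given here; the function names are illustrative and not part of any Genomatix or Ensembl interface, and the overlap calculation assumes the promoter and gene coordinates refer to the same genome assembly.

```python
def span_length(start, end):
    """Length of a genomic interval whose start and end are both inclusive."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def overlap_length(a, b):
    """Number of bases shared by two inclusive intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Coordinates as quoted on this page.
PROMOTER = (147_906_451, 147_908_007)  # predicted promoter GXP_123762
GENE = (147_880_726, 147_906_538)      # C5orf46 gene start and end

promoter_len = span_length(*PROMOTER)
gene_len = span_length(*GENE)
shared = overlap_length(PROMOTER, GENE)
print(promoter_len, gene_len, shared)
```

Because the gene lies on the minus strand, transcription begins at the higher coordinate, so a promoter that starts just inside the gene's upper boundary and extends beyond it is the expected arrangement.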

Transcription factors

A total of 428 transcription factor binding sites were predicted within the predicted promoter sequence. [7]

Expression

C5orf46 is expressed primarily in salivary glands and skin tissue, though some expression is also observed in heart tissue, testis, and placenta. [8]

C5orf46 expression microarray data comparisons between healthy and psoriasis patients.

Microarray data measuring C5orf46 expression in psoriasis patients showed reduced expression in lesional psoriasis: samples from lesional psoriasis patients had significantly lower C5orf46 expression than samples from non-lesional psoriasis patients and healthy controls. [9]

Protein

Primary structure

The C5orf46 protein is 102 amino acids in length. It has a signal peptide sequence at its N-terminus, which is highly conserved in orthologs. The amino acid sequence also contains a repeated DDKPD motif and an aspartate- and lysine-rich region.
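
Features such as a repeated motif and a residue-rich region can be checked directly against a protein sequence. The Python sketch below shows one way to do so. The sequence used is a short hypothetical string made up for this example, not the C5orf46 sequence, and the helper names are illustrative.

```python
def find_motif(sequence, motif):
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = sequence.find(motif, start + 1)
    return positions


def residue_fraction(sequence, residues):
    """Fraction of the sequence made up of the given residues."""
    if not sequence:
        return 0.0
    return sum(sequence.count(r) for r in residues) / len(sequence)


# Hypothetical toy sequence containing the DDKPD motif twice.
TOY_SEQUENCE = "MASLDDKPDKEDDKPDSGK"

motif_positions = find_motif(TOY_SEQUENCE, "DDKPD")
dk_fraction = residue_fraction(TOY_SEQUENCE, "DK")
print(motif_positions, round(dk_fraction, 2))
```

Applied to the real isoform sequence (for example, one retrieved under the accession listed in Table 1), the same functions would report where the DDKPD repeats fall and how enriched the protein is in aspartate and lysine.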

Secondary structure

Secondary structure prediction software, including the Chou and Fasman Secondary Structure Prediction server and the PRABI GOR IV prediction analysis, predicted two alpha-helical segments. [10] [11]

Tertiary structure

Predictive models generated by Phyre2 and SWISS-MODEL show two alpha-helical domains with a bend between them. [12] [13]

Protein regulation

C5orf46 has multiple predicted post-translational modification sites and one modification identified through mass spectrometry. Mass spectrometry analysis of extracts from an NCI-H2228 lung cancer cell line has identified an acetylation site at K42. [14] C5orf46 has predicted phosphorylation sites at T14, S52, S84, and S86. [15] Predicted sumoylation sites are present at K41, K44, K48, K54, and K57. [16] There are two predicted O-GlcNAcylation sites, at S100 and S101. [17]
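
Site labels such as K42 combine a one-letter residue code with a position. As a simple consistency check, the Python sketch below parses the sites listed above and confirms that each residue type suits its modification and that each position fits within the 102-residue protein. The table of residues accepted per modification is a simplifying assumption for this example, and the function names are illustrative.

```python
import re

# Modification sites as listed in the paragraph above.
SITES = {
    "acetylation": ["K42"],
    "phosphorylation": ["T14", "S52", "S84", "S86"],
    "sumoylation": ["K41", "K44", "K48", "K54", "K57"],
    "O-GlcNAcylation": ["S100", "S101"],
}

# Residues each modification is usually attached to (simplified assumption).
ALLOWED_RESIDUES = {
    "acetylation": set("K"),
    "phosphorylation": set("STY"),
    "sumoylation": set("K"),
    "O-GlcNAcylation": set("ST"),
}

PROTEIN_LENGTH = 102  # amino acids, as stated for the isoform on this page


def parse_site(label):
    """Split a label such as 'K42' into ('K', 42)."""
    match = re.fullmatch(r"([A-Z])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized site label: {label!r}")
    return match.group(1), int(match.group(2))


def check_sites(sites, allowed, length):
    """Return a list of human-readable problems; empty if all sites pass."""
    problems = []
    for modification, labels in sites.items():
        for label in labels:
            residue, position = parse_site(label)
            if residue not in allowed[modification]:
                problems.append(f"{label}: unexpected residue for {modification}")
            if not 1 <= position <= length:
                problems.append(f"{label}: position outside 1..{length}")
    return problems


problems = check_sites(SITES, ALLOWED_RESIDUES, PROTEIN_LENGTH)
print(problems or "all listed sites are consistent")
```

This check only validates the labels against each other and the stated protein length; confirming that the named residue actually occupies each position would require the protein sequence itself.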

Localization

An analysis of the C5orf46 amino acid sequence indicated that the protein is likely to be secreted. [18] Further sequence analyses predicted that the protein has one transmembrane domain, with an intracellular N-terminal domain. [19] [20]

Interactions

C5orf46 has been reported to interact with phosphopantothenoylcysteine synthetase (PPCS) and transmembrane BAX inhibitor motif containing 6 (TMBIM6), based on affinity purification-mass spectrometry methods. [21]

Clinical significance

C5orf46 has been reported as a prognostic marker in renal and cervical cancer, with high expression linked to unfavorable outcomes. These conclusions were based on gene expression analyses from the pathology section of the Human Protein Atlas and on survival outcomes of 651 and 291 patients with renal and cervical cancer, respectively. [22] In these analyses, patients classified as having high C5orf46 expression had a 50% lower survival rate after 10 years than patients with low expression.
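
Survival comparisons of this kind are commonly summarized with a Kaplan-Meier estimate, although this page does not describe the exact method used. The Python sketch below is a minimal version of that estimator, given purely as an illustration. The follow-up times and outcomes are invented toy values, not Human Protein Atlas data, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with a death.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Toy cohort (hypothetical): times in years, 1 = death, 0 = censored.
toy_curve = kaplan_meier([2, 4, 4, 6, 10], [1, 1, 0, 1, 0])
for year, prob in toy_curve:
    print(year, round(prob, 2))
```

Running the estimator separately on high-expression and low-expression groups and comparing the curves at the 10-year mark is one way a difference in survival rate could be read off.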

References

  1. GRCh38: Ensembl release 89: ENSG00000178776 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000071858 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C5orf46 (human)". www.phosphosite.org. Retrieved 2020-03-02.
  6. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2020-03-02.
  7. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-05-03.
  8. "C5orf46 chromosome 5 open reading frame 46 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  9. "GDS4602 / 1554195_a_at". www.ncbi.nlm.nih.gov. Retrieved 2020-05-03.
  10. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-05-03.
  11. "NPS@ : GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2020-05-03.
  12. "PHYRE Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2020-05-03.
  13. "SWISS-MODEL". swissmodel.expasy.org. Retrieved 2020-05-03.
  14. "Lys42". www.phosphosite.org. Retrieved 2020-05-03.
  15. "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2020-05-03.
  16. "SUMOplot™ Analysis Program | Abcepta". www.abcepta.com. Retrieved 2020-05-03.
  17. "YinOYang 1.2 Server". www.cbs.dtu.dk. Retrieved 2020-05-03.
  18. "Welcome to psort.org!!". www.psort.org. Retrieved 2020-05-03.
  19. "TMHMM Server, v. 2.0". www.cbs.dtu.dk. Retrieved 2020-05-03.
  20. "Nagahama Institute of Bio-Science and Technology: on-campus access". ripple.nagahama-i-bio.ac.jp. Retrieved 2020-05-03.
  21. "C5orf46 (UNQ472/PRO839) Result Summary | BioGRID". thebiogrid.org. Retrieved 2020-05-03.
  22. "Expression of C5orf46 in cancer - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-03-02.