C1orf159

Identifiers
Aliases: C1orf159, chromosome 1 open reading frame 159
External IDs: MGI: 2444364; HomoloGene: 51678; GeneCards: C1orf159
Orthologs (human; mouse)
RefSeq (mRNA): human NM_017891, NM_001330306, NM_001363525; mouse NM_145557, NM_177205
RefSeq (protein): human NP_001317235, NP_060361, NP_001350454; mouse n/a
Location (UCSC): human Chr 1: 1.08 – 1.12 Mb; mouse Chr 4: 156.19 – 156.21 Mb
PubMed search: human [3]; mouse [4]

C1orf159 is a protein that in humans is encoded by the C1orf159 gene, located on chromosome 1. [5] [6] The gene is an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer. [7]

Gene

The Homo sapiens C1orf159 gene (UniProt ID: Q96HA4) is located on the short arm of chromosome 1 at locus 1p36.33. [6] The gene is 34,247 base pairs long and spans positions 1,081,818 to 1,116,089 of chromosome 1 on the reverse strand. [8]

Transcript

The longest transcript variant of the human C1orf159 gene is an mRNA of 2,432 nucleotides with 12 exons. [9] A promoter region predicted using the UCSC Genome Browser [10] is 762 nucleotides long and comprises 434 nucleotides upstream of the transcriptional start site, exon 1, and a 298-nucleotide region of intron 1.[citation needed]
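The promoter figures above imply a length for exon 1 that the text does not state directly. The short sketch below works it out, assuming the three parts (upstream region, exon 1, intron 1 portion) are contiguous, do not overlap, and together make up the whole predicted region. The variable names are illustrative only.

```python
# Figures quoted in the text.
PROMOTER_LENGTH = 762     # total predicted promoter region (nt)
UPSTREAM_OF_TSS = 434     # region upstream of the transcriptional start site (nt)
INTRON_1_PORTION = 298    # portion of intron 1 included in the region (nt)

# Assumption: the three parts are contiguous, non-overlapping, and fill the region,
# so whatever is left over must be exon 1.
implied_exon_1_length = PROMOTER_LENGTH - UPSTREAM_OF_TSS - INTRON_1_PORTION

print(implied_exon_1_length)  # 30
```

Under that assumption exon 1 would be 30 nucleotides long; this is an inference from the quoted numbers, not a value reported by the cited sources.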

Protein

Isoforms

Alternative splicing of the gene produces five protein isoforms. [5] The longest isoform is 380 amino acids long and has a molecular mass of 40.382 kDa. [5]

Isoforms of human C1orf159 protein

| Isoform | UniProt ID [11] | Length (aa) |
|---|---|---|
| 1 | Q96HA4-1 | 380 |
| 2 | Q96HA4-2 | 185 |
| 3 | Q96HA4-3 | 189 |
| 4 | Q96HA4-4 | 198 |
| 5 | Q96HA4-5 | 254 |

Composition

The C1orf159 protein is rich in proline and arginine and poor in lysine and glutamic acid. The isoelectric point (pI) of the human C1orf159 protein is 10.07, [12] which is more basic than the average pI of proteins in the human proteome, 7.36. [13]

Domain

The human C1orf159 protein contains a domain of unknown function, DUF4501. [5] Although the exact function of the domain is not clear, proteins carrying it are thought to be single-pass membrane proteins with highly conserved cysteine residues.[citation needed]

The protein also contains a transmembrane domain at positions 144-169 [11] and a signal peptide at positions 1-18. [9] [11]
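The annotated positions divide the protein into a small number of segments. The sketch below lays them out, assuming the coordinates refer to the 380-amino-acid isoform (the text does not say so explicitly). `Region` and `partition` are hypothetical helper names, and the segments between features are labelled only as "unannotated" because the source does not state the membrane orientation.

```python
from typing import NamedTuple


class Region(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


PROTEIN_LENGTH = 380  # longest isoform (aa); assumed reference for the coordinates
SIGNAL_PEPTIDE = Region("signal peptide", 1, 18)
TRANSMEMBRANE = Region("transmembrane domain", 144, 169)


def partition(length, features):
    """Split 1..length into the given features plus the gaps between them."""
    regions = []
    cursor = 1
    for feat in sorted(features, key=lambda r: r.start):
        if feat.start > cursor:
            regions.append(Region("unannotated", cursor, feat.start - 1))
        regions.append(feat)
        cursor = feat.end + 1
    if cursor <= length:
        regions.append(Region("unannotated", cursor, length))
    return regions


regions = partition(PROTEIN_LENGTH, [SIGNAL_PEPTIDE, TRANSMEMBRANE])
for r in regions:
    print(f"{r.name:22s} {r.start:3d}-{r.end:3d} ({r.length} aa)")
```

With these inputs the segment between the signal peptide and the transmembrane domain covers residues 19-143, and the segment after the transmembrane domain covers residues 170-380.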

Structure

AlphaFold predicts that the structure of the human C1orf159 protein is composed mainly of alpha helices. [14]

Post-translational modification

The predicted post-translational modifications of the C1orf159 protein include N-linked glycosylation of asparagine residues at positions 104, 111, and 128. [5] [15]
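The cited databases do not say how the glycosylation sites were predicted. A common first-pass screen for N-linked glycosylation looks for the consensus sequon N-X-S/T, where X is any residue except proline. The sketch below shows that screen on a made-up peptide; the sequence is a hypothetical toy example, not the C1orf159 sequence, and `find_sequons` is an illustrative name.

```python
import re

# N-X-[S/T] sequon, where X is any residue except proline.
# The lookahead keeps the match on the asparagine itself, so overlapping
# sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines that begin an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


# Hypothetical toy peptide, NOT the C1orf159 sequence.
toy_peptide = "MANGSAAANPTLLNKTQ"
print(find_sequons(toy_peptide))  # [3, 14]
```

The asparagine at position 9 of the toy peptide is skipped because it is followed by proline. A sequon match only marks a candidate site; whether the residue is actually glycosylated depends on factors such as which side of the membrane it lies on.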

Homology/evolution

Orthologs

Orthologs of human C1orf159 are found in vertebrates, including mammals, birds, reptiles, amphibians, and fish. [16] The most distantly related group with an ortholog is the cartilaginous fish, which diverged from the human lineage approximately 450 million years ago. [17] Orthologs are not found in jawless fish or invertebrates. [16]

Orthologs of C1orf159 protein

| Species | Group | Taxonomic group | NCBI protein accession number | Protein sequence similarity (% relative to human protein) |
|---|---|---|---|---|
| Human | Mammals | Primates | NP_001317235.1 | 100.0 |
| Chimpanzee | Mammals | Primates | XP_024204744.1 | 98.4 |
| Bonobo | Mammals | Primates | XP_008975653.2 | 88.9 |
| House Mouse | Mammals | Rodentia | NP_796179.1 | 40.9 |
| Cattle | Mammals | Artiodactyla | NP_001026925.1 | 36.6 |
| Sunda Flying Lemur | Mammals | Dermoptera | XP_008567908.1 | 39.4 |
| Chinese Tree Shrew | Mammals | Scandentia | XP_027622332.1 | 35.8 |
| Cougar | Mammals | Carnivora | XP_025768111.1 | 41.7 |
| Chicken | Birds | Galliformes | XP_024998437.2 | 32.8 |
| Rock Pigeon | Birds | Columbiformes | XP_013226562.2 | 35.7 |
| Hooded Crow | Birds | Passeriformes | XP_039420032.1 | 29.9 |
| Golden-collared Manakin | Birds | Passeriformes | XP_017934783.1 | 36.5 |
| Gharial | Reptiles | Crocodilia | XP_019367354.1 | 36.8 |
| Leatherback Sea Turtle | Reptiles | Testudines | XP_027584571.1 | 35.9 |
| Chinese Softshell Turtle | Reptiles | Testudines | XP_006127168.1 | 35.2 |
| Western Clawed Frog | Amphibians | Anura | NP_001039047.1 | 34.6 |
| Two-lined Caecilian | Amphibians | Gymnophiona | XP_029433955.1 | 33.9 |
| Asiatic Toad | Amphibians | Anura | XP_044137731.1 | 31.6 |
| Zebrafish | Fish | Cypriniformes | NP_001313355.1 | 26.4 |
| Sterlet | Fish | Acipenseriformes | XP_034760226.1 | 32.8 |
| Reedfish | Fish | Polypteriformes | XP_028663678.1 | 32.9 |
| Small-spotted Catshark | Fish | Carcharhiniformes | XP_038629468.1 | 28.0 |
| Whale Shark | Fish | Orectolobiformes | XP_020381962.1 | 32.9 |
Figure: Unrooted phylogenetic tree of C1orf159 orthologs generated by Phylogeny.fr.

Evolutionary History

When its rate of evolution is compared with those of cytochrome c and fibrinogen alpha, the C1orf159 protein changes at a rate similar to that of the fast-evolving fibrinogen alpha, indicating that C1orf159 evolves relatively quickly.[citation needed]

Figure: Evolutionary change of the C1orf159 protein compared with that of cytochrome c and fibrinogen alpha. The value m on the vertical axis is defined as the total number of amino acid changes that have occurred in a 100-amino-acid segment of a protein.
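The text does not say how m was calculated. One widely used estimate is the Poisson correction, m = -100 ln(1 - n/100), where n is the observed number of differing residues per 100 aligned positions; it allows for repeated changes at the same site. The sketch below applies it, assuming this is the intended definition. The function name is illustrative, and the correction is defined on percent identity, so feeding it the similarity values from the ortholog table would only be a rough proxy.

```python
import math


def corrected_changes_per_100(percent_identity):
    """Poisson-corrected amino acid changes per 100 residues (m).

    Assumes m = -100 * ln(1 - n/100), where n is the observed percentage
    of differing residues between two aligned sequences.
    """
    n = 100.0 - percent_identity  # observed differences per 100 residues
    if not 0.0 <= n < 100.0:
        raise ValueError("percent identity must be greater than 0 and at most 100")
    return -100.0 * math.log(1.0 - n / 100.0)


# Illustrative inputs only: a close and a distant sequence pair.
for pct in (98.4, 40.9):
    print(f"{pct:5.1f}% identity -> m = {corrected_changes_per_100(pct):.1f}")
```

The corrected value m is never smaller than the observed difference n, and the gap widens as sequences diverge, which is why m rather than raw percent difference is usually plotted against divergence time.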

Clinical Significance

According to the Human Protein Atlas, C1orf159 is an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer: high expression of C1orf159 is associated with lower survival probability in patients with renal or liver cancer and with higher survival probability in patients with urothelial cancer. [7]

References

  1. GRCh38: Ensembl release 89: ENSG00000131591 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000059939 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C1orf159 Gene". GeneCards. Retrieved 2022-06-20.
  6. "Gene symbol report". HUGO Gene Nomenclature Committee. Retrieved 2022-06-20.
  7. "Expression of C1orf159 in cancer - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2022-06-20.
  8. "Gene: C1orf159 (ENSG00000131591) - Summary - Homo_sapiens - Ensembl genome browser 107". uswest.ensembl.org. Retrieved 2022-07-27.
  9. "Homo sapiens chromosome 1 open reading frame 159 (C1orf159), transcript variant 1, mRNA". 2022-04-17.
  10. "UCSC Genome Browser Home". genome.ucsc.edu. Retrieved 2022-07-28.
  11. "UniProt". www.uniprot.org. Retrieved 2022-07-28.
  12. "SIB Swiss Institute of Bioinformatics | Expasy". www.expasy.org. Retrieved 2022-07-28.
  13. Kurotani A, Tokmakov AA, Sato KI, Stefanov VE, Yamada Y, Sakurai T (August 2019). "Localization-specific distributions of protein pI in human proteome are governed by local pH and membrane charge". BMC Molecular and Cell Biology. 20 (1): 36. doi:10.1186/s12860-019-0221-4. PMC 6701068. PMID 31429701.
  14. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 2022-07-28.
  15. "C1orf159 - Proteomics". www.nextprot.org. Retrieved 2022-07-30.
  16. "C1orf159 orthologs". NCBI. Retrieved 2022-07-28.
  17. Redmond AK, Macqueen DJ, Dooley H (November 2018). "Phylotranscriptomics suggests the jawed vertebrate ancestor could generate diverse helper and regulatory T cell subsets". BMC Evolutionary Biology. 18 (1): 169. Bibcode:2018BMCEE..18..169R. doi:10.1186/s12862-018-1290-2. PMC 6238376. PMID 30442091.
  18. "Phylogeny.fr: "One Click" Mode". www.phylogeny.fr. Retrieved 2022-07-28.