C1orf159 | |
---|---|
Aliases | C1orf159, chromosome 1 open reading frame 159 |
External IDs | MGI: 2444364; HomoloGene: 51678; GeneCards: C1orf159 |
C1orf159 is a protein that in humans is encoded by the C1orf159 gene, located on chromosome 1. [5] [6] The gene has been reported as an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer. [7]
The Homo sapiens C1orf159 gene (UniProt ID: Q96HA4) is located on the short arm of chromosome 1 at locus 1p36.33. [6] The gene is 34,247 base pairs in length and spans positions 1,081,818 to 1,116,089 of chromosome 1 on the reverse strand. [8]
The longest variant of the human C1orf159 gene encodes an mRNA that is 2,432 nucleotides in length with 12 exons. [9] A promoter region of 762 nucleotides was predicted using the UCSC Genome Browser; [10] it comprises 434 nucleotides upstream of the transcriptional start site, exon 1, and a 298-nucleotide region of intron 1. [citation needed]
Alternative splicing of the gene creates 5 protein isoforms. [5] The longest isoform is 380 amino acids in length with a molecular mass of 40.382 kDa. [5]
Isoform | UniProt [11] ID | Length (aa) |
---|---|---|
1 | Q96HA4-1 | 380 |
2 | Q96HA4-2 | 185 |
3 | Q96HA4-3 | 189 |
4 | Q96HA4-4 | 198 |
5 | Q96HA4-5 | 254 |
The C1orf159 protein is rich in proline and arginine and poor in lysine and glutamic acid. The isoelectric point (pI) of the human C1orf159 protein is 10.07, [12] which is more basic than the average pI of proteins in the human proteome, 7.36. [13]
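A composition claim of this kind can be checked by counting residues in a protein sequence. The following is a minimal sketch, not the method used by the cited sources; the function name `composition` and the toy sequence are hypothetical, and the toy sequence is not the C1orf159 sequence.

```python
from collections import Counter

def composition(seq, residues="PRKE"):
    """Return the fraction of each listed residue (one-letter codes) in seq."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq.upper())
    return {aa: counts[aa] / len(seq) for aa in residues}

# Hypothetical toy sequence for illustration only (not C1orf159).
toy = "MPRPRAPLKE"
for aa, fraction in composition(toy).items():
    print(f"{aa}: {fraction:.0%}")
```

Comparing these fractions against proteome-wide averages would indicate whether a protein is enriched or depleted in a given residue; the reference averages are not given in this article and would have to come from an external source.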
The human C1orf159 protein contains a domain of unknown function, DUF4501. [5] Although the exact function of the domain is not clear, proteins containing it are thought to be single-pass membrane proteins with highly conserved cysteine residues. [citation needed]
The protein also contains a transmembrane domain at positions 144-169 [11] and a signal peptide at positions 1-18. [9] [11]
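The annotated positions divide the protein into distinct regions. The sketch below lays them out, assuming the coordinates refer to the 380-amino-acid isoform 1; the helper `segment_protein` is hypothetical, and regions without an annotation in this article are labeled "unannotated" rather than assigned a side of the membrane.

```python
def segment_protein(length, features):
    """Split residues 1..length into the given features and the gaps between them.

    features: list of (name, start, end) tuples, 1-based inclusive, non-overlapping.
    """
    segments = []
    cursor = 1
    for name, start, end in sorted(features, key=lambda f: f[1]):
        if start > cursor:
            segments.append(("unannotated", cursor, start - 1))
        segments.append((name, start, end))
        cursor = end + 1
    if cursor <= length:
        segments.append(("unannotated", cursor, length))
    return segments

# Positions taken from the text; the 380 aa length assumes isoform 1.
features = [("signal peptide", 1, 18), ("transmembrane", 144, 169)]
for name, start, end in segment_protein(380, features):
    print(f"{name:15s} {start:>3}-{end:<3} ({end - start + 1} aa)")
```

Under that assumption, the signal peptide covers 18 residues and the transmembrane domain 26, leaving a 125-residue stretch between them and 211 residues after the transmembrane domain.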
AlphaFold predicts the structure of the human C1orf159 protein to be composed mainly of alpha-helices. [14]
The predicted post-translational modifications of the C1orf159 protein include N-linked glycosylation on asparagine at positions 104, 111, and 128. [5] [15]
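Candidate N-linked glycosylation sites are commonly located by scanning for the N-X-S/T sequon, where X is any residue except proline. The sketch below illustrates that scan; it is not necessarily the method used by the cited prediction tools, and both the function name `find_sequons` and the toy sequence are hypothetical (the toy sequence is not the C1orf159 sequence).

```python
import re

# Asparagine followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches zero-width after N, so overlapping sequons are found.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(seq):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]

# Hypothetical toy sequence for illustration only (not C1orf159).
toy = "MANGSAPNPTQNLTK"
print(find_sequons(toy))  # [3, 12]; the N at position 8 is skipped because X is proline
```

A sequon match only marks a candidate site; whether the asparagine is actually glycosylated depends on factors such as membrane topology and accessibility.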
Orthologs of human C1orf159 are found in vertebrates, including mammals, birds, reptiles, amphibians, and fish. [16] The most distantly related group with an ortholog is the cartilaginous fish, which diverged from the human lineage approximately 450 million years ago. [17] Orthologs are not found in jawless fish or invertebrates. [16]
Compared with cytochrome c and fibrinogen alpha, the C1orf159 protein evolves relatively fast: its rate of change is similar to that of the fast-evolving fibrinogen alpha protein. [citation needed]
According to the Human Protein Atlas, C1orf159 is an unfavorable prognostic marker for renal and liver cancer and a favorable prognostic marker for urothelial cancer: high expression of C1orf159 is associated with lower survival probability in patients with renal or liver cancer and with higher survival probability in patients with urothelial cancer. [7]