C17orf50 | |
---|---|
Aliases | C17orf50, chromosome 17 open reading frame 50, cholesin |
External IDs | MGI: 1913580; HomoloGene: 11949; GeneCards: C17orf50; OMA: C17orf50 - orthologs |
Uncharacterized protein C17orf50 is a protein which in humans is encoded by the C17orf50 gene.
The gene is located on the long arm of chromosome 17, on the forward strand, [5] at position 17q12. C17orf50 spans approximately 4,200 base pairs, from 35,760,897 to 35,765,079. The gene has three exons [7] and, in humans, encodes a protein that is 174 amino acids in length. [6]
The promoter region for C17orf50 is 1,417 base pairs long (Genomatix accession number GXP_123003). [8] The first half of the promoter is poorly conserved, even among primates. [9] [10] [11]
The promoter contains many binding sites for transcription factors that are found in the brain and embryonic tissue, [8] particularly Brn-5 POU domain factor, which has three binding sites within the conserved region of the promoter. This transcription factor is expressed in layer IV of the adult neocortex and reaches its highest levels in the developing brain and spinal cord. [12]
Orthologs of this gene exist in eukaryotes, predominantly in mammals. [9] However, some homologs are present in birds, reptiles, and amphibians. There are no paralogs of this gene. The table below shows a short list of orthologs to trace the evolutionary history of C17orf50.
Species | Accession number | Divergence from humans (MYA) [13] | Identity |
---|---|---|---|
Homo sapiens | NP_660315 | 0 | 100% |
Chlorocebus sabaeus | XP_008009267 | 29.44 | 85% |
Mus musculus | NP_079768.2 | 90 | 68% |
Pteropus vampyrus | XP_011385558 | 171 | 70% |
Chelonia mydas | EMP28888 | 312 | 45% |
Corvus brachyrhynchos | XP_017584321 | 312 | 44% |
Anolis carolinensis | XP_003218353 | 312 | 37% |
Xenopus tropicalis | OCA35560 | 352 | 46% |
The most distant ortholog found diverged from humans approximately 352 million years ago, suggesting that the gene arose around or shortly before that time. Compared with cytochrome c and the fibrinogen alpha chain, C17orf50 is a rapidly evolving protein.
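Evolutionary-rate comparisons of this kind usually set sequence divergence against time since divergence. The following is a minimal sketch of that arithmetic using the identity values from the ortholog table. It assumes a simple Poisson correction, m = -100 ln(identity / 100), which is not necessarily the method behind the comparison with cytochrome c and fibrinogen alpha chain; the function names are hypothetical.

```python
import math

# (species, divergence from humans in MYA, percent identity),
# copied from the ortholog table in this article.
ORTHOLOGS = [
    ("Chlorocebus sabaeus", 29.44, 85),
    ("Mus musculus", 90.0, 68),
    ("Pteropus vampyrus", 171.0, 70),
    ("Chelonia mydas", 312.0, 45),
    ("Corvus brachyrhynchos", 312.0, 44),
    ("Anolis carolinensis", 312.0, 37),
    ("Xenopus tropicalis", 352.0, 46),
]


def corrected_divergence(identity_pct):
    """Poisson-corrected divergence in percent: m = -100 * ln(identity / 100).

    The correction is an assumption of this sketch, not a method
    stated in the article.
    """
    if not 0 < identity_pct <= 100:
        raise ValueError("identity must be in the interval (0, 100]")
    return -100.0 * math.log(identity_pct / 100.0)


def divergence_table(orthologs=ORTHOLOGS):
    """Return (species, MYA, corrected divergence) rows for each ortholog."""
    return [(name, mya, corrected_divergence(ident))
            for name, mya, ident in orthologs]


if __name__ == "__main__":
    for name, mya, m in divergence_table():
        print(f"{name:24s} {mya:7.2f} MYA  m = {m:6.1f}")
```

Plotting m against divergence time for C17orf50 alongside the same values for a slowly evolving and a quickly evolving reference protein gives a visual sense of relative rate; a steeper slope means faster sequence change.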
C17orf50 is expressed at low levels in various tissues, such as lung, prostate, thymus, thyroid, trachea, small intestine, and stomach, and it is most highly expressed in the fetal brain. [14]
The unmodified molecular weight of the C17orf50 protein is 19.3 kilodaltons. The protein has a negatively charged, glutamate-rich cluster from position 21 to 52. [15] There are three nuclear localization signals and no other retention signals, strongly indicating that the protein is localized to the nucleus. [16]
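As a rough consistency check, the reported weight can be compared with an estimate from the protein's length of 174 amino acids. The sketch below assumes a rule-of-thumb average residue mass of about 110 Da, a value that does not come from this article; the function name is hypothetical. An exact figure would require summing the masses of the residues in the actual sequence.

```python
# Assumed rule-of-thumb average mass of one amino acid residue, in daltons.
# This value is not taken from the article.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Estimate the unmodified protein mass in kilodaltons from its length."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * residue_mass_da / 1000.0


if __name__ == "__main__":
    # 174 residues is the protein length given in this article.
    print(f"{estimate_mass_kda(174):.2f} kDa")
```

For 174 residues this gives about 19.1 kDa, in line with the reported 19.3 kDa.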
Characterization of the protein has shown binding to GPR146. Based on a proposed role in regulation of serum cholesterol levels in response to dietary cholesterol intake, the protein has been called cholesin. [17]
Uncharacterized protein C17orf50 contains a domain of unknown function (DUF4673) from position 5 to 172, which makes up the majority of the protein. [7]
Uncharacterized protein C17orf50 contains two potential sumoylation sites at K7 and K12. [18] [19] There are possible threonine and serine glycosylation sites throughout the protein. [20] Potential threonine, serine, and tyrosine phosphorylation sites are also present. [21]
The protein product, also called cholesin, binds to GPR146. [17] Uncharacterized protein C17orf50 also has potential interactions with zinc finger protein 587 (ZNF587), [22] [23] which is expressed throughout fetal tissue, including the brain. [24] ZNF587 is expected to regulate transcription. [25]