C17orf50

C17orf50
Identifiers
Aliases: C17orf50, chromosome 17 open reading frame 50, cholesin
External IDs: MGI: 1913580; HomoloGene: 11949; GeneCards: C17orf50; OMA: C17orf50 - orthologs
Orthologs
Species: Human; Mouse
RefSeq (mRNA): NM_145272 (human); NM_025492 (mouse)
RefSeq (protein): NP_660315 (human); NP_079768 (mouse)
Location (UCSC): Chr 17: 35.76 – 35.77 Mb (human); Chr 11: 83.33 – 83.33 Mb (mouse)
PubMed search: [3] (human); [4] (mouse)

Uncharacterized protein C17orf50 is a protein which in humans is encoded by the C17orf50 gene.

Gene

The gene is located on the forward strand of the long arm of chromosome 17, [5] at position 17q12. C17orf50 spans approximately 4,200 base pairs, from position 35,760,897 to 35,765,079. The gene has three exons [7] and, in humans, encodes a protein 174 amino acids in length. [6]
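The span can be checked from the two coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive; the function name `span_length` is hypothetical and used only for illustration.

```python
def span_length(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text for C17orf50 on chromosome 17.
GENE_START = 35_760_897
GENE_END = 35_765_079

length_bp = span_length(GENE_START, GENE_END)
print(f"{length_bp} bp (~{length_bp / 1000:.1f} kb)")  # prints "4183 bp (~4.2 kb)"
```

Under that assumption the interval is 4,183 base pairs long, which rounds to the figure of about 4,200 given in the text.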

Regulation of transcription

The promoter region of C17orf50 is 1,417 base pairs long and has the Genomatix accession number GXP_123003. [8] The first half of the promoter is poorly conserved, even among primates. [9] [10] [11]

The promoter contains many possible binding sites for transcription factors that are expressed in the brain and in embryonic tissue, [8] notably Brn-5 POU domain factor, which has three binding sites within the conserved region of the promoter. Brn-5 is expressed in layer IV of the adult neocortex and reaches its highest levels in the developing brain and spinal cord. [12]

Annotated promoter sequence of C17orf50 showing possible transcription factor binding sites and conserved regions

Homology/evolution

Orthologs of this gene are found predominantly in mammals, [9] although some are also present in birds, reptiles, and amphibians. The gene has no paralogs. The table below lists a selection of orthologs that trace the evolutionary history of C17orf50.

Species                  Accession number   Divergence from humans (MYA) [13]   Identity
Homo sapiens             NP_660315          0                                   100%
Chlorocebus sabaeus      XP_008009267       29.44                               85%
Mus musculus             NP_079768.2        90                                  68%
Pteropus vampyrus        XP_011385558       171                                 70%
Chelonia mydas           EMP28888           312                                 45%
Corvus brachyrhynchos    XP_017584321       312                                 44%
Anolis carolinensis      XP_003218353       312                                 37%
Xenopus tropicalis       OCA35560           352                                 46%

The most distant ortholog identified diverged from humans approximately 352 million years ago, suggesting that the gene arose at around that time or shortly before. Compared with other proteins, namely cytochrome c and fibrinogen alpha chain, uncharacterized protein C17orf50 is evolving rapidly.
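One common way to put numbers on a statement like this is to convert percent identity into a corrected distance and divide by the divergence time. The sketch below is illustrative only: it assumes a simple Poisson correction (d = -ln of the identity fraction) and a strict molecular clock, neither of which is stated in the source, and the function names are hypothetical. It uses only the values listed in the ortholog table and does not reproduce the comparison with cytochrome c or fibrinogen alpha chain, for which no figures are given here.

```python
import math

# (species, divergence from humans in MYA, percent identity to the human protein),
# taken from the ortholog table in this section.
ORTHOLOGS = [
    ("Chlorocebus sabaeus", 29.44, 85),
    ("Mus musculus", 90, 68),
    ("Pteropus vampyrus", 171, 70),
    ("Chelonia mydas", 312, 45),
    ("Corvus brachyrhynchos", 312, 44),
    ("Anolis carolinensis", 312, 37),
    ("Xenopus tropicalis", 352, 46),
]


def poisson_distance(identity_pct: float) -> float:
    """Poisson-corrected substitutions per site (assumed model, not from the source)."""
    return -math.log(identity_pct / 100.0)


def rate_per_site_per_myr(identity_pct: float, divergence_mya: float) -> float:
    """Substitutions per site per million years along one lineage."""
    # Factor of 2: changes accumulate on both lineages after the split.
    return poisson_distance(identity_pct) / (2.0 * divergence_mya)


print(f"{'Species':<24}{'MYA':>8}{'Ident.':>8}{'d':>8}{'rate':>12}")
for species, mya, identity in ORTHOLOGS:
    d = poisson_distance(identity)
    rate = rate_per_site_per_myr(identity, mya)
    print(f"{species:<24}{mya:>8.2f}{identity:>7}%{d:>8.3f}{rate:>12.2e}")
```

Running the same calculation on alignments of reference proteins over the same species would give the rates needed for a side-by-side comparison.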

Expression

C17orf50 is expressed at low levels in various tissues, such as lung, prostate, thymus, thyroid, trachea, small intestine, and stomach, and it is most highly expressed in the fetal brain. [14]

Protein

The unmodified molecular weight of the C17orf50 protein is 19.3 kilodaltons. The protein has a negative charge cluster, a glutamate-rich region, from position 21 to 52. [15] PSORT II predicts three nuclear localization signals and no other retention signals, suggesting that the protein is localized to the nucleus. [16]

Characterization of the protein has shown binding to GPR146. Based on a proposed role in regulation of serum cholesterol levels in response to dietary cholesterol intake, the protein has been called cholesin. [17]

Possible structure of uncharacterized protein C17orf50

Domains

Uncharacterized protein C17orf50 contains a domain of unknown function (DUF4673) from position 5 to 172, which makes up the majority of the protein. [7]
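The share of the protein covered by the domain follows from the positions given. This is a minimal sketch, assuming the domain boundaries are 1-based, inclusive residue positions and taking the protein length of 174 amino acids stated in the Gene section.

```python
PROTEIN_LENGTH = 174             # amino acids, as stated in the Gene section
DOMAIN_START, DOMAIN_END = 5, 172  # DUF4673 boundaries, assumed inclusive

domain_length = DOMAIN_END - DOMAIN_START + 1
coverage = domain_length / PROTEIN_LENGTH

# prints "DUF4673 covers 168 of 174 residues (96.6%)"
print(f"DUF4673 covers {domain_length} of {PROTEIN_LENGTH} residues ({coverage:.1%})")
```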

Post-translational modifications

Uncharacterized protein C17orf50 contains two potential sumoylation sites at K7 and K12. [18] [19] There are possible threonine and serine glycosylation sites throughout the protein. [20] Potential threonine, serine, and tyrosine phosphorylation sites are also present. [21]

Annotated conceptual translation of C17orf50

Interacting proteins

The protein product, also called cholesin, binds to GPR146. [17] Uncharacterized protein C17orf50 also has a potential interaction with zinc finger protein 587 (ZNF587), [22] [23] which is expressed throughout fetal tissue, including the brain. [24] ZNF587 is expected to regulate transcription. [25]

References

  1. GRCh38: Ensembl release 89: ENSG00000270806. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000035085. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Gene: C17orf50 ENSG00000270806". Ensembl. EMBL-EBI. Retrieved 6 May 2018.
  6. "uncharacterized protein C17orf50 [Homo sapiens]". NCBI. US National Library of Medicine. Retrieved 25 January 2018.
  7. "C17orf50 chromosome 17 open reading frame 50 [Homo sapiens (human)]". NCBI. US National Library of Medicine. Retrieved 25 January 2018.
  8. "GXP_123003". Genomatix. Archived from the original on 24 February 2001. Retrieved 26 March 2018.
  9. "NCBI". US National Library of Medicine. Retrieved 19 February 2018.
  10. "Multiple Sequence Alignment". ClustalW. Kyoto University Bioinformatics Center. Retrieved 28 March 2018.
  11. "BoxShade Server". ExPASy. Swiss Institute of Bioinformatics. Archived from the original on 29 October 2020. Retrieved 28 March 2018.
  12. Andersen B, Schonemann MD, Pearse RV, Jenne K, Sugarman J, Rosenfeld MG (November 1993). "Brn-5 is a divergent POU domain factor highly expressed in layer IV of the neocortex". The Journal of Biological Chemistry. 268 (31): 23390–8. doi:10.1016/S0021-9258(19)49475-1. PMID 7901208.
  13. "TimeTree". The Timescale of Life. Institute for Genomics and Evolutionary Medicine. Retrieved 22 February 2018.
  14. "Chromosome 17 open reading frame 50". UniGene. NCBI. Retrieved 28 March 2018.
  15. "Results for job saps-I20180327-142242-0812-44488260-p1m". EMBL-EBI. European Molecular Biology Laboratory. Retrieved 17 April 2018.
  16. "PSORTII". PSORTII. Retrieved 17 April 2018.
  17. Hu X, Chen F, Jia L, Long A, Peng Y, Li X, et al. (2024). "A gut-derived hormone regulates cholesterol metabolism". Cell. 187 (7): 1685–1700.e18. doi:10.1016/j.cell.2024.02.024. PMID 38503280.
  18. "SUMOplot Analysis Program". Abgent. WuXi AppTec. Retrieved 17 April 2018.
  19. "GPS-SUMO Online Service". GPS. The Cuckoo Workgroup. Archived from the original on 6 May 2018. Retrieved 17 April 2018.
  20. "NetOGlyc 4.0 Server". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 April 2018.
  21. "NetPhos 3.1 Server". DTU Bioinformatics. Department of Bio and Health Informatics. Retrieved 3 April 2018.
  22. "IntAct". EMBL-EBI. European Molecular Biology Laboratory. Retrieved 19 April 2018.
  23. "Uniprot". UniProt. UniProt Consortium. Retrieved 19 April 2018.
  24. "ensg00000198466 (ZNF587) Homo sapiens zinc finger protein 587". Expression Atlas. EMBL-EBI. Retrieved 25 April 2018.
  25. "UniProtKB - Q96SQ5 (ZN587_HUMAN)". UniProt. UniProt Consortium. Retrieved 25 April 2018.