Putative uncharacterized protein C6orf52

Putative uncharacterized protein C6orf52 (C6orf52) is a protein in humans that is encoded by the gene "C6orf52" and has six known isoforms. [1] C6orf52 was identified in 2002 by the National Institutes of Health Mammalian Gene Collection (MGC) Program. [2] C6orf52 has one known paralog, tRNA selenocysteine 1-associated protein 1 (TRNAU1AP). [3]

Gene

The cytogenetic location of C6orf52 is 6p24.2, on the short arm of chromosome 6. [4] The gene is 23,379 nucleotides long, spanning from nucleotide 10671418 to 10694797, and contains 9 exons; its protein product has a molecular weight of 17,383 Da. C6orf52 has no common aliases, although the major protein product is sometimes referred to as "Q5T4I8". [5]
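As a quick arithmetic check, the quoted gene length equals the difference between the two quoted coordinates; counting both endpoints would give one nucleotide more. The following is a minimal sketch in Python. The function name `span_length` is illustrative and does not come from any database API.

```python
# Coordinates as quoted in the text above.
GENE_START = 10_671_418
GENE_END = 10_694_797


def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Return the length of a genomic interval in nucleotides.

    With inclusive=False the result is end - start; with inclusive=True
    both endpoint positions are counted.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


if __name__ == "__main__":
    print(span_length(GENE_START, GENE_END))                  # 23379
    print(span_length(GENE_START, GENE_END, inclusive=True))  # 23380
```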

Location of C6orf52 on the short arm of chromosome 6.

mRNA

C6orf52 is known to undergo alternative splicing and has six known isoforms of varying length.

Proteins

Isoforms

Q5T4I8 has six known isoforms of varying amino acid length.

Isoform    Polypeptide length (amino acids)
X1         207
X2         182
X3         177
X4         176
X5         126
X6         93

Composition

Compared with the average human protein, C6orf52 is relatively rich in glutamic acid and serine residues and relatively poor in tryptophan and arginine. [6] [7]
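A composition comparison of this kind starts from the fraction of each residue in the sequence. The sketch below shows that step only, under stated assumptions: the sequence `TOY_PEPTIDE` is a made-up example and is not the C6orf52 sequence, the function name `residue_fractions` is illustrative, and no reference values for the average human protein are included.

```python
from collections import Counter


def residue_fractions(sequence: str) -> dict[str, float]:
    """Return the fraction of each one-letter residue code in a sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {residue: n / len(seq) for residue, n in counts.items()}


# Hypothetical peptide for illustration only (NOT the C6orf52 sequence).
TOY_PEPTIDE = "MSEESSEEKR"

if __name__ == "__main__":
    fractions = residue_fractions(TOY_PEPTIDE)
    # List residues from most to least frequent.
    for residue, fraction in sorted(fractions.items(), key=lambda kv: -kv[1]):
        print(f"{residue}: {fraction:.2f}")
```

Each fraction would then be compared with the corresponding value from a reference composition table to decide whether a residue is over- or under-represented.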

Post-translational modifications

C6orf52 has two commonly predicted post-translational modifications, both within the highly conserved domain. [8] [9] The lysine at position 123 of the major protein is predicted to undergo sumoylation, and the tyrosine at position 128 is predicted to undergo phosphorylation. Sumoylation sites allow the binding of SUMO (small ubiquitin-like modifier) proteins, which are known to alter functional parameters of proteins such as subcellular localization, protein partnering, and the DNA-binding and transactivation functions of transcription factors. [10] Tyrosine phosphorylation is associated with, among other processes, growth factor signaling and cell differentiation during development, which are recurring aspects of C6orf52. [11]

Structure

The secondary structure of C6orf52 consists mostly of coiled regions; however, there is an extended alpha helix region within the highly conserved domain. [12] [13]

Subcellular localization

C6orf52 is predicted to be a non-transmembrane protein located within the nucleus. [14]

Expression

Tissue expression is highest within the oocyte, with high expression in the testes and female gonad. [15]

Tissue expression of Q5T4I8 in humans. Expression is highest in the oocyte.

Expression is extremely high (2000-3000 transcripts per million) in the first stages of embryonic development up until the blastocyst.

C6orf52 expression levels during preimplantation embryonic development, measured in transcripts per million (TPM).
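For readers unfamiliar with the unit, transcripts per million is obtained by dividing each gene's read count by its transcript length, then rescaling the resulting rates so that they sum to one million within a sample. The sketch below illustrates that general calculation with made-up numbers; the counts, lengths, and the function name `tpm` are hypothetical and are not taken from the expression data cited here.

```python
def tpm(read_counts: list[float], lengths_kb: list[float]) -> list[float]:
    """Convert read counts to transcripts per million (TPM).

    Each count is divided by the transcript length in kilobases, and the
    resulting rates are scaled so that they sum to 1,000,000.
    """
    if len(read_counts) != len(lengths_kb):
        raise ValueError("read_counts and lengths_kb must have equal length")
    rates = [count / length for count, length in zip(read_counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        raise ValueError("at least one gene must have a non-zero count")
    return [rate / total * 1_000_000 for rate in rates]


if __name__ == "__main__":
    # Hypothetical three-gene sample (illustrative values only).
    counts = [30, 10, 60]
    lengths = [1.0, 1.0, 2.0]
    for value in tpm(counts, lengths):
        print(round(value, 1))
```

Because the values are normalized within each sample, a TPM of 2000 to 3000 means the transcript accounts for roughly 0.2 to 0.3 percent of all length-normalized transcripts measured in that sample.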

Clinical significance

Two cattle proteins that have been linked to fat or energy metabolism were predicted to be similar to C6orf52; however, no clinical study examining C6orf52 is known. [16]

Homology

Paralogs

C6orf52 has one identified paralog, tRNA selenocysteine 1-associated protein 1 (TRNAU1AP), which is located on chromosome 1 at 1p35.3. [3] TRNAU1AP is involved in selenocysteine biosynthesis, enhances the efficiency of selenoprotein synthesis, and may be involved in the methylation of tRNA(Sec). [17]

Orthologs

C6orf52 is conserved across many species. It is found in many mammals, reptiles, and birds, such as the zebra finch. [18]

Scientific name          Common name                Accession         Sequence similarity (%)   Estimated date of divergence (MYA) [19]
Sus scrofa               Wild Boar                  NP_001138494.1    60.976                    96
Taeniopygia guttata      Zebra Finch                XP_004175377.1    56.90                     312
Ailuropoda melanoleuca   Giant Panda                XP_019651607.1    64.43                     96
Pelodiscus sinensis      Chinese Softshell Turtle   XP_006138812.1    42.55                     312
Canis lupus familiaris   Dog                        XP_005640089.1    64.29                     96
Pan troglodytes          Common Chimpanzee          XP_009448762.2    98.03                     6.65
Macaca mulatta           Rhesus macaque             NP_001180810.1    94.08                     29.44

A domain of high conservation across species begins near the last third of the polypeptide.

Multiple sequence alignment for multiple orthologs of C6orf52. A high conservation domain begins near the last third of the polypeptide sequence.


References

  1. "NCBI (National Center for Biotechnology Information)". Retrieved 2019-02-22.
  2. "NCBI (National Center for Biotechnology Information)". Virginia Medical. 105 (4): 272–277. April 1978. Retrieved 2019-02-25.
  3. "NCBI (National Center for Biotechnology Information)". Retrieved 2019-02-25.
  4. "HUGO Gene Symbol Report". Retrieved 2019-02-25.
  5. "UniProt". Retrieved 2019-02-25.
  6. "The Institute for Environmental Modeling (TIEM)". Retrieved 2019-05-05.
  7. "ProtScale - Amino Acid Composition (in UniProtKB/Swiss-Prot data bank)". Retrieved 2019-05-05.
  8. "NetPhos 3.1 Server". Retrieved 2019-05-05.
  9. "SUMOplot Analysis Program - Abgent". Retrieved 2019-05-05.
  10. Hilgarth, Roland S.; Murphy, Lynea A.; Skaggs, Hollie S.; Wilkerson, Donald C.; Xing, Hongyan; Sarge, Kevin D. (2004). "JBC - Regulation and Function of SUMO Modification". Journal of Biological Chemistry. 279 (52): 53899–53902. doi:10.1074/jbc.R400021200. PMID 15448161.
  11. Pasantes-Morales, H.; Franco, R. (2002). "Influence of protein tyrosine kinases on cell volume change-induced taurine release". Cerebellum (London, England). 1 (2): 103–9. doi:10.1080/147342202753671231. PMID 12882359. S2CID 9909209.
  12. "NPS@: Network Protein Sequence Analysis - SOPMA". Retrieved 2019-05-05.
  13. "I-TASSER server for protein structure and function prediction". Retrieved 2019-05-05.
  14. "PSORT II Prediction". Retrieved 2019-05-05.
  15. "UniProt". Retrieved 2019-02-25.
  16. Chow, Y. W.; Pietranico, R.; Mukerji, A. (1975). "NCBI (National Center for Biotechnology Information)". Biochemical and Biophysical Research Communications. 66 (4): 1424–31. doi:10.1016/0006-291x(75)90518-5. PMID 6.
  17. "UniProt - TRNAU1AP - tRNA selenocysteine 1-associated protein 1 - Homo sapiens". Retrieved 2019-05-06.
  18. "BLAST NCBI". Retrieved 2019-02-25.
  19. "TimeTree: The Timescale of Life". Retrieved 2019-02-25.