C12orf50

C12orf50
Identifiers
Aliases: C12orf50, chromosome 12 open reading frame 50
External IDs: MGI: 1913855; HomoloGene: 45135; GeneCards: C12orf50
Orthologs
Species | Human | Mouse
Ensembl | ENSG00000165805 [1] | ENSMUSG00000056912 [2]
RefSeq (mRNA) | NM_152589, NM_001363616 | NM_001081246
RefSeq (protein) | NP_689802, NP_001350545 | n/a
Location (UCSC) | Chr 12: 87.98 – 88.03 Mb | Chr 10: 100.43 – 100.45 Mb
PubMed search | [3] | [4]

Chromosome 12 open reading frame 50 (C12orf50) is a protein-coding gene that in humans encodes the C12orf50 protein. The RefSeq mRNA accession for the gene is NM_152589. C12orf50 is located at 12q21.32, where it covers 55.42 kb, from 88429231 to 88373811 (NCBI 37, August 2010), on the reverse strand. [5] Neighboring genes include RPS4XP15, LOC107984542, and C12orf29. [6] RPS4XP15 is upstream of C12orf50 and on the same strand. LOC107984542 and C12orf29 are both downstream; LOC107984542 is on the opposite strand, while C12orf29 is on the same strand. C12orf50 has six isoforms. This article focuses on isoform X1, whose mRNA is 1711 nucleotides long and encodes a protein of 414 amino acids.
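The quoted span can be checked from the two coordinates. The following is a minimal sketch, assuming the coordinates are inclusive and that a start greater than the end indicates the reverse strand; the function names are hypothetical and do not come from any genome browser API.

```python
def span_kb(start: int, end: int) -> float:
    """Return the length in kilobases of an inclusive genomic interval.

    The order of the two coordinates does not matter.
    """
    return (abs(end - start) + 1) / 1000


def strand_from_order(start: int, end: int) -> str:
    """Infer the strand from coordinate order.

    Assumption for this sketch: start > end means the reverse strand.
    """
    return "-" if start > end else "+"


# Coordinates quoted in the text (NCBI 37 assembly).
GENE_START = 88_429_231
GENE_END = 88_373_811

print(f"{span_kb(GENE_START, GENE_END):.2f} kb, strand {strand_from_order(GENE_START, GENE_END)}")
```

Under these assumptions the result agrees with the 55.42 kb figure and the reverse-strand orientation given above.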

C12orf50 genomic location with neighboring genes

Function

Gene ontology annotations indicate that C12orf50 enables mRNA binding and protein binding. [6] It is also annotated as being involved in poly(A)+ mRNA export from the nucleus.

Isoforms

The C12orf50 gene has 6 isoforms.

Isoform | NCBI Accession | mRNA length (nt) | Protein length (aa) | Features
X1 | NM_152589.3 | 1711 | 414 | Longest protein with a zinc finger
X2 | NM_001363616.2 | 1669 | 375 |
X3 | XM_017018887.1 | 1940 | 468 | Longest protein without a zinc finger
X4 | XM_017018888.2 | 1550 | 375 |
X5 | XM_011537985.1 | 3879 | 395 | Longest mRNA
X6 | XM_024448868.1 | 1577 | 374 |

Gene expression

HPA RNA-seq on normal tissues to determine tissue-specificity of human C12orf50 gene

In a genome-wide analysis of human tissue-specific expression, RNA-seq was performed on tissue samples from 95 human individuals representing 27 different tissues to determine the tissue specificity of protein-coding genes. The analysis found that expression of C12orf50 is very low in most human tissues, with the exception of the testis. [7] C12orf50 expression is thus largely restricted to the testis. [8]

Protein

Uncharacterized protein C12orf50 is a protein that in humans is encoded by the C12orf50 gene. The protein accession ID is Q8NA57. The protein is 414 amino acids long and has a predicted mass of 47.2 kDa. [9] It includes a CCCH-type zinc finger domain [10] with a C-X8-C-X5-C-X3-H motif; the domain extends from the start of the protein to amino acid 44. The protein also has three disordered regions: amino acids 136 to 168 (33 aa), 297 to 333 (37 aa), and 346 to 414 (69 aa). [11] A molecular weight of 47.3 kDa and an isoelectric point of 8.79 have also been predicted.
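The C-X8-C-X5-C-X3-H motif can be read as a spacing pattern: a cysteine, any 8 residues, a cysteine, any 5 residues, a cysteine, any 3 residues, then a histidine. The following is a minimal sketch of that pattern as a regular expression. The function name is hypothetical, and the test sequence is an invented toy string, not the C12orf50 sequence.

```python
import re
from typing import List, Tuple

# C-X8-C-X5-C-X3-H written as a regular expression over one-letter
# amino acid codes; [A-Z] stands in for "any residue" (X).
CCCH_PATTERN = re.compile(r"C[A-Z]{8}C[A-Z]{5}C[A-Z]{3}H")


def find_ccch_motifs(sequence: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) positions of non-overlapping matches."""
    return [(m.start() + 1, m.end()) for m in CCCH_PATTERN.finditer(sequence.upper())]


# Invented toy sequence (NOT the real protein): two leading residues,
# one motif instance, two trailing residues.
toy_sequence = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"

print(find_ccch_motifs(toy_sequence))
```

A match spans 20 residues, which is consistent with the motif fitting inside a domain that covers the first 44 amino acids. Applying the function to the real sequence would require retrieving it from a sequence database first.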

Structure

The predicted tertiary structure of C12orf50 has two beta-sheets near the beginning of the protein, within the zinc finger domain, and a helix spanning amino acids 106 to 124. [12] These features are conserved throughout the mammalian orthologs. There is also a large number of coiled regions. The promoter, 3′ UTR, and 5′ UTR are very well conserved. A negative charge cluster (acidic domain) lies before and at the beginning of the helix, from amino acid 87 to 111. [13]

Localization

C12orf50 is predicted to have a 47.8% probability of localizing to the nucleus and a 30.4% probability of localizing to the cytoplasm. [14] This is supported by immunohistochemistry and immunofluorescence by Sigma-Aldrich, which show positivity in both the nucleus and the cytoplasm. [15] The protein contains a nuclear localization signal and an acidic domain. The orthologs also support localization of C12orf50 to the nucleus and cytoplasm.

Protein Interactions

Two proteins, GAPDHS and GOLGA2, interact with C12orf50. The enzyme glyceraldehyde-3-phosphate dehydrogenase, spermatogenic (GAPDHS) may play an important role in regulating the switch between different energy-producing pathways, and it is required for sperm motility and male fertility. [16]

Post-translational Modifications

Cartoon of the C12orf50 protein with post-translational modifications

C12orf50 is predicted to undergo phosphorylation, C-mannosylation, and O-glycosylation. The predicted phosphorylation sites are at amino acids 262, 349, and 370. The predicted O-glycosylation sites are at amino acids 139, 238, and 374. The predicted C-mannosylation sites are at amino acids 13, 102, 292, and 388.

Evolution

Graph of C12orf50

C12orf50 has an evolutionary rate close to that of fibrinogen alpha, which indicates that it is evolving relatively quickly. Orthologs of C12orf50 have been found in mammals, reptiles, birds, and caecilian amphibians. No orthologs were found in frogs, fish, invertebrates, or fungi. The mammalian orthologs are the most similar to the human protein, with the exception of the platypus. The mammals diverged from humans between 6.4 and 180 million years ago. The reptilian orthologs are the next most similar; reptiles diverged from humans around 318 million years ago, as did birds. The most distant orthologs are those of the caecilian amphibians, which diverged from humans around 351.7 million years ago.

Homology

An unrooted phylogenetic tree of C12orf50 showing evolutionary descent from a common ancestor. Mammals have the shortest branches, since those orthologs diverged from humans most recently. Reptiles and birds have branches of medium length, reflecting their earlier divergence from humans. Caecilians have the longest branch, showing that they diverged from humans the longest ago.

Orthologs

C12orf50 has orthologs in mammals, birds, reptiles, and caecilian amphibians. No orthologs were found in frogs, invertebrates, plants, fungi, or yeast. The table below lists some of the orthologs that can be found using BLAST. [17]

Species | Organism common name | NCBI Accession | Sequence Identity | Sequence Similarity | Length (aa)
Homo sapiens | Human | NP_689802.1 | 100% | 100% | 414
Pan paniscus | Bonobo | XP_003828373.1 | 98.8% | 99.3% | 414
Phoca vitulina | Harbor seal | XP_032272023.1 | 87.7% | 93.0% | 415
Lipotes vexillifer | Baiji dolphin | XP_007449184.1 | 86.3% | 92.3% | 415
Gavialis gangeticus | Gharial | XP_019370370.1 | 47.1% | 63.5% | 378
Gopherus gangeticus | Tortoises | XP_030404453.1 | 47.1% | 61.3% | 414
Alligator sinensis | Chinese alligator | XP_006027976.1 | 44.2% | 60.1% | 363
Gallus gallus | Chicken | XP_040518234.1 | 38.5% | 52.3% | 405
Coturnix japonica | Japanese quail | XP_032299702.1 | 36.8% | 50.9% | 403
Geotrypetes seraphini | Gaboon caecilian | XP_033807710.1 | 39.7% | 55.3% | 409
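Sequence identity values such as those in the table are derived from pairwise alignments. The following is a minimal sketch of one common definition: identical residues divided by the number of alignment columns. It assumes the two sequences are already aligned with "-" as the gap character. The function name and the example strings are hypothetical, and alignment tools may use a different denominator, so the output need not reproduce the table.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are ignored; a column counts
    as a match only when both residues are identical and not gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Invented toy alignment (not real C12orf50 data): 4 identical
# residues across 6 alignment columns.
print(round(percent_identity("MKT-AC", "MKSLAC"), 1))
```

Sequence similarity is computed in the same way, except that a column also counts when the two residues are scored as chemically similar by a substitution matrix.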

Paralogs

C12orf50 has two paralogs: ZC3H11A and ZC3H11B. The zinc finger domain is conserved in both paralogs.

Multiple sequence alignment of human protein C12orf50 and its paralogs: ZC3H11A and ZC3H11B

Gene | NCBI Accession | Sequence Similarity | Length (aa)
C12orf50 | NP_689802.1 | 100% | 414
ZC3H11A | NP_001306167.1 | 16.2% | 810
ZC3H11B | NP_001342386.1 | 15.9% | 805

References

  1. GRCh38: Ensembl release 89: ENSG00000165805 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000056912 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "AceView: Gene:C12orf50, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov.
  6. "C12orf50 chromosome 12 open reading frame 50 [Homo sapiens (human)]". www.ncbi.nlm.nih.gov. National Center for Biotechnology Information.
  7. Fagerberg, Linn; Hallström, Björn M.; Oksvold, Per; Kampf, Caroline; Djureinovic, Dijana; Odeberg, Jacob; Habuka, Masato; Tahmasebpoor, Simin; Danielsson, Angelika; Edlund, Karolina; Asplund, Anna; Sjöstedt, Evelina; Lundberg, Emma; Szigyarto, Cristina Al-Khalili; Skogs, Marie; Takanen, Jenny Ottosson; Berling, Holger; Tegel, Hanna; Mulder, Jan; Nilsson, Peter; Schwenk, Jochen M.; Lindskog, Cecilia; Danielsson, Frida; Mardinoglu, Adil; Sivertsson, Åsa; von Feilitzen, Kalle; Forsberg, Mattias; Zwahlen, Martin; Olsson, IngMarie; Navani, Sanjay; Huss, Mikael; Nielsen, Jens; Ponten, Fredrik; Uhlén, Mathias (February 2014). "Analysis of the Human Tissue-specific Expression by Genome-wide Integration of Transcriptomics and Antibody-based Proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi: 10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  8. "AceView: Gene:C12orf50, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov.
  9. "RecName: Full=Uncharacterized Protein C12orf50". www.ncbi.nlm.nih.gov. National Center for Biotechnology Information.
  10. "C12orf50 chromosome 12 open reading frame 50 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  11. "C12orf50 - Uncharacterized protein C12orf50 - Homo sapiens (Human) - C12orf50 gene & protein". www.uniprot.org.
  12. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk.
  13. "SAPS Results". www.ebi.ac.uk.
  14. https://psort.hgc.jp/cgi-bin/runpsort.pl [permanent dead link]
  15. "C12orf50 antibody".
  16. "GAPDHS glyceraldehyde-3-phosphate dehydrogenase, spermatogenic [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  17. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov.