C12orf50 | |
---|---|
Aliases | C12orf50, chromosome 12 open reading frame 50 |
External IDs | MGI: 1913855; HomoloGene: 45135; GeneCards: C12orf50 |
Chromosome 12 open reading frame 50 (C12orf50) is a protein-coding gene that in humans encodes the C12orf50 protein. The accession number for this gene is NM_152589. C12orf50 is located at 12q21.32 and spans 55.42 kb, from 88429231 to 88373811 (NCBI 37, August 2010), on the reverse strand. [5] Neighboring genes include RPS4XP15, LOC107984542, and C12orf29. [6] RPS4XP15 is upstream of C12orf50 and on the same strand. LOC107984542 and C12orf29 are both downstream; LOC107984542 is on the opposite strand, while C12orf29 is on the same strand. C12orf50 has six isoforms. This article focuses on isoform X1, whose mRNA is 1711 nucleotides long and encodes a protein of 414 aa.
Gene ontology annotations indicate that C12orf50 enables mRNA binding and protein binding. [6] It is also involved in poly(A)+ mRNA export from the nucleus.
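The quoted span can be checked directly from the two coordinates. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the text does not state the convention); the function name `span_kb` is introduced here for illustration only.

```python
def span_kb(start: int, end: int) -> float:
    """Length in kb of a closed interval; coordinate order does not matter."""
    # Assumption: 1-based inclusive coordinates, hence the +1.
    return (abs(end - start) + 1) / 1000


# Coordinates quoted in the text (NCBI 37). They are listed high-to-low
# because the gene lies on the reverse strand.
GENE_START = 88_429_231
GENE_END = 88_373_811

length_kb = span_kb(GENE_START, GENE_END)
print(f"{length_kb:.2f} kb")
```

Taking the absolute difference makes the calculation independent of strand, so the same helper works for genes reported in either orientation.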
The C12orf50 gene has 6 isoforms.
Isoform | NCBI Accession | mRNA length (nt) | Protein length (aa) | Features |
---|---|---|---|---|
X1 | NM_152589.3 | 1711 | 414 | Longest protein with a zinc finger |
X2 | NM_001363616.2 | 1669 | 375 | |
X3 | XM_017018887.1 | 1940 | 468 | Longest protein without a zinc finger |
X4 | XM_017018888.2 | 1550 | 375 | |
X5 | XM_011537985.1 | 3879 | 395 | Longest mRNA |
X6 | XM_024448868.1 | 1577 | 374 | |
In an RNA-seq analysis of tissue samples from 95 human individuals representing 27 different tissues, performed to determine the tissue specificity of protein-coding genes, expression of C12orf50 was found to be very low in most human tissues, with the exception of the testis. [7] C12orf50 expression is restricted to the testis. [8]
Uncharacterized protein Chromosome 12 Open Reading Frame 50 is a protein that in humans is encoded by the C12orf50 gene. Its protein accession number is Q8NA57. The protein is 414 aa long, with a predicted mass of 47.2 kDa. [9] It contains a CCCH-type zinc finger domain [10] with a C-X8-C-X5-C-X3-H motif. The domain extends from the start of the protein to the 44th amino acid. The protein also has three disordered regions: amino acids 136 to 168 (33 aa), 297 to 333 (37 aa), and 346 to 414 (69 aa). [11] The predicted molecular weight is also reported as 47.3 kDa, and the predicted isoelectric point is 8.79.
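The C-X8-C-X5-C-X3-H spacing and the disordered-region lengths above can be expressed compactly in code. This is a minimal sketch, not the method used by the cited prediction tools: the helper names `find_ccch` and `region_length` are hypothetical, and the test sequence is invented and is not the C12orf50 protein sequence.

```python
import re

# C-X8-C-X5-C-X3-H: Cys, 8 of any residue, Cys, 5 of any, Cys, 3 of any, His.
CCCH_MOTIF = re.compile(r"C.{8}C.{5}C.{3}H")


def find_ccch(sequence: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) of each non-overlapping match."""
    return [(m.start() + 1, m.end()) for m in CCCH_MOTIF.finditer(sequence)]


def region_length(start: int, end: int) -> int:
    """Number of residues in a 1-based inclusive region."""
    return end - start + 1


# Invented toy sequence for illustration only; NOT the C12orf50 protein.
toy = "MA" + "C" + "A" * 8 + "C" + "G" * 5 + "C" + "S" * 3 + "H" + "KK"
print(find_ccch(toy))

# Disordered regions quoted in the text (1-based, inclusive).
disordered = [(136, 168), (297, 333), (346, 414)]
print([region_length(s, e) for s, e in disordered])
```

The region lengths computed this way are 33, 37, and 69 residues, matching the values given in the text. To scan a real sequence, one would pass the protein sequence for Q8NA57 as a plain string of one-letter amino acid codes.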
The predicted tertiary structure of C12orf50 has two beta-sheets near the beginning of the protein, in the zinc finger domain, and a helix spanning amino acids 106 to 124. [12] These features are conserved throughout mammalian orthologs. There are also many coiled regions. The promoter, 3’ UTR, and 5’ UTR are very well conserved. There is a negatively charged cluster (acidic domain) from amino acid 87 to 111, before and at the beginning of the helix. [13]
C12orf50 is predicted to have a 47.8% probability of localizing to the nucleus and a 30.4% probability of localizing to the cytoplasm. [14] This is supported by immunohistochemistry and immunofluorescence data from Sigma-Aldrich showing positivity in both the nucleus and cytoplasm. [15] The protein contains a nuclear localization signal and an acidic domain. Analysis of the orthologs likewise indicates that C12orf50 localizes to the nucleus and cytoplasm.
Two proteins, GAPDHS and GOLGA2, interact with C12orf50. The enzyme glyceraldehyde-3-phosphate dehydrogenase, spermatogenic (GAPDHS) may play an important role in regulating the switch between different energy-producing pathways, and it is required for sperm motility and male fertility. [16]
C12orf50 is predicted to undergo phosphorylation, C-mannosylation, and O-glycosylation. The predicted phosphorylation sites are at amino acids 262, 349, and 370. The O-glycosylation sites are at amino acids 139, 238, and 374. The C-mannosylation sites are at amino acids 13, 102, 292, and 388.
C12orf50 evolves at a rate close to that of fibrinogen alpha, making it a relatively fast-evolving protein. Orthologs of C12orf50 have been found in mammals, reptiles, birds, and caecilian amphibians. No orthologs were found in frogs, fish, invertebrates, or fungi. The mammalian orthologs are the most similar to the human protein, with the exception of the platypus. The mammalian orthologs diverged from humans between 6.4 and 180 million years ago. The reptilian orthologs are the next most similar and diverged around 318 million years ago, at the same time as the birds. The most distantly related are the caecilian amphibians, which diverged around 351.7 million years ago.
C12orf50 has orthologs in mammals, birds, reptiles, and caecilian amphibians. No orthologs were found in frogs, invertebrates, plants, fungi, or yeast. The table below shows some of the orthologs that can be found using BLAST. [17]
Species | Organism common name | NCBI Accession | Sequence Identity | Sequence Similarity | Length (aa) |
---|---|---|---|---|---|
Homo sapiens | Human | NP_689802.1 | 100% | 100% | 414 |
Pan paniscus | Bonobo | XP_003828373.1 | 98.8% | 99.3% | 414 |
Phoca vitulina | Harbor seal | XP_032272023.1 | 87.7% | 93.0% | 415 |
Lipotes vexillifer | Baiji dolphin | XP_007449184.1 | 86.3% | 92.3% | 415 |
Gavialis gangeticus | Gharial | XP_019370370.1 | 47.1% | 63.5% | 378 |
Gopherus gangeticus | Tortoises | XP_030404453.1 | 47.1% | 61.3% | 414 |
Alligator sinensis | Chinese alligator | XP_006027976.1 | 44.2% | 60.1% | 363 |
Gallus gallus | Chicken | XP_040518234.1 | 38.5% | 52.3% | 405 |
Coturnix japonica | Japanese quail | XP_032299702.1 | 36.8% | 50.9% | 403 |
Geotrypetes seraphini | Gaboon caecilian | XP_033807710.1 | 39.7% | 55.3% | 409 |
C12orf50 has two paralogs: ZC3H11A and ZC3H11B. The zinc finger domain is conserved in both paralogs.
Gene | NCBI Accession | Sequence Similarity | Length (aa) |
---|---|---|---|
C12orf50 | NP_689802.1 | 100% | 414 |
ZC3H11A | NP_001306167.1 | 16.2% | 810 |
ZC3H11B | NP_001342386.1 | 15.9% | 805 |