Uncharacterized protein C16orf78 (NP_653203.1) is a protein that in humans is encoded by the chromosome 16 open reading frame 78 gene. [1]
The C16orf78 gene (123970) is located at 16q12.1 on the plus strand, spanning 25,609 bp from 49,407,734 to 49,433,342. [2]
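The quoted span can be checked from the coordinates. The sketch below assumes the positions are 1-based and inclusive, which is consistent with the 25,609 bp figure; the function and constant names are illustrative and do not come from any database API.

```python
# Coordinates quoted in the text; assumed to be 1-based and inclusive.
GENE_START = 49_407_734
GENE_END = 49_433_342


def span_length(start: int, end: int) -> int:
    """Return the length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    # Both endpoints are counted, hence the +1.
    return end - start + 1


gene_span_bp = span_length(GENE_START, GENE_END)
print(gene_span_bp)  # 25609
```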
There is one mRNA transcript (NM_144602.3) and no other known splice isoforms. There are 5 exons, totaling a length of 1068 base pairs. [2]
C16orf78 is 265 amino acids long, with a predicted molecular weight of 30.8 kDa and an isoelectric point (pI) of 9.8. [3] It is rich in both methionine and lysine, being composed of 6.4% methionine and 13.6% lysine. [4] This methionine richness has been hypothesized to serve as a mitochondrial antioxidant. [5]
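As a rough illustration of what these composition figures mean, the sketch below converts the quoted percentages into approximate residue counts for a 265-residue protein and shows how such percentages could be computed from a sequence. The toy sequence is hypothetical and is not the C16orf78 sequence; the function names are illustrative.

```python
from collections import Counter


def residue_percentages(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}


def residues_from_percentage(percent: float, length: int) -> int:
    """Approximate residue count implied by a composition percentage."""
    return round(percent / 100 * length)


PROTEIN_LENGTH = 265  # length quoted in the text

# Percentages quoted in the text, converted to approximate counts.
met_count = residues_from_percentage(6.4, PROTEIN_LENGTH)   # 16.96 rounds to 17
lys_count = residues_from_percentage(13.6, PROTEIN_LENGTH)  # 36.04 rounds to 36

# Hypothetical toy sequence, used only to demonstrate the calculation.
toy_sequence = "MKKMAKLK"
toy_percentages = residue_percentages(toy_sequence)
print(met_count, lys_count, toy_percentages["M"], toy_percentages["K"])
```

Under these assumptions, the quoted figures correspond to about 17 methionine and 36 lysine residues.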
There are four verified ubiquitination sites and three verified phosphorylation sites. [6] [7]
C16orf78's secondary structure is predicted to consist primarily of alpha helices and coiled coils. [9] [10] [11] Phyre2 also predicted that C16orf78 is primarily helical, but 253 of 265 amino acids were modeled ab initio, so confidence in the model is low. [12]
C16orf78 is predicted to be localized to the cell nucleus. [13] There is also a predicted bipartite nuclear localization signal. [14]
Expression of C16orf78 is largely restricted to the testis, with much lower expression in other tissues. [15]
C16orf78 physically associates with the DNA/RNA-binding protein KIN17 (NP_036443.1), suggesting that C16orf78 may also play a role in DNA repair. [17] C16orf78 was found to be phosphorylated by SRPK1 (NP_003128.3) and SRPK2 (AAH68547.1). [6]
Deletion of the C16orf78 gene has been identified as a determinant of prostate cancer. [18] A SNP in C16orf78 interacts with a SNP in LMTK2 and is associated with risk of prostate cancer. [19]
Amplification of the C16orf78 gene has been linked to metabolically adaptive cancer cells. [20] A duplication of the C16orf78 gene was associated with at least one case of Rolandic epilepsy. [21]
C16orf78 has over 80 orthologs, including in animals as distant as Ciona intestinalis (XP_002132057.1), which is estimated to have diverged from humans 676 million years ago. [2] [23] C16orf78 has orthologs in many types of mammals, reptiles, bony fish, and even some invertebrates, but has no known orthologs in amphibians or birds. [22] Below is a table with a sample of orthologs, with divergence dates from TimeTree and similarity calculated by pairwise sequence alignment. [24]
Species Name | NCBI Accession | Estimated Divergence (mya) | Length (aa) | % Identity | % Similarity |
Homo sapiens | NP_653203.1 | 0 | 265 | 100% | 100% |
Gorilla gorilla gorilla | XP_004057673.2 | 9.06 | 265 | 96% | 98% |
Macaca mulatta | XP_001082258.1 | 29.44 | 267 | 89% | 93% |
Galeopterus variegatus | XP_008591134.1 | 76 | 266 | 65% | 77% |
Oryctolagus cuniculus | XP_008273281.1 | 90 | 255 | 62% | 76% |
Mus musculus | NP_808569.1 | 90 | 270 | 57% | 69% |
Lipotes vexillifer | XP_007459548.1 | 96 | 266 | 65% | 77% |
Capra hircus | XP_017918754.1 | 96 | 276 | 63% | 74% |
Callorhinus ursinus | XP_025708226.1 | 96 | 250 | 62% | 74% |
Pteropus vampyrus | XP_011358492.1 | 96 | 263 | 60% | 74% |
Loxodonta africana | XP_023411324.1 | 105 | 285 | 48% | 55% |
Sarcophilus harrisii | XP_003757266.1 | 159 | 270 | 38% | 53% |
Vombatus ursinus | XP_027723426.1 | 159 | 275 | 38% | 54% |
Pogona vitticeps | XP_020643996.1 | 312 | 315 | 26% | 43% |
Gekko japonicus | XP_015263322.1 | 312 | 261 | 25% | 47% |
Python bivittatus | XP_025030465.1 | 312 | 313 | 23% | 37% |
Latimeria chalumnae | XP_014344069.1 | 413 | 310 | 19% | 42% |
Acipenser ruthenus | RXM34621.1 | 435 | 202 | 15% | 37% |
Ciona intestinalis | XP_002132057.1 | 676 | 396 | 10% | 32% |
Apostichopus japonicus | PIK46940.1 | 684 | 292 | 9% | 33% |
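To show how identity and similarity percentages of this kind can be derived from a pairwise alignment, here is a minimal sketch. It is not the method used to produce the table: the similarity groups are a simplified assumption (alignment tools normally use a substitution matrix such as BLOSUM62), the choice of the full alignment length as denominator is also an assumption, and the two aligned toy sequences are hypothetical.

```python
# Hypothetical groups of chemically similar residues (a simplification;
# real tools score similarity with a substitution matrix).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("ST"), set("NQ")]


def are_similar(a: str, b: str) -> bool:
    """True if two residues are identical or share an assumed group."""
    return a == b or any(a in g and b in g for g in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) over the full alignment length.

    Both inputs must already be aligned, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns count toward length but never match
        if a == b:
            identical += 1
            similar += 1
        elif are_similar(a, b):
            similar += 1
    length = len(aligned_a)
    return 100 * identical / length, 100 * similar / length


# Hypothetical aligned toy sequences, not taken from any ortholog.
identity, similarity = identity_and_similarity("MKT-LLDE", "MRTALIDD")
print(identity, similarity)  # 50.0 87.5
```

Because identical residues also count as similar, similarity is never lower than identity, which matches the pattern in every row of the table.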
C12orf66 is a protein that in humans is encoded by the C12orf66 gene. The C12orf66 protein is one of four proteins in the KICSTOR protein complex which negatively regulates mechanistic target of rapamycin complex 1 (mTORC1) signaling.
C17orf53 is a gene in humans that encodes a protein known as C17orf53, uncharacterized protein C17orf53. It has been shown to target the nucleus, with minor localization in the cytoplasm. Based on current findings, C17orf53 is predicted to perform transport functions; however, further research into the protein could provide more specific evidence regarding its function.
Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.
Chromosome 16 open reading frame 46 is a protein of yet to be determined function in Homo sapiens. It is encoded by the C16orf46 gene with NCBI accession number of NM_001100873. It is a protein-coding gene with an overlapping locus.
Chromosome 9 open reading frame 43 is a protein that in humans is encoded by the C9orf43 gene. The gene is also known as MGC17358 and LOC257169. C9orf43 contains DUF 4647 and a polyglutamine repeat region although protein function is not well understood.
Chromosome 9 open reading frame 25 (C9orf25) is a domain that encodes the FAM219A gene. The terms FAM219A and C9orf25 are aliases and can be used interchangeably. The function of this gene is not yet completely understood.
Chromosome 19 open reading frame 44 is a protein that in humans is encoded by the C19orf44 gene. C19orf44 is an uncharacterized protein with an unknown function in humans. C19orf44 is non-limiting implying that the protein exists in other species besides human. The protein contains one domain of unknown function (DUF) that is highly conserved throughout its orthologs. This protein is most highly expressed in the testis and ovary, but also has significant expression in the thyroid and parathyroid. Other names for this protein include: LOC84167.
Chromosome 4 open reading frame 51 (C4orf51) is a protein which in humans is encoded by the C4orf51 gene.
Uncharacterized protein C16orf86 is a protein in humans that is encoded by the C16orf86 gene. It is mostly made of alpha helices and is expressed in the testes, but also in other tissues such as the kidney, colon, brain, fat, spleen, and liver. The function of C16orf86 is not well understood; however, DNA microarray data suggest that it could be a nuclear transcription factor that regulates G0/G1 in the cell cycle in tissues such as the kidney, brain, and skeletal muscle.
C2orf16 is a protein that in humans is encoded by the C2orf16 gene. Isoform 2 of this protein is 1,984 amino acids long. The gene contains 1 exon and is located at 2p23.3. Aliases for C2orf16 include Open Reading Frame 16 on Chromosome 2 and P-S-E-R-S-H-H-S Repeats Containing Sequence.
Chromosome 1 open reading frame 141, or C1orf141, is a protein which, in humans, is encoded by the C1orf141 gene. It is a precursor protein that becomes active after cleavage. The function is not yet well understood, but it is suggested to be active during development.
c7orf26 is a gene in humans that encodes a protein known as c7orf26. Based on the properties of c7orf26 and its conservation over a long period of time, the protein is predicted to localize to the cytoplasm and to play a role in regulating transcription.
Chromosome 1 open reading frame 167 (C1orf167) is a protein which in humans is encoded by the C1orf167 gene. The NCBI accession number is NP_001010881. The protein is 1468 amino acids in length with a molecular weight of 162.42 kDa. The mRNA sequence was found to be 4689 base pairs in length.
Single-pass membrane and coiled-coil domain-containing protein 3 is a protein that is encoded in humans by the SMCO3 gene.
Chromosome 1 open reading frame 185, also known as C1orf185, is a protein that in humans is encoded by the C1orf185 gene. In humans, C1orf185 is a lowly expressed protein that has been found to be occasionally expressed in the circulatory system.
C20orf202 is a protein that in humans is encoded by the C20orf202 gene. In humans, this gene encodes for a nuclear protein that is primarily expressed in the lung and placenta.
Uncharacterized protein C17orf78 is a protein encoded by the C17orf78 gene in humans. The name denotes the location of the parent gene, being at the 78th open reading frame, on the 17th human chromosome. The protein is highly expressed in the small intestine, especially the duodenum. The function of C17orf78 is not well defined.
Chromosome 1 Open Reading Frame 94, or C1orf94, is a protein that in humans is encoded by the C1orf94 gene. The function of this protein is still poorly understood.
Leucine rich single-pass membrane protein 2 is a single-pass membrane protein rich in leucine, that in humans is encoded by the LSMEM2 gene. The LSMEM2 protein is conserved in mammals, birds, and reptiles. In humans, LSMEM2 is found to be highly expressed in the heart, skeletal muscle and tongue.
C12orf29 is a protein that in humans is encoded by the chromosome 12 open reading frame 29 gene. The gene is ubiquitously expressed in various tissues. The protein has 325 amino acids. The biological process of C12orf29 has been annotated as hematopoietic progenitor cell differentiation. The molecular and cellular functions of the C12orf29 gene are not yet well understood.