Chromosome 16 open reading frame 95 (C16orf95) is a gene which in humans encodes the protein C16orf95. It has orthologs in mammals, and is expressed at a low level in many tissues. C16orf95 evolves quickly compared to other proteins.
C16orf95 is a Homo sapiens gene oriented on the minus strand of chromosome 16. It is located at cytogenetic band 16q24.2 and spans 14.62 kilobases. [1] The gene contains 6 introns and 7 exons. [1]
There are no known paralogs of C16orf95.
Orthologs of C16orf95 exist only in mammals (identified with BLAST). [3] The most distant orthologs are found in opossums and Tasmanian devils.
Genus and species | Common name | NCBI accession | Date of divergence | Sequence identity |
---|---|---|---|---|
Homo sapiens | Human | NP_001182053 | 0 mya | 100% |
Pan paniscus | Bonobo | XP_008972565 | 6.2 mya | 92% |
Gorilla gorilla gorilla | Gorilla | XP_004058157 | 8.3 mya | 95% |
Nomascus leucogenys | White-cheeked gibbon | XP_003272503 | 19.3 mya | 88% |
Mandrillus leucophaeus | Drill | XP_011827052 | 27.3 mya | 78% |
Propithecus coquereli | Lemur | XP_012513111 | 77.1 mya | 62% |
Tupaia chinensis | Tree shrew | XP_006152612 | 86.5 mya | 58% |
Oryctolagus cuniculus | European rabbit | XP_008250325 | 90.1 mya | 56% |
Mus musculus | Mouse | NP_083873 | 90.1 mya | 54% |
Rattus norvegicus | Rat | XP_006222844 | 90.1 mya | 51% |
Camelus bactrianus | Camel | XP_010966555 | 95 mya | 63% |
Canis lupus familiaris | Dog | XP_005620646 | 95 mya | 63% |
Equus caballus | Horse | XP_005608538 | 95 mya | 60% |
Felis catus | Cat | XP_011288582 | 95 mya | 60% |
Bos taurus | Cattle | XP_015331266 | 95 mya | 60% |
Lipotes vexillifer | Yangtze river dolphin | XP_007468528 | 95 mya | 50% |
Myotis lucifugus | Brown bat | XP_014318589 | 95 mya | 56% |
Trichechus manatus latirostris | Manatee | XP_004377854 | 102 mya | 66% |
Loxodonta africana | Elephant | XP_003418190 | 102 mya | 59% |
Orycteropus afer afer | Aardvark | XP_007937409 | 102 mya | 54% |
Monodelphis domestica | Opossum | XP_007477328 | 162.4 mya | 42% |
Sarcophilus harrisii | Tasmanian devil | XP_012395810 | 162.4 mya | 41% |
There are three splice variants of C16orf95. [6] The longest transcript contains 1156 base pairs and 7 exons. [7] Compared to variant 1, the second transcript variant lacks exons 4 and 5. [8] This alternative splicing results in a frameshift of the 3' coding region, and a shorter, unique C-terminus. The third transcript variant lacks exons 4 and 5, and uses an alternate 5' exon and start codon. [9] The resulting peptide has unique N- and C-termini compared to variant 1.
Exon # | Variant 1 size (bp) | Variant 2 size (bp) | Variant 3 size (bp) |
---|---|---|---|
1 | 330 | 330 | 334 |
2 | 52 | 52 | 52 |
3 | 126 | 126 | 126 |
4 | 147 | – | – |
5 | 37 | – | – |
6 | 187 | 187 | 187 |
7 | 277 | 278 | 278 |
Total | 1,156 | 973 | 977 |
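The exon sizes in the table can be used to check the transcript totals and to see why skipping exons 4 and 5 is consistent with the frameshift described for variant 2. The following is a minimal Python sketch that uses only the table values; the function and variable names are illustrative, and it assumes the skipped exons lie within the coding region, as the text states.

```python
# Exon sizes in base pairs, copied from the table above.
# None marks an exon that is absent from that transcript variant.
EXON_SIZES = {
    "Variant 1": [330, 52, 126, 147, 37, 187, 277],
    "Variant 2": [330, 52, 126, None, None, 187, 278],
    "Variant 3": [334, 52, 126, None, None, 187, 278],
}


def transcript_length(exons):
    """Sum the sizes of the exons that are present in a transcript."""
    return sum(size for size in exons if size is not None)


def skipped_length(reference, variant):
    """Total size of the reference exons that the variant lacks."""
    return sum(ref for ref, var in zip(reference, variant) if var is None)


totals = {name: transcript_length(exons) for name, exons in EXON_SIZES.items()}
skipped = skipped_length(EXON_SIZES["Variant 1"], EXON_SIZES["Variant 2"])

# Removing a stretch of coding sequence shifts the reading frame
# whenever its length is not a multiple of three.
shifts_frame = skipped % 3 != 0

print(totals)
print(skipped, shifts_frame)
```

Exons 4 and 5 together account for 184 base pairs, which is not divisible by three, so their removal changes the downstream reading frame.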
The 3' untranslated region of the C16orf95 mRNA contains binding sites for KH domain-containing, RNA-binding, signal transduction-associated protein 3 (KHDRBS3) within an internal loop structure. KHDRBS3 regulates mRNA splicing and may act as a negative regulator of cell growth. [12]
The expression of C16orf95 is not well characterized. However, it has been detected at low levels in the following tissue types: bone, brain, ear, eye, intestine, kidney, lung, lymph nodes, prostate, testes, tonsils, skin, and uterus. [13]
The longest isoform of the C16orf95 protein has 239 amino acids. [14] It has a conserved domain of unknown function spanning residues 76 to 239. [14] C16orf95 has a calculated molecular weight of 26.5 kDa and a predicted isoelectric point of 9.8. [5] Compared to other human proteins, C16orf95 has more cysteine, arginine, and glutamine residues, [5] and fewer aspartate, glutamate, and asparagine residues. [5] The high ratio of basic to acidic amino acids contributes to the protein's high isoelectric point.
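As a rough sanity check on these figures, the molecular weight can be approximated from the protein length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the cited sources; the function names are illustrative. A sequence-based calculation would be needed to reproduce the reported 26.5 kDa exactly.

```python
# Assumption: ~110 Da per residue is a rule-of-thumb average,
# not a value from the article's sources.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues, residue_mass_da=AVERAGE_RESIDUE_MASS_DA):
    """Approximate protein mass in kDa from the residue count alone."""
    return n_residues * residue_mass_da / 1000.0


def domain_length(start, end):
    """Number of residues in a domain given inclusive start and end positions."""
    return end - start + 1


protein_length = 239                     # longest isoform, from the text
estimated_kda = estimate_mass_kda(protein_length)
duf_length = domain_length(76, 239)      # domain of unknown function
duf_fraction = duf_length / protein_length

print(f"estimated mass: {estimated_kda:.1f} kDa")
print(f"conserved domain: {duf_length} residues ({duf_fraction:.0%} of the protein)")
```

The estimate of about 26.3 kDa is close to the calculated 26.5 kDa, and the conserved domain covers 164 of the 239 residues.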
C16orf95 is predicted to have several alpha-helices in its C-terminus. [5] This is true for the human and mouse proteins. The N-terminus does not have significant cross-program consensus for secondary structure.
The tools available at ExPASy were used to predict post-translational modification sites on C16orf95. [16] The following modifications are predicted: palmitoylation, phosphorylation, and O-linked glycosylation. Bolded residues in the table indicate sites that are conserved in more than one species.
Predicted modification | Sites - Homo sapiens | Sites - Mus musculus | Sites - Canis lupus familiaris | Tool |
---|---|---|---|---|
Palmitoylation | C77, C80, C126, C178, C187 | C24, C41, C90 | C64, C113, C174 | CSS-Palm [17] |
Phosphorylation | S6, S9, S53, T57, S68, S91, S111, T122, S166 | S30, S76, S89, S120, T134, S141 | S15, S35, T39, S153 | NetPhos 2.0 [18] |
O-β-GlcNAc | S4, S6, S9, T57, S111 | None | None | NetOGlyc 4.0 [19] |
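The predicted sites in the table can be compared programmatically, for example to list human residues that more than one tool flags. This is a minimal Python sketch using only the human columns above; the helper names are illustrative, and an overlap between predictions does not by itself show that either modification occurs.

```python
def parse_sites(cell):
    """Split a table cell such as 'S6, S9' into a set of site labels."""
    if cell.strip().lower() == "none":
        return set()
    return {site.strip() for site in cell.split(",")}


def site_position(site):
    """Residue number of a site label, e.g. 'T57' -> 57."""
    return int(site[1:])


# Human predictions copied from the table above.
human_phospho = parse_sites("S6, S9, S53, T57, S68, S91, S111, T122, S166")
human_oglcnac = parse_sites("S4, S6, S9, T57, S111")

# Residues predicted as both phosphorylation and O-glycosylation sites.
shared = sorted(human_phospho & human_oglcnac, key=site_position)
print(shared)
```

For the human protein, the two predictions overlap at S6, S9, T57, and S111.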
C16orf95 orthologs have accumulated a large number of amino acid changes over time, indicating that it is a quickly evolving protein.
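One crude way to put numbers on this is to divide the percent sequence difference from human by the divergence time for a few orthologs in the table. The Python sketch below is an illustration only: it uses uncorrected differences, ignores multiple substitutions at the same site, and says nothing about how C16orf95 compares with other proteins, which would require the same calculation on reference proteins not given here.

```python
# (common name, divergence from human in mya, percent identity to human),
# copied from the ortholog table.
ORTHOLOGS = [
    ("Gorilla", 8.3, 95),
    ("Mouse", 90.1, 54),
    ("Dog", 95.0, 63),
    ("Elephant", 102.0, 59),
    ("Opossum", 162.4, 42),
]


def percent_difference(identity):
    """Uncorrected percent of aligned residues that differ."""
    return 100 - identity


def difference_per_my(identity, divergence_mya):
    """Percent difference accumulated per million years since divergence."""
    return percent_difference(identity) / divergence_mya


rates = {
    name: difference_per_my(identity, mya)
    for name, mya, identity in ORTHOLOGS
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} % difference per million years")
```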
There are no proteins known to interact with C16orf95.
Deletions of C16orf95 have been associated with hydronephrosis, microcephaly, distichiasis, vesicoureteral reflux, and intellectual impairment. [21] [22] However, the deletions also included coding regions of the following genes: F-box Protein 31 (FBXO31), Microtubule-Associated Protein 1 Light Chain 3 Beta (MAP1LC3B), and Zinc Finger CCHC Type 14 (ZCCHC14). The contribution of each of these genes to the observed phenotypes has yet to be determined.