C16orf95

Chromosome 16 open reading frame 95 (C16orf95) is a gene which in humans encodes the protein C16orf95. It has orthologs in mammals, and is expressed at a low level in many tissues. C16orf95 evolves quickly compared to other proteins.

Gene

C16orf95 is a Homo sapiens gene oriented on the minus strand of chromosome 16. It is located at cytogenetic band 16q24.2 and spans 14.62 kilobases. [1] The gene contains 7 exons and 6 introns. [1]

Diagram showing the location of C16orf95 on chromosome 16. Image retrieved from the GeneCards entry on C16orf95.

Homology

Paralogs

There are no known paralogs of C16orf95.

Orthologs

Orthologs of C16orf95 exist only in mammals (identified with BLAST). [3] The most distant orthologs are found in opossums and Tasmanian devils.

| Genus and species | Common name | NCBI accession | Date of divergence (mya) | Sequence identity |
|---|---|---|---|---|
| Homo sapiens | Human | NP_001182053 | 0 | 100% |
| Pan paniscus | Bonobo | XP_008972565 | 6.2 | 92% |
| Gorilla gorilla gorilla | Gorilla | XP_004058157 | 8.3 | 95% |
| Nomascus leucogenys | White-cheeked gibbon | XP_003272503 | 19.3 | 88% |
| Mandrillus leucophaeus | Drill | XP_011827052 | 27.3 | 78% |
| Propithecus coquereli | Lemur | XP_012513111 | 77.1 | 62% |
| Tupaia chinensis | Tree shrew | XP_006152612 | 86.5 | 58% |
| Oryctolagus cuniculus | European rabbit | XP_008250325 | 90.1 | 56% |
| Mus musculus | Mouse | NP_083873 | 90.1 | 54% |
| Rattus norvegicus | Rat | XP_006222844 | 90.1 | 51% |
| Camelus bactrianus | Camel | XP_010966555 | 95 | 63% |
| Canis lupus familiaris | Dog | XP_005620646 | 95 | 63% |
| Equus caballus | Horse | XP_005608538 | 95 | 60% |
| Felis catus | Cat | XP_011288582 | 95 | 60% |
| Bos taurus | Cattle | XP_015331266 | 95 | 60% |
| Lipotes vexillifer | Yangtze river dolphin | XP_007468528 | 95 | 50% |
| Myotis lucifugus | Brown bat | XP_014318589 | 95 | 56% |
| Trichechus manatus latirostris | Manatee | XP_004377854 | 102 | 66% |
| Loxodonta africana | Elephant | XP_003418190 | 102 | 59% |
| Orycteropus afer afer | Aardvark | XP_007937409 | 102 | 54% |
| Monodelphis domestica | Opossum | XP_007477328 | 162.4 | 42% |
| Sarcophilus harrisii | Tasmanian devil | XP_012395810 | 162.4 | 41% |

The percent identity of several sequences to the human C16orf95 protein was graphed against approximate time of divergence. Data points are labeled with the corresponding species name. Median dates of divergence were found using TimeTree.
A time-calibrated phylogenetic tree showing the evolutionary relationships among a subset of orthologs. The primates, rodents, and carnivores are grouped together based on the similarity of their protein sequences. The unrooted tree was made using the ClustalW application in SDSC Biology Workbench.

mRNA

Alternative splicing

There are three splice variants of C16orf95. [6] The longest transcript contains 1156 base pairs and 7 exons. [7] Compared to variant 1, the second transcript variant lacks exons 4 and 5. [8] This alternative splicing results in a frameshift of the 3' coding region, and a shorter, unique C-terminus. The third transcript variant lacks exons 4 and 5, and uses an alternate 5' exon and start codon. [9] The resulting peptide has unique N- and C-termini compared to variant 1.

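Size (base pairs)

| Exon # | Variant 1 | Variant 2 | Variant 3 |
|---|---|---|---|
| 1 | 330 | 330 | 334 |
| 2 | 52 | 52 | 52 |
| 3 | 126 | 126 | 126 |
| 4 | 147 |  |  |
| 5 | 37 |  |  |
| 6 | 187 | 187 | 187 |
| 7 | 277 | 278 | 278 |
| Total | 1,156 | 973 | 977 |
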
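The frameshift described above can be checked with simple arithmetic on the exon sizes in the table. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it assumes that exons 4 and 5 lie entirely within the coding region, which the table alone does not confirm. Skipping a stretch of coding sequence preserves the reading frame only when its length is a multiple of three.

```python
# Exon sizes (bp) per transcript variant, copied from the table above.
# None marks an exon that is absent from that variant.
EXON_SIZES = {
    "variant 1": [330, 52, 126, 147, 37, 187, 277],
    "variant 2": [330, 52, 126, None, None, 187, 278],
    "variant 3": [334, 52, 126, None, None, 187, 278],
}


def transcript_length(exons):
    """Sum the sizes of the exons that are present in a variant."""
    return sum(size for size in exons if size is not None)


def skipped_length(reference, variant):
    """Total size of reference exons that are missing from the variant."""
    return sum(
        ref
        for ref, var in zip(reference, variant)
        if ref is not None and var is None
    )


def causes_frameshift(skipped_bp):
    """Assumption: the skipped exons are fully coding sequence."""
    return skipped_bp % 3 != 0


totals = {name: transcript_length(exons) for name, exons in EXON_SIZES.items()}
skipped = skipped_length(EXON_SIZES["variant 1"], EXON_SIZES["variant 2"])

print(totals)
print(skipped, causes_frameshift(skipped))
```

Under that assumption, the 184 base pairs of exons 4 and 5 are not a multiple of three, which is consistent with the altered C-terminus reported for variants 2 and 3.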
The binding sites for KHDRBS3 in the 3' untranslated region (UTR) are highlighted in green. Secondary structure was predicted with the mfold Web Server, and likely sites for RNA-binding proteins were found with RBPDB.

Secondary structure

The 3' untranslated region of the C16orf95 mRNA contains binding sites for KH domain-containing, RNA-binding, signal transduction-associated protein 3 (KHDRBS3) within an internal loop structure. KHDRBS3 regulates mRNA splicing and may act as a negative regulator of cell growth. [12]

Expression

The expression of C16orf95 is not well characterized. However, it has been detected at low levels in the following tissue types: bone, brain, ear, eye, intestine, kidney, lung, lymph nodes, prostate, testes, tonsils, skin, and uterus. [13]

Protein

Structure

Primary

The longest isoform of the C16orf95 protein has 239 amino acids. [14] It has a conserved domain of unknown function spanning residues 76 to 239. [14] C16orf95 has a calculated molecular weight of 26.5 kDa and a predicted isoelectric point of 9.8. [5] Compared to other human proteins, C16orf95 has more cysteine, arginine, and glutamine residues, and fewer aspartate, glutamate, and asparagine residues. [5] The high ratio of basic to acidic amino acids contributes to the protein's high isoelectric point.

Secondary

C16orf95 is predicted to have several alpha-helices in its C-terminus. [5] This is true for the human and mouse proteins. The N-terminus does not have significant cross-program consensus for secondary structure.

PELE compiles secondary structure predictions from multiple programs based on the amino acid sequence. Predictions for the C-termini of the human and mouse proteins are shown. There is cross-program consensus that C16orf95 has alpha-helices in its C-terminal tail. This is seen in both the human and mouse proteins.

Post-translational modifications

The tools available at ExPASy were used to predict post-translational modification sites on C16orf95. [16] The following modifications are predicted: palmitoylation, phosphorylation, and O-linked glycosylation. Bolded residues in the table indicate sites that are conserved in more than one species.

| Predicted modification | Sites - Homo sapiens | Sites - Mus musculus | Sites - Canis lupus familiaris | Tool |
|---|---|---|---|---|
| Palmitoylation | C77, C80, C126, C178, C187 | C24, C41, C90 | C64, C113, C174 | CSS-Palm [17] |
| Phosphorylation | S6, S9, S53, T57, S68, S91, S111, T122, S166 | S30, S76, S89, S120, T134, S141 | S15, S35, T39, S153 | NetPhos 2.0 [18] |
| O-β-GlcNAc | S4, S6, S9, T57, S111 | None | None | NetOGlyc 4.0 [19] |

Evolution

C16orf95 has accumulated a large number of amino acid changes over time, indicating that it is a quickly evolving protein.

Graph of the corrected number of amino acid changes versus the approximate time of divergence. The corrected number of amino acid substitutions was calculated with the formula -ln(1 - p) × 100, where p is the observed fraction of substituted residues. Data points are included for fibrinogen, a quickly evolving protein, and cytochrome c, a slowly evolving protein.
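The correction in the caption above can be illustrated with a few values from the ortholog table. This is a minimal sketch rather than the original calculation: the function name is hypothetical, and it assumes that the observed fraction of substitutions can be approximated as 1 minus the percent identity, which ignores alignment gaps.

```python
import math


def corrected_changes_per_100(percent_identity):
    """Apply -ln(1 - p) * 100, with p taken as 1 - identity (an assumption)."""
    p = 1.0 - percent_identity / 100.0
    if not 0.0 <= p < 1.0:
        raise ValueError("percent identity must be in the range (0, 100]")
    return -math.log(1.0 - p) * 100.0


# (common name, divergence in mya, percent identity) from the ortholog table.
ORTHOLOGS = [
    ("Gorilla", 8.3, 95),
    ("Mouse", 90.1, 54),
    ("Opossum", 162.4, 42),
]

for name, mya, identity in ORTHOLOGS:
    corrected = corrected_changes_per_100(identity)
    print(f"{name}: {mya} mya, {corrected:.1f} corrected changes per 100 residues")
```

The correction grows faster than the raw difference as identity falls, because it accounts for repeated substitutions at the same site that a direct comparison of two sequences cannot detect.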

Interacting proteins

There are no proteins known to interact with C16orf95.

Clinical significance

Deletions of C16orf95 have been associated with hydronephrosis, microcephaly, distichiasis, vesicoureteral reflux, and intellectual impairment. [21] [22] However, the deletions also included coding regions of the following genes: F-box Protein 31 (FBXO31), Microtubule-Associated Protein 1 Light Chain 3 Beta (MAP1LC3B), and Zinc Finger CCHC Type 14 (ZCCHC14). The contribution of each of these genes to the observed phenotypes has yet to be determined.

References

  1. "C16orf95 chromosome 16 open reading frame 95 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-03.
  2. "C16orf95 Gene". GeneCards. Weizmann Institute of Science. Retrieved May 8, 2016.
  3. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2016-05-03.
  4. "TimeTree :: The Timescale of Life". timetree.org. Retrieved 2016-05-03.
  5. "SDSC Biology Workbench". workbench.sdsc.edu. Retrieved 2016-05-08.
  6. "c16orf95 - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-05.
  7. "Homo sapiens chromosome 16 open reading frame 95 (C16orf95), transcrip - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-05.
  8. "Homo sapiens chromosome 16 open reading frame 95 (C16orf95), transcrip - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-07.
  9. "Homo sapiens chromosome 16 open reading frame 95 (C16orf95), transcrip - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-07.
  10. "RNA Folding Form". The RNA Institute, College of Arts and Sciences, State University of New York at Albany. Retrieved 2016-05-09.
  11. "RBPDB: The database of RNA-binding specificities". rbpdb.ccbr.utoronto.ca. Retrieved 2016-05-09.
  12. "KHDRBS3 - KH domain-containing, RNA-binding, signal transduction-associated protein 3 - Homo sapiens (Human) - KHDRBS3 gene & protein". www.uniprot.org. Retrieved 2016-05-09.
  13. "EST Profile - Hs.729380". www.ncbi.nlm.nih.gov. Retrieved 2016-05-08.
  14. "uncharacterized protein C16orf95 isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2016-05-08.
  15. "SDSC Biology Workbench". workbench.sdsc.edu. Retrieved 2016-05-09.
  16. "ExPASy: SIB Bioinformatics Resource Portal - Home". www.expasy.org. Retrieved 2016-05-09.
  17. "CSS-Palm - Palmitoylation Site Prediction". csspalm.biocuckoo.org. Archived from the original on 2009-02-15. Retrieved 2016-05-09.
  18. "NetPhos 2.0 Server". www.cbs.dtu.dk. Retrieved 2016-05-09.
  19. "NetOGlyc 4.0 Server". www.cbs.dtu.dk. Retrieved 2016-05-09.
  20. Griffiths, Anthony JF; Miller, Jeffrey H.; Suzuki, David T.; Lewontin, Richard C.; Gelbart, William M. (2000-01-01). "Rate of molecular evolution".
  21. Handrigan, G. R., Chitayat, D., Lionel, A. C., Pinsk, M., Vaags, A. K., Marshall, C. R., ... Rosenblum, N. D. (2013). Deletions in 16q24.2 are associated with autism spectrum disorder, intellectual disability and congenital renal malformation. Journal of Medical Genetics, 50(4), 163-73. doi:10.1136/jmedgenet-2012-101288
  22. Butler, M. G., Dagenais, S. L., Garcia-Perez, J. L., Brouillard, P., Vikkula, M., Strouse, P., Innis, J. W., & Grover, T. W. (2012). Microcephaly, intellectual impairment, bilateral vesicoureteral reflux, distichiasis, and glomuvenous malformations associated with a 16q24.3 contiguous gene deletion and a Glomulin mutation. American Journal of Medical Genetics Part A, 158A(4), 839-49. doi:10.1002/ajmg.a.35229