C16orf82

Identifiers
Aliases: C16orf82, TNT, chromosome 16 open reading frame 82
External IDs: HomoloGene: 82387; GeneCards: C16orf82
Orthologs
Species: Human | Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA): NM_182831, NM_001145545 | n/a
RefSeq (protein): NP_001139017 | n/a
Location (UCSC): Chr 16: 27.07 – 27.07 Mb | n/a
PubMed search: [2] | n/a

C16orf82 is a protein that, in humans, is encoded by the C16orf82 gene. [3] The gene encodes a 2,285-nucleotide mRNA transcript, which is translated into a 154-amino-acid protein from a non-AUG (CUG) start codon. Expression is highest in the testis, tibial nerve, and pituitary gland, although the gene is expressed in most tissue types. [4] [5] [6] The function of C16orf82 is not yet well understood. [7]

Gene

Locus

C16orf82 is located in humans at locus 16p12.1 on the positive strand.

General features

The gene encodes an intronless 2,285-nucleotide mRNA transcript. Human intronless genes form a distinct subset of the genome and are often involved in signaling, sperm formation, immune responses, or development. [8] Because C16orf82 is intronless, it may play a role in one of these processes. Translation of C16orf82 initiates at a non-AUG (CUG) start codon. The non-canonical start codon suggests that translation of C16orf82 may be subject to additional regulation, and it may allow the protein product to begin with leucine instead of methionine, as seen in proteins encoded by some genes in the major histocompatibility complex. [9] [10]
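As a rough illustration of initiation at a non-AUG codon, the Python sketch below translates a made-up mRNA fragment starting from its first CUG and places either leucine or methionine at position 1. The sequence, the function name translate_from, and the abbreviated codon table are hypothetical and are not taken from the C16orf82 transcript.

```python
# Abbreviated codon table: only the codons used by the toy mRNA below.
CODON_TABLE = {
    "AUG": "M", "CUG": "L", "UCU": "S", "GGU": "G",
    "UAU": "Y", "AAA": "K", "UAA": "*", "UAG": "*", "UGA": "*",
}


def translate_from(mrna, start_codon="CUG", initiator="L"):
    """Translate from the first occurrence of start_codon to the first stop.

    initiator is the residue placed at position 1: "L" if a leucine tRNA
    initiates at CUG, "M" if the initiator methionine tRNA decodes it.
    """
    start = mrna.find(start_codon)
    if start == -1:
        return ""
    protein = [initiator]
    # Read codon by codon in the frame set by the start codon.
    for i in range(start + 3, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical fragment: 5' leader, CUG start, five codons, stop codon.
toy_mrna = "GG" + "CUG" + "UCUUCUGGUUAUAAA" + "UAA"

print(translate_from(toy_mrna))                 # leucine-initiated product
print(translate_from(toy_mrna, initiator="M"))  # methionine-initiated product
```

The only difference between the two calls is the first residue, which is the distinction the text draws between leucine-initiated and methionine-initiated products.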

DNA level regulation

Promoter

The C16orf82 promoter region is predicted to contain a number of transcription factor binding sites, including sites for transcription factors of the SOX family. [11] The presence of SOX family binding sites suggests that C16orf82 may play a role in sex determination. [12] Experimental transcription factor binding studies show binding of the C16orf82 promoter by ARNT, ELF5, SMAD4, and STAT3. [13]

Expression

C16orf82 expression in humans has been observed at a constant level in major organs including the heart, liver, brain, and kidney. [14] Both microarray and RNA-seq experiments show the highest expression in the testis. [4] [5] Expression is also highly variable between individuals: within the same tissue type, some individuals express the gene at high levels while others barely express it. [6] [15] Overexpression of the microRNA miR-483 has been shown to knock down C16orf82 expression. [16]

Protein

General features

The C16orf82 protein is 154 amino acids long, with an approximate molecular weight of 16.46 kDa and a predicted isoelectric point of 6.06. [17] There are no known variants or isoforms of C16orf82.
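A molecular weight of this kind is typically estimated by summing average residue masses and adding one water for the free termini. The sketch below shows that calculation under the assumption of an unmodified linear peptide; the function name molecular_weight and the toy peptide are hypothetical, and the C16orf82 sequence itself is not reproduced here.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def molecular_weight(sequence):
    """Return the average molecular weight (Da) of an unmodified peptide."""
    if not sequence:
        return 0.0
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Hypothetical peptide used only to exercise the function.
toy_peptide = "LSSGYK"
print(round(molecular_weight(toy_peptide) / 1000, 3), "kDa")
```

As a sanity check on the reported figure, 16.46 kDa spread over 154 residues is about 107 Da per residue, which is in the usual range for an average protein.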

Domains

C16orf82 contains one domain, DUF4694, whose function is currently uncharacterized. The domain spans amino acids 8 to 153. [18] DUF4694 contains an SSGY (serine-serine-glycine-tyrosine) sequence motif that is found in a majority of the protein's orthologs. [19] [20] No transmembrane domain is predicted, so the protein is not expected to be a transmembrane protein. [21]
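Checking whether a motif such as SSGY occurs in a set of sequences, and whether it falls inside the annotated domain, can be sketched in a few lines of Python. The sequences and helper names below are hypothetical placeholders; only the motif and the domain boundaries (residues 8 to 153) come from the text.

```python
def find_motif(sequence, motif="SSGY"):
    """Return 1-based start positions of every occurrence of motif."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based
        start = sequence.find(motif, start + 1)
    return positions


def in_domain(position, motif_length=4, domain_start=8, domain_end=153):
    """True if the whole motif lies within the domain boundaries."""
    return domain_start <= position and position + motif_length - 1 <= domain_end


# Hypothetical sequences standing in for real ortholog records.
toy_orthologs = {
    "species_A": "MKLPAAQRTDSSGYLKE",
    "species_B": "MKLPAAQRTDSAGYLKE",
}

for name, seq in toy_orthologs.items():
    hits = find_motif(seq)
    print(name, hits, [in_domain(p) for p in hits])
```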

Cellular localization

A conceptual diagram of C16orf82's structure. The flags represent sites of predicted phosphorylation and O-linked glycosylation. Grey flags represent phosphorylation sites and red flags represent sites of overlap between phosphorylation and O-linked glycosylation.

C16orf82 is predicted to localize to the nucleus. [21] A bipartite nuclear localization signal begins at Arg107.

One of the predicted 3D models of the human C16orf82 protein.

Post-translational modifications

The human C16orf82 protein is predicted to be phosphorylated at a number of serine residues. [26] O-linked glycosylation is also predicted at a number of sites, including some that overlap with these phosphorylation sites. [27] The sites where the two types of post-translational modification overlap could play important regulatory roles in the activity and lifespan of the protein. [28]

Secondary structure

Several modeling programs predict that the human C16orf82 protein is largely disordered. [29] [30] [31] [32]

Evolution/homology

Paralogs

No paralogs of C16orf82 have been identified in humans. [20]

Orthologs

C16orf82 has over 100 predicted orthologs, all of which are found in the class Mammalia and, more precisely, the subclass Eutheria. [33] [20] All of the orthologs contain the domain DUF4694. [33] The most distant ortholog detected is in the nine-banded armadillo (Dasypus novemcinctus), of the order Cingulata. Below is a table of 20 orthologs from various orders within the subclass Eutheria, with sequence identity and time since divergence relative to humans.

Genus and species | Common name | Date of divergence (Mya) [34] | Accession number [35] | Protein sequence length (aa) [35] | Sequence identity (%)
Homo sapiens | Human | 0 | NP_001139017.1 | 154 | 100
Gorilla gorilla gorilla | Gorilla | 9.06 | XP_004057433.1 | 217 | 97
Saimiri boliviensis boliviensis | Bolivian squirrel monkey | 43.2 | XP_003945340.1 | 217 | 81
Carlito syrichta | Philippine tarsier | 67.1 | XP_008059656.1 | 194 | 54
Tupaia chinensis | Chinese tree shrew | 82 | XP_006148346.2 | 211 | 54
Ochotona princeps | American pika | 90 | XP_004587173.1 | 184 | 46
Oryctolagus cuniculus | Rabbit | 90 | XP_008256138.1 | 207 | 49
Microtus ochrogaster | Prairie vole | 90 | XP_005372535.1 | 180 | 48
Fukomys damarensis | Damara mole-rat | 90 | XP_010621795.1 | 188 | 47
Enhydra lutris kenyoni | Northern sea otter | 96 | XP_022382137.1 | 168 | 46
Mustela putorius furo | Domestic ferret | 96 | XP_012901961.1 | 173 | 46
Canis lupus familiaris | Dog | 96 | NP_001139232.1 | 158 | 50
Condylura cristata | Star-nosed mole | 96 | XP_004696008.1 | 199 | 40
Bos taurus | Cattle | 96 | NP_001139230.1 | 156 | 56
Bison bison bison | American bison | 96 | XP_010835728.1 | 197 | 55
Capra hircus | Goat | 96 | XP_013830092.1 | 201 | 54
Balaenoptera acutorostrata scammoni | Minke whale | 96 | XP_007187042.1 | 206 | 52
Equus caballus | Horse | 96 | N/A | 153 | 47
Hipposideros armiger | Great roundleaf bat | 96 | XP_019505352.1 | 192 | 63
Loxodonta africana | African savanna elephant | 105 | XP_023414770.1 | 183 | 53
Dasypus novemcinctus | Nine-banded armadillo | 105 | XP_012377635.1 | 238 | 49

Rate of evolution

Graph of the adjusted protein sequence divergence of C16orf82, cytochrome C, and fibrinogen using species which contain C16orf82 orthologs.

C16orf82's rate of evolution was determined to be relatively fast, even in comparison to fibrinogen, a protein that has been shown to evolve quickly. [36]
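The article does not state how the sequence divergence was adjusted. One common choice, assumed here, is a Poisson-style correction for multiple substitutions at the same site, m = -ln(1 - n/100) × 100, where n is the observed percentage of differing residues. The sketch below applies that assumed formula to a few identity values from the ortholog table; the function name corrected_divergence is hypothetical.

```python
import math


def corrected_divergence(identity_percent):
    """Corrected divergence m = -ln(1 - n/100) * 100, with n = 100 - identity.

    Assumed correction; the source does not specify which one was used.
    """
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the interval (0, 100]")
    n = 100.0 - identity_percent  # observed percent differences
    return -math.log(1.0 - n / 100.0) * 100.0


# (common name, divergence date in Mya, sequence identity %) from the table.
orthologs = [
    ("Gorilla", 9.06, 97),
    ("Bolivian squirrel monkey", 43.2, 81),
    ("Chinese tree shrew", 82, 54),
    ("Dog", 96, 50),
    ("Nine-banded armadillo", 105, 49),
]

for name, mya, identity in orthologs:
    m = corrected_divergence(identity)
    print(f"{name}: m = {m:.1f} at {mya} Mya")
```

Plotting m against divergence date for C16orf82 and for reference proteins such as cytochrome c and fibrinogen gives a comparison of the kind the figure describes: a steeper line indicates faster evolution.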

Clinical significance

Behavioral disorders

C16orf82 has been associated with schizophrenia through a genome-wide association study and with autism through copy number variation analysis. [37] [38] Research has not yet shown whether C16orf82 plays a direct role in either disorder.


References

  1. GRCh38: Ensembl release 89: ENSG00000234186 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "C16orf82 chromosome 16 open reading frame 82 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-05.
  4. Sato T, Kaneda A, Tsuji S, Isagawa T, Yamamoto S, Fujita T, Yamanaka R, Tanaka Y, Nukiwa T, Marquez VE, Ishikawa Y, Ichinose M, Aburatani H (2013-05-29). "PRC2 overexpression and PRC2-target gene repression relating to poorer prognosis in small cell lung cancer". Scientific Reports. 3: 1911. Bibcode:2013NatSR...3E1911S. doi:10.1038/srep01911. PMC 3665955. PMID 23714854.
  5. Ardlie KG, Deluca DS, Segrè AV, Sullivan TJ, Young TR, Gelfand ET, et al. (GTEx Consortium) (May 2015). "Human genomics. The Genotype-Tissue Expression (GTEx) pilot analysis: multitissue gene regulation in humans". Science. 348 (6235): 648–60. doi:10.1126/science.1262110. PMC 4547484. PMID 25954001.
  6. Jelinsky SA, Rodeo SA, Li J, Gulotta LV, Archambault JM, Seeherman HJ (May 2011). "Regulation of gene expression in human tendinopathy". BMC Musculoskeletal Disorders. 12: 86. doi:10.1186/1471-2474-12-86. PMC 3095578. PMID 21539748.
  7. GeneCards Human Gene Database. "C16orf82 Gene - GeneCards | TNT Protein | TNT Antibody". www.genecards.org. Retrieved 2018-02-19.
  8. Grzybowska EA (July 2012). "Human intronless genes: functional groups, associated diseases, evolution, and mRNA processing in absence of splicing". Biochemical and Biophysical Research Communications. 424 (1): 1–6. doi:10.1016/j.bbrc.2012.06.092. PMID 22732409.
  9. Glass NL (November 2017). "Near-Cognate Codons Contribute Complexity to Translation Regulation". mBio. 8 (6): e01820–17. doi:10.1128/mbio.01820-17. PMC 5676045. PMID 29114030.
  10. Starck SR, Jiang V, Pavon-Eternod M, Prasad S, McCarthy B, Pan T, Shastri N (June 2012). "Leucine-tRNA initiates at CUG start codons for protein synthesis and presentation by MHC class I". Science. 336 (6089): 1719–23. Bibcode:2012Sci...336.1719S. doi:10.1126/science.1220270. PMID 22745432. S2CID 206540614.
  11. "Genomatix: Login Page". www.genomatix.de. Retrieved 2018-04-22.
  12. Barrionuevo F, Scherer G (March 2010). "SOX E genes: SOX9 and SOX8 in mammalian testis development". The International Journal of Biochemistry & Cell Biology. 42 (3): 433–6. doi:10.1016/j.biocel.2009.07.015. PMID 19647095.
  13. Lachmann A, Xu H, Krishnan J, Berger SI, Mazloom AR, Ma'ayan A (October 2010). "ChEA: transcription factor regulation inferred from integrating genome-wide ChIP-X experiments". Bioinformatics. 26 (19): 2438–44. doi:10.1093/bioinformatics/btq466. PMC 2944209. PMID 20709693.
  14. Yanai I, Benjamin H, Shmoish M, Chalifa-Caspi V, Shklar M, Ophir R, Bar-Even A, Horn-Saban S, Safran M, Domany E, Lancet D, Shmueli O (March 2005). "Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification". Bioinformatics. 21 (5): 650–9. doi:10.1093/bioinformatics/bti042. PMID 15388519.
  15. Goh SH, Josleyn M, Lee YT, Danner RL, Gherman RB, Cam MC, Miller JL (July 2007). "The human reticulocyte transcriptome". Physiological Genomics. 30 (2): 172–8. doi:10.1152/physiolgenomics.00247.2006. PMID 17405831.
  16. Liu M, Roth A, Yu M, Morris R, Bersani F, Rivera MN, Lu J, Shioda T, Vasudevan S, Ramaswamy S, Maheswaran S, Diederichs S, Haber DA (December 2013). "The IGF2 intronic miR-483 selectively enhances transcription from IGF2 fetal promoters and enhances tumorigenesis". Genes & Development. 27 (23): 2543–8. doi:10.1101/gad.224170.113. PMC 3861668. PMID 24298054.
  17. Walker JM, ed. (2005). The Proteomics Protocols Handbook. pp. 571–607. doi:10.1385/1592598900. ISBN 978-1-58829-343-5. S2CID 43080491.
  18. "protein TNT [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  19. NIH/NLM/NCBI/IEB/CDD group. "NCBI CDD Conserved Protein Domain DUF4694". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  20. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  21. Nakai K, Horton P (January 1999). "PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization". Trends in Biochemical Sciences. 24 (1): 34–6. doi:10.1016/S0968-0004(98)01336-X. PMID 10087920.
  22. Sigrist CJ, de Castro E, Cerutti L, Cuche BA, Hulo N, Bridge A, Bougueleret L, Xenarios I (January 2013). "New and continuing developments at PROSITE". Nucleic Acids Research. 41 (Database issue): D344-7. doi:10.1093/nar/gks1067. PMC 3531220. PMID 23161676.
  23. Zhang Y (January 2008). "I-TASSER server for protein 3D structure prediction". BMC Bioinformatics. 9: 40. doi:10.1186/1471-2105-9-40. PMC 2245901. PMID 18215316.
  24. Roy A, Kucukural A, Zhang Y (April 2010). "I-TASSER: a unified platform for automated protein structure and function prediction". Nature Protocols. 5 (4): 725–38. doi:10.1038/nprot.2010.5. PMC 2849174. PMID 20360767.
  25. Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y (January 2015). "The I-TASSER Suite: protein structure and function prediction". Nature Methods. 12 (1): 7–8. doi:10.1038/nmeth.3213. PMC 4428668. PMID 25549265.
  26. Blom N, Gammeltoft S, Brunak S (December 1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–62. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  27. Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, Lavrsen K, Dabelsteen S, Pedersen NB, Marcos-Silva L, Gupta R, Bennett EP, Mandel U, Brunak S, Wandall HH, Levery SB, Clausen H (May 2013). "Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology". The EMBO Journal. 32 (10): 1478–88. doi:10.1038/emboj.2013.79. PMC 3655468. PMID 23584533.
  28. Funakoshi Y, Suzuki T (February 2009). "Glycobiology in the cytosol: the bitter side of a sweet world". Biochimica et Biophysica Acta (BBA) - General Subjects. 1790 (2): 81–94. doi:10.1016/j.bbagen.2008.09.009. PMID 18952151.
  29. Zhou Y, Kloczkowski A, Faraggi E, Yang Y (2016-10-28). Prediction of protein secondary structure. New York, NY. ISBN 9781493964048. OCLC 961911230.
  30. Drozdetskiy A, Cole C, Procter J, Barton GJ (July 2015). "JPred4: a protein secondary structure prediction server". Nucleic Acids Research. 43 (W1): W389-94. doi:10.1093/nar/gkv332. PMC 4489285. PMID 25883141.
  31. Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJ (June 2015). "The Phyre2 web portal for protein modeling, prediction and analysis". Nature Protocols. 10 (6): 845–58. doi:10.1038/nprot.2015.053. PMC 5298202. PMID 25950237.
  32. Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, Kiefer F, Gallo Cassarino T, Bertoni M, Bordoli L, Schwede T (July 2014). "SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information". Nucleic Acids Research. 42 (Web Server issue): W252-8. doi:10.1093/nar/gku340. PMC 4086089. PMID 24782522.
  33. "ortholog_gene_162083[group] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  34. Kumar S, Stecher G, Suleski M, Hedges SB (July 2017). "TimeTree: A Resource for Timelines, Timetrees, and Divergence Times". Molecular Biology and Evolution. 34 (7): 1812–1819. doi:10.1093/molbev/msx116. PMID 28387841.
  35. "Home - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-04-23.
  36. Zhang J, Yang JR (July 2015). "Determinants of the rate of protein sequence evolution". Nature Reviews. Genetics. 16 (7): 409–20. doi:10.1038/nrg3950. PMC 4523088. PMID 26055156.
  37. McCarthy MJ, Nievergelt CM, Kelsoe JR, Welsh DK (2012-02-22). "A survey of genomic studies supports association of circadian clock genes with bipolar disorder spectrum illnesses and lithium response". PLOS ONE. 7 (2): e32091. Bibcode:2012PLoSO...732091M. doi:10.1371/journal.pone.0032091. PMC 3285204. PMID 22384149.
  38. Wang LS, Hranilovic D, Wang K, Lindquist IE, Yurcaba L, Petkovic ZB, Gidaya N, Jernej B, Hakonarson H, Bucan M (September 2010). "Population-based study of genetic variation in individuals with autism spectrum disorders from Croatia". BMC Medical Genetics. 11: 134. doi:10.1186/1471-2350-11-134. PMC 2954843. PMID 20858243.