C15orf39

Chromosome 15
Identifiers
Symbol C15orf39
NCBI gene 56905
HGNC 24497
RefSeq NP_056307.2
UniProt Q6ZRI6

C15orf39 is a protein that in humans is encoded by the Chromosome 15 open reading frame 39 (C15orf39) gene.

Gene

Location

C15orf39 is located on chromosome 15 (15q24.2), spanning 16.53 kb from 75487985 to 75504515 on the plus DNA strand. [1] C15orf39 has three exons and seven introns. [1] [2]
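
The quoted span follows from the two coordinates. A minimal sketch of that arithmetic is given here; the function name `span_kb` is hypothetical, and the simple end-minus-start difference is an assumption, since the source does not say whether the coordinates are inclusive.

```python
# Coordinates of C15orf39 on chromosome 15, as quoted in the text.
START = 75_487_985
END = 75_504_515


def span_kb(start: int, end: int) -> float:
    """Return the distance between two genomic coordinates in kilobases.

    Assumption: a plain difference (end - start); an inclusive count
    would add one base pair and round to the same value.
    """
    return (end - start) / 1000


print(f"{span_kb(START, END):.2f} kb")  # 16.53 kb
```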

Location of C15orf39 on chromosome 15.

mRNA

Isoforms

The C15orf39 mRNA is 4443 base pairs long. [4] The C15orf39 gene produces seven mRNA transcripts; the longest coding isoform is 1047 amino acids long, and the shortest, which has a truncated 3' end, is 27 amino acids long. [5]

Expression

Expression of C15orf39 in human tissues.

C15orf39 is highly expressed in the trigeminal ganglion, superior cervical ganglion, whole blood, and the heart. Low expression levels of C15orf39 were found in the occipital lobe and PB-CD19+ B-cells. [6]

In-situ hybridization of C15orf39 in fetal and adult reticulocytes.

Expression of C15orf39 differed significantly between fetal and adult reticulocytes (P < 0.0001), with adult reticulocytes expressing more C15orf39 than fetal reticulocytes. [7]

Protein

General Properties

C15orf39 has an unmodified molecular mass of 110.6 kDa. [2] [8] The modified molecular mass is 110.7 kDa. [9] The protein has an above-average proportion of proline (≈17%) and is deficient in isoleucine (≈1%) and asparagine (≈1%). [10] Both a close ortholog (thirteen-lined ground squirrel) and a distant ortholog (crested ibis) likewise contain above-average levels of proline and low levels of isoleucine and asparagine.
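
Composition figures of this kind are residue counts divided by sequence length. The sketch below shows that calculation; the function name is hypothetical and the input is a made-up toy peptide, not the C15orf39 sequence, so the numbers it prints say nothing about the real protein.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {residue: 100 * n / len(seq) for residue, n in counts.items()}


# Hypothetical toy peptide for illustration only (NOT C15orf39).
toy_peptide = "MPPAPLPSPGPKPAPPRSIN"
composition = composition_percent(toy_peptide)

# Report the three residues discussed in the text.
for residue in ("P", "I", "N"):
    print(residue, f"{composition.get(residue, 0.0):.1f}%")
```

Running the same function on the full human sequence and on each ortholog would allow the proline, isoleucine, and asparagine levels to be compared side by side.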

Domains and Motifs

Domains of C15orf39. P = Phosphorylation, A = Acetylation, SUMO = Sumoylation, O = O-glycosylation.

C15orf39 has four predicted domains. Two of these are the proline-rich and alanine-rich domains. The third is the large tegument protein UL36 domain; UL36 is important in regulating the viral cycle of human herpesvirus 1 (HHV-1), including transporting the viral capsid to the nuclear pore complex and linking the inner and outer layers of the viral tegument. [11] The fourth, the WH2 domain (WASP-homology domain 2), is approximately 18 amino acids long and serves as an actin-binding domain. [12] WH2 binds actin monomers, enabling the production of actin filaments.

Post-Translational Modifications

The predicted post-translational modifications for C15orf39 include phosphorylation, acetylation, sumoylation, and O-glycosylation. One residue of note is K17, which is predicted to carry both an acetyl group and a SUMO group. [2] [13] Another is T970, which is predicted to be both phosphorylated and O-glycosylated. [14] [15] All predicted post-translational modifications were conserved in both close and distant orthologs.

Conceptual translation of C15orf39 C-terminal showing predicted PTM and secondary structure.
PTM | Amino Acid Location
Phosphorylation [14] | S208, S322, S467, S496, S497, T970
Acetylation [2] | K17
Sumoylation [13] | K17, K57, K154, K358, K569, K975
Sumoylation Interaction [13] | 462-466
O-Glycosylation [15] | S497, T970
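
Residues predicted to carry more than one modification can be picked out by inverting the table. The sketch below does this with the sites listed above; the dictionary and function names are hypothetical, and the sumoylation interaction motif (462-466) is left out because it is a range rather than a single residue.

```python
from collections import defaultdict

# Predicted PTM sites as listed in the table above.
PTM_SITES = {
    "Phosphorylation": ["S208", "S322", "S467", "S496", "S497", "T970"],
    "Acetylation": ["K17"],
    "Sumoylation": ["K17", "K57", "K154", "K358", "K569", "K975"],
    "O-Glycosylation": ["S497", "T970"],
}


def multiply_modified(ptm_sites: dict[str, list[str]]) -> dict[str, list[str]]:
    """Return residues that appear under more than one modification type."""
    by_residue = defaultdict(list)
    for ptm, residues in ptm_sites.items():
        for residue in residues:
            by_residue[residue].append(ptm)
    return {res: ptms for res, ptms in by_residue.items() if len(ptms) > 1}


shared = multiply_modified(PTM_SITES)
for residue, ptms in shared.items():
    print(residue, "->", ", ".join(ptms))
```

With the listed sites, this reports K17 and T970, the two residues highlighted in the text, and also S497, which appears under both phosphorylation and O-glycosylation.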

Structure

Alpha helices predicted in the C15orf39 protein are colored red, and random coils are represented as tan. No beta sheets were predicted to be part of the secondary structure for C15orf39. The amino acids not modeled were predicted to be random coils. [16]

Predicted tertiary structure for C-terminal end of C15orf39.

Sub-cellular Localization

C15orf39 is predicted to be located in the cytosol of the cell. [18]

Protein Interactions

Protein interaction screens have shown that C15orf39 interacts with many proteins, including RPLP1 and EIF4ENIF1. C15orf39 was found to interact with RPLP1 (large ribosomal subunit protein P1), a cytoplasmic protein, in a high-throughput yeast two-hybrid screen. RPLP1 is an acidic ribosomal protein that is important in the elongation step of translation. [19] [20] EIF4ENIF1 (eukaryotic translation initiation factor 4E transporter) is a nucleocytoplasmic protein that shuttles the translation initiation factor eIF4E between the nucleus and cytoplasm. [21] The interaction between C15orf39 and EIF4ENIF1 was discovered through affinity capture. [22]

Homology

Paralogs

There are no known paralogs for the human C15orf39 gene. [23]

Orthologs

The known orthologs of C15orf39 range from relatives as distant as cartilaginous fish, such as Rhincodon typus (whale shark), to closely related mammals such as the gorilla, whose protein has 99% sequence identity to the human protein. [24] [25] The phylogenetic tree below shows the evolutionary relationships among the C15orf39 protein sequences of selected orthologs. [26]

Phylogenetic tree for select C15orf39 orthologs.
Scientific Name | Common Name | MYA | Protein Accession # | Length (AA) | % Identity
Homo sapiens | Human | 0 | NP_056307 | 1,047 | 100
Gorilla gorilla gorilla | Gorilla | 9.06 | XP_004056588.1 | 1,047 | 99
Ictidomys tridecemlineatus | Thirteen-lined ground squirrel | 90 | XP_005316869.1 | 1,032 | 80
Equus caballus | Horse | 96 | XP_023509136.1 | 1,033 | 79
Delphinapterus leucas | Beluga whale | 96 | XP_022435768.1 | 1,041 | 78
Loxodonta africana | African bush elephant | 105 | XP_003413993.1 | 1,072 | 75
Ornithorhynchus anatinus | Platypus | 177 | XP_007656779.1 | 1,119 | 37
Gekko japonicus | Japanese gecko | 312 | XP_015267003.1 | 1,387 | 51
Nipponia nippon | Crested ibis | 312 | XP_009468021.1 | 1,046 | 32
Xenopus laevis | African clawed frog | 352 | XP_018111022.1 | 1,475 | 40
Rhincodon typus | Whale shark | 473 | XP_020392571.1 | 1,491 | 31
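
Percent identity values such as those in the table are derived from a pairwise alignment of each ortholog with the human protein. The sketch below shows one common definition (identical positions divided by aligned, non-gap positions). This is an assumption, since the source does not state which definition or tool settings were used, and the two aligned strings are toy examples rather than real C15orf39 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Both inputs must come from the same alignment and so have equal length;
    '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100 * matches / compared


# Toy alignment for illustration only.
toy_human = "MPPA-LPSK"
toy_other = "MPAAGLP-K"
print(f"{percent_identity(toy_human, toy_other):.1f}%")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when the alignment contains many gaps.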

Divergence

Rate of sequence divergence for C15orf39, Fibrinogen, and Cytochrome C in orthologs.

The graph shows that the C15orf39 protein is evolving quickly: its sequence has diverged at a faster rate than that of fibrinogen, itself a quickly evolving protein. [27]
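
A divergence plot of this kind can be approximated from the ortholog table by converting each percent identity into a divergence value and pairing it with the estimated time since divergence. The sketch below uses the Poisson correction, d = -ln(fraction identical), which is an assumption here: the source does not say how the plotted values were calculated, and no fibrinogen or cytochrome c data are given in the article, so only C15orf39 is shown.

```python
import math

# (species, divergence time in MYA, % identity to human), from the ortholog table.
ORTHOLOGS = [
    ("Gorilla gorilla gorilla", 9.06, 99),
    ("Ictidomys tridecemlineatus", 90, 80),
    ("Equus caballus", 96, 79),
    ("Delphinapterus leucas", 96, 78),
    ("Loxodonta africana", 105, 75),
    ("Ornithorhynchus anatinus", 177, 37),
    ("Gekko japonicus", 312, 51),
    ("Nipponia nippon", 312, 32),
    ("Xenopus laevis", 352, 40),
    ("Rhincodon typus", 473, 31),
]


def corrected_divergence(percent_identity: float) -> float:
    """Poisson-corrected divergence (assumed method): d = -ln(identity fraction)."""
    return -math.log(percent_identity / 100)


for species, mya, identity in ORTHOLOGS:
    d = corrected_divergence(identity)
    print(f"{species:28s} {mya:7.2f} MYA  d = {d:.3f}")
```

Plotting d against divergence time for several proteins on the same axes gives a visual comparison of their rates, with a steeper trend indicating faster divergence.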

References

  1. Thierry-Mieg, Danielle; Thierry-Mieg, Jean. "AceView: Gene:C15orf39, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  2. "uncharacterized protein C15orf39 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  3. "C15orf39 chromosome 15 open reading frame 39 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-05.
  4. "Homo sapiens chromosome 15 open reading frame 39 (C15orf39), mRNA - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-05.
  5. "Gene: C15orf39 (ENSG00000167173) - Summary - Homo sapiens - Ensembl genome browser 92". useast.ensembl.org. Retrieved 2018-05-05.
  6. "GDS596 / 204495_s_at". www.ncbi.nlm.nih.gov. Retrieved 2018-05-05.
  7. "GDS2655 / 204494_s_at". www.ncbi.nlm.nih.gov. Retrieved 2018-05-05.
  8. "C15orf39 Gene". www.genecards.org. Retrieved 2018-02-19.
  9. "C15orf39 - Antibodies - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2018-05-05.
  10. "PSORT II Prediction". psort.hgc.jp. Retrieved 2018-05-05.
  11. "UL36 - Large tegument protein deneddylase - Human herpesvirus 1 (strain 17) (HHV-1) - UL36 gene & protein". www.uniprot.org. Retrieved 2018-05-05.
  12. "ELM - search the eukaryotic linear motif resource". elm.eu.org. Retrieved 2018-05-05.
  13. "GPS-SUMO Online Service". GPS. April 20, 2018. [permanent dead link]
  14. "GPS 3.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.org. Retrieved 2018-05-05.
  15. "5ADC962700006D04C7682FA1 expired". www.cbs.dtu.dk. Retrieved 2018-05-05.
  16. "NPS@ : GOR4 secondary structure prediction". npsa-prabi.ibcp.fr. Retrieved 2018-05-06.
  17. "Submit a Prediction Job". raptorx.uchicago.edu. Retrieved 2018-05-05.
  18. "C15orf39 - Uncharacterized protein C15orf39 - Homo sapiens (Human) - C15orf39 gene & protein". www.uniprot.org. Retrieved 2018-02-19.
  19. "RPLP1 Gene". www.genecards.org. Retrieved 2018-05-06.
  20. Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE (September 2005). "A human protein-protein interaction network: a resource for annotating the proteome". Cell. 122 (6): 957–68. doi:10.1016/j.cell.2005.08.029. hdl:11858/00-001M-0000-0010-8592-0. PMID 16169070.
  21. "EIF4ENIF1 Gene". www.genecards.org. Retrieved 2018-05-06.
  22. Boldt K, van Reeuwijk J, Lu Q, Koutroumpas K, Nguyen TM, Texier Y, et al. (May 2016). "An organelle-specific protein landscape identifies novel diseases and molecular mechanisms". Nature Communications. 7: 11491. Bibcode:2016NatCo...711491B. doi:10.1038/ncomms11491. PMC 4869170. PMID 27173435.
  23. "Gene: C15orf39 (ENSG00000167173) - Summary - Homo sapiens - Ensembl genome browser 91". useast.ensembl.org. Retrieved 2018-02-26.
  24. "uncharacterized protein C15orf39 homolog isoform X1 [Rhincodon typus] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  25. "PREDICTED: uncharacterized protein C15orf39 homolog [Gorilla gorilla g - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  26. "Multiple Sequence Alignment - CLUSTALW". www.genome.jp. Retrieved 2018-05-06.
  27. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2018-05-06.
