C15orf39 | |
---|---|
Identifiers | |
Symbol | C15orf39 |
NCBI gene | 56905 |
HGNC | 24497 |
RefSeq | NP_056307.2 |
UniProt | Q6ZRI6 |
C15orf39 is a protein that in humans is encoded by the chromosome 15 open reading frame 39 (C15orf39) gene.
C15orf39 is located on chromosome 15 (15q24.2), spanning 16.53 kb from position 75487985 to 75504515 on the plus DNA strand. [1] C15orf39 has three exons and seven introns. [1] [2]
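The quoted span can be checked from the two coordinates. The sketch below assumes the positions are 1-based and inclusive, which the text does not state; the function and variable names are illustrative only.

```python
# Coordinates quoted in the text; treating them as 1-based and inclusive is an assumption.
GENE_START = 75_487_985
GENE_END = 75_504_515


def span_length_bp(start: int, end: int) -> int:
    """Length in base pairs of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


length_bp = span_length_bp(GENE_START, GENE_END)
print(f"{length_bp} bp = {length_bp / 1000:.2f} kb")  # 16531 bp = 16.53 kb
```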
The coding sequence for the C15orf39 mRNA is 4443 base pairs long. [4] The C15orf39 gene produces seven mRNA transcripts; the longest coding isoform is 1047 amino acids long, and the shortest, which has a truncated 3' end, is 27 amino acids long. [5]
C15orf39 is highly expressed in the trigeminal ganglion, superior cervical ganglion, whole blood, and the heart. Low expression levels of C15orf39 were found in the occipital lobe and PB-CD19+ B-cells. [6]
C15orf39 expression differed significantly between fetal and adult reticulocytes (P < 0.0001), with adult reticulocytes expressing more C15orf39 than fetal reticulocytes. [7]
C15orf39 has an unmodified molecular mass of 110.6 kDa. [2] [8] The modified molecular mass is 110.7 kDa. [9] C15orf39 has an above-average proportion of proline (≈17%) and is deficient in isoleucine (≈1%) and asparagine (≈1%). [10] Both close (thirteen-lined ground squirrel) and distant (crested ibis) orthologs also contain above-average levels of proline and low levels of isoleucine and asparagine.
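Composition figures like these are simple residue frequencies. The following is a minimal sketch of that calculation; the toy peptide is hypothetical and is not the C15orf39 sequence, which would have to be retrieved separately (for example via the RefSeq accession NP_056307.2).

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Percentage of each residue in a one-letter-code protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


# Hypothetical 20-residue toy peptide used only for illustration.
toy = "MPPSAPLPGAPKPEPRSPIN"
comp = composition_percent(toy)
for aa in ("P", "I", "N"):
    # .get() returns 0.0 for residues that do not occur in the sequence.
    print(aa, f"{comp.get(aa, 0.0):.1f}%")
```

Applied to the full-length protein and to each ortholog, the same function would allow the proline, isoleucine, and asparagine fractions to be compared side by side.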
C15orf39 has four predicted domains. Two of these are the proline-rich and alanine-rich domains. The third, the large tegument protein UL36 domain, is important in regulating the viral cycle of human herpesvirus 1 (HHV-1), including transporting the viral capsid to the nuclear pore complex and linking the inner and outer viral tegument together. [11] Lastly, the WH2 domain (WASP-homology domain 2) is approximately 18 amino acids long and serves as an actin-binding domain. [12] WH2 binds actin monomers, enabling the production of actin filaments.
The predicted post-translational modifications for C15orf39 include phosphorylation, acetylation, sumoylation, and O-glycosylation. Two residues are of particular interest: K17, which is predicted to carry both an acetyl group and a SUMO group, [2] [13] and T970, which is predicted to be both phosphorylated and O-glycosylated. [14] [15] All predicted post-translational modifications were conserved in distant and strict orthologs.
PTM | Amino Acid Location |
---|---|
Phosphorylation [14] | S208, S322, S467, S496, S497, T970 |
Acetylation [2] | K17 |
Sumoylation [13] | K17, K57, K154, K358, K569, K975 |
Sumoylation Interaction [13] | 462-466 |
O-Glycosylation [15] | S497, T970 |
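The residues predicted to carry more than one modification can be read off the table programmatically. The sketch below copies the sites from the table; the dictionary layout and function name are illustrative assumptions, and the sumoylation interaction motif (462-466) is omitted because it is a residue range rather than a single site.

```python
from collections import defaultdict

# Predicted sites copied from the PTM table above.
PTM_SITES = {
    "Phosphorylation": ["S208", "S322", "S467", "S496", "S497", "T970"],
    "Acetylation": ["K17"],
    "Sumoylation": ["K17", "K57", "K154", "K358", "K569", "K975"],
    "O-Glycosylation": ["S497", "T970"],
}


def modifications_by_site(ptm_sites: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a PTM -> sites mapping into a site -> PTMs mapping."""
    by_site = defaultdict(list)
    for ptm, sites in ptm_sites.items():
        for site in sites:
            by_site[site].append(ptm)
    return dict(by_site)


by_site = modifications_by_site(PTM_SITES)
# Keep only residues predicted to carry two or more modifications.
shared = {site: ptms for site, ptms in by_site.items() if len(ptms) > 1}
# Sort by residue number, i.e. the digits after the one-letter amino acid code.
for site in sorted(shared, key=lambda s: int(s[1:])):
    print(site, "->", ", ".join(shared[site]))
```

With the table as given, this reports K17, S497, and T970 as the residues with overlapping predictions.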
Alpha helices predicted in the C15orf39 protein are colored red, and random coils are represented as tan. No beta sheets were predicted to be part of the secondary structure for C15orf39. The amino acids not modeled were predicted to be random coils. [16]
C15orf39 is predicted to be located in the cytosol of the cell. [18]
Protein interaction screens have shown that C15orf39 interacts with many proteins, including RPLP1 and EIF4ENIF1. The interaction with RPLP1 (Large Ribosomal Subunit Protein P1), a cytoplasmic protein, was discovered in a high-throughput yeast two-hybrid screen. RPLP1 is an acidic ribosomal protein that is important in the elongation step of translation. [19] [20] EIF4ENIF1 (Eukaryotic Translation Initiation Factor 4E Transporter) is a nucleocytoplasmic protein that shuttles the translation initiation factor eIF4E between the nucleus and cytoplasm. [21] The interaction between C15orf39 and EIF4ENIF1 was discovered through affinity capture. [22]
There are no known paralogs for the human C15orf39 gene. [23]
The ortholog space for C15orf39 ranges from distant relatives such as the cartilaginous fish Rhincodon typus (whale shark) to closely related mammals such as the gorilla, whose protein has 99% sequence identity to the human protein. [24] [25] The phylogenetic tree shows the evolutionary relationship of the C15orf39 protein sequence among its orthologs. [26]
Scientific Name | Common Name | Divergence from Humans (MYA) | Protein Accession # | Length (AA) | % Identity to Human |
---|---|---|---|---|---|
Homo sapiens | Human | 0 | NP_056307 | 1,047 | 100 |
Gorilla gorilla gorilla | Gorilla | 9.06 | XP_004056588.1 | 1,047 | 99 |
Ictidomys tridecemlineatus | Thirteen-lined ground squirrel | 90 | XP_005316869.1 | 1,032 | 80 |
Equus caballus | Horse | 96 | XP_023509136.1 | 1,033 | 79 |
Delphinapterus leucas | Beluga Whale | 96 | XP_022435768.1 | 1,041 | 78 |
Loxodonta africana | African Bush Elephant | 105 | XP_003413993.1 | 1,072 | 75 |
Ornithorhynchus anatinus | Platypus | 177 | XP_007656779.1 | 1,119 | 37 |
Gekko japonicus | Japanese Gecko | 312 | XP_015267003.1 | 1,387 | 51 |
Nipponia nippon | Crested Ibis | 312 | XP_009468021.1 | 1,046 | 32 |
Xenopus laevis | African Clawed Frog | 352 | XP_018111022.1 | 1,475 | 40 |
Rhincodon typus | Whale Shark | 473 | XP_020392571.1 | 1,491 | 31 |
C15orf39 is a quickly evolving protein: its sequence has diverged at a quicker rate than that of fibrinogen, which is itself a quickly evolving protein in humans. [27]
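One way to put the divergence of the listed orthologs on a comparable scale is to correct the observed percent difference for multiple substitutions at the same site. The sketch below uses the correction m = -100 · ln(1 - n/100), where n is the observed percent difference (100 minus percent identity). This choice is an assumption for illustration; the source does not state how its divergence comparison was computed. The values are copied from the ortholog table.

```python
import math

# (common name, estimated divergence from humans in MYA, % identity to the human protein)
ORTHOLOGS = [
    ("Human", 0, 100),
    ("Gorilla", 9.06, 99),
    ("Thirteen-lined ground squirrel", 90, 80),
    ("Horse", 96, 79),
    ("Beluga whale", 96, 78),
    ("African bush elephant", 105, 75),
    ("Platypus", 177, 37),
    ("Japanese gecko", 312, 51),
    ("Crested ibis", 312, 32),
    ("African clawed frog", 352, 40),
    ("Whale shark", 473, 31),
]


def corrected_divergence(identity_percent: float) -> float:
    """Corrected changes per 100 residues: m = -100 * ln(1 - n/100), n = 100 - identity."""
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the interval (0, 100]")
    # 1 - n/100 equals identity/100, so -ln(identity/100) is written as ln(100/identity).
    return 100.0 * math.log(100.0 / identity_percent)


for name, mya, identity in ORTHOLOGS:
    print(f"{name:32s} {mya:7.2f} MYA  m = {corrected_divergence(identity):6.1f}")
```

Plotting m against divergence time for C15orf39 and for a reference protein such as fibrinogen would give the kind of rate comparison described above, provided identity values for the reference protein are collected for the same species.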