C1orf27

Uncharacterized protein Chromosome 1 Open Reading Frame 27 is a protein that in humans is encoded by the C1orf27 gene. Its mRNA has the accession number NM_017847. [1] The transcript is 3926 base pairs long, and the longest encoded protein is a membrane protein of 454 amino acids. C1orf27 exhibits cytoplasmic expression in epidermal tissues. [2] Predicted biological processes associated with the gene include cell fate specification and developmental processes.[citation needed]

Gene

Locus

This gene is located on chromosome 1 at 1q31.1. [3] It is encoded on the plus strand of DNA spanning from 186,344,406 to 186,390,514.
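As a quick arithmetic check, the length of the locus follows from the two coordinates. The sketch below assumes the coordinates are 1-based and inclusive, which the source does not state; the names used are illustrative.

```python
# Coordinates quoted in the text (assumed 1-based and inclusive).
GENE_START = 186_344_406
GENE_END = 186_390_514


def locus_span(start: int, end: int) -> int:
    """Return the number of bases covered by an inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


print(f"C1orf27 locus spans {locus_span(GENE_START, GENE_END):,} bp")
```

Under that assumption the locus covers roughly 46 kb, far longer than the 3926 bp transcript, which is what one would expect for a gene with many introns.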

Diagram of C1orf27 and nearby genes on human chromosome 1.

mRNA

Alternative splicing

There appear to be four isoforms produced by alternative splicing. [4] Two of them are truncated at the 3' end of the coding sequence, which shortens the protein at 266 aa and 396 aa. Additional alternative splice sites lie at 79-102 aa and 246-260 aa.

Protein

General properties

The primary mRNA transcript of C1orf27 is 3926 base pairs long and encodes a protein of 454 amino acid residues. [1] The gene consists of 14 exons. The predicted molecular weight of the primary, unmodified protein is approximately 51.1 kDa.
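A rough sanity check on the predicted weight uses the common rule of thumb of about 110 Da per amino acid residue. That average is an approximation and does not come from the source, so the result is only expected to land near the predicted value, not to reproduce it.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption
PROTEIN_LENGTH_AA = 454          # residue count given in the text


def estimate_mass_kda(n_residues: int,
                      avg_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from its length alone."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * avg_mass_da / 1000.0


estimate = estimate_mass_kda(PROTEIN_LENGTH_AA)
print(f"Estimated mass: {estimate:.1f} kDa")
```

The estimate comes out close to 50 kDa, in the same range as the predicted weight; an exact figure depends on the actual residue composition.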

Aliases

As with many other genes, C1orf27 is known by several aliases.[citation needed] These are Lymphocyte-Activation Gene-1 (LAG1) Interacting Protein, Transparent Testa Glabra 1 (TTG1), and Odorant Response Abnormal 4 (ODR4). The most common alias is ODR4, which is the name that appears most readily when searching for the gene.

Composition

Computational analysis found leucine to be the most abundant amino acid, at 10.1% of the total protein. [5] Serine was second at 8.6%, and glutamic acid was third at 7.7%. The analysis also indicated that the protein is deficient in tryptophan, which accounts for only 1.1% of the protein. [5] Based on the distribution of other amino acid types, there were five high-scoring hydrophobic segments. There were also two transmembrane domains, located at 82-98 aa and 432-449 aa.
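Percentages of this kind come from a simple residue count. The sketch below shows the calculation on a short made-up sequence (not the C1orf27 sequence) and then converts the percentages quoted above into approximate residue counts for a 454-residue protein. The function names are illustrative.

```python
from collections import Counter


def composition_percent(sequence: str) -> dict[str, float]:
    """Return the percentage of each residue in a one-letter sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def approx_count(percent: float, length: int) -> int:
    """Convert a percentage back to the nearest whole residue count."""
    return round(percent * length / 100.0)


# Made-up peptide, used only to demonstrate the counting step.
toy_sequence = "MLLSSELLWK"
print(composition_percent(toy_sequence))

# Percentages reported in the text, applied to the 454-residue protein.
reported = {"Leu": 10.1, "Ser": 8.6, "Glu": 7.7, "Trp": 1.1}
for name, pct in reported.items():
    print(f"{name}: about {approx_count(pct, 454)} residues")
```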

Post-translational modifications

| Modification | Number of predicted sites | Function |
|---|---|---|
| N-myristoylation | 8 | Key component of signaling pathways; typically promotes membrane binding essential for protein localization and/or biological function. [6] |
| N-glycosylation | 4 | Increases protein stability by decreasing protein dynamics. [7] |
| Protein kinase C phosphorylation | 7 | Regulation of enzymatic activity. [8] |
| Casein kinase II phosphorylation | 7 | Epidermal growth factor role. [9] |
| Tyrosine kinase phosphorylation | 2 | Alterations to the structural conformation. [10] |
| cAMP- and cGMP-dependent phosphorylation | 2 | Coordination of the active site conformation and enzymatic activity. |

C1orf27 is predicted to undergo multiple post-translational modifications, including glycosylation, myristoylation, and phosphorylation. [11]
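Site predictions of this kind are pattern matches against the protein sequence. As an illustration, the PROSITE N-glycosylation pattern N-{P}-[ST]-{P} can be written as a regular expression. The sketch below runs it on a made-up peptide, not on the C1orf27 sequence, and a match indicates only a potential site.

```python
import re

# PROSITE-style N-glycosylation motif: N, any residue except P,
# then S or T, then any residue except P.
# The lookahead lets overlapping candidate sites be reported.
N_GLYC_MOTIF = re.compile(r"(?=N[^P][ST][^P])")


def find_n_glyc_sites(sequence: str) -> list[int]:
    """Return 1-based positions of the asparagine in each motif match."""
    return [m.start() + 1 for m in N_GLYC_MOTIF.finditer(sequence.upper())]


# Made-up peptide for demonstration only.
toy_peptide = "MNGSAANPTLNKTQ"
print(find_n_glyc_sites(toy_peptide))
```

In the toy peptide the asparagine followed by proline is skipped, which is the purpose of the {P} exclusion in the pattern.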

Interactions

Mentha identified eight interactions. [12] The first was UFSP2, which hydrolyzes the peptide bond at the C-terminal glycine of UFM1, a ubiquitin-like modifier protein bound to a number of target proteins. The second was HSCB, which acts as a co-chaperone in iron-sulfur cluster assembly in mitochondria. The third was GRB2, an adapter protein that provides a critical link between cell surface growth factor receptors and the Ras signaling pathway. The fourth was CYLD, a protease that cleaves Lys-63-linked polyubiquitin chains, regulates cell survival, proliferation, and differentiation, and is required for normal cell cycle progression. The fifth was ATM, which activates checkpoint signaling in response to double-strand breaks, apoptosis, and genotoxic stress. The sixth was FAM177A1, the function of which is unknown. The last two were THID2 and Q81kP6, both of which are from Bacillus anthracis.

Subcellular localization

The C1orf27 protein is likely cytoplasmic. [13] PSORT II made this prediction with a reliability of 55.5. The k-NN prediction used k = 9/23 and scored the protein as 55.6% cytoplasmic, 11.1% mitochondrial, 11.1% vacuolar, 11.1% cytoskeletal, and 11.1% Golgi.
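These percentages are consistent with a vote among nine nearest neighbors: 5 of 9 is 55.6% and 1 of 9 is 11.1%. The sketch below reproduces the figures under that assumption. The neighbor labels are reconstructed from the percentages and are not taken from the actual prediction output.

```python
from collections import Counter

# Hypothetical neighbor labels, chosen so the vote matches the text.
neighbor_labels = (
    ["cytoplasmic"] * 5
    + ["mitochondrial", "vacuolar", "cytoskeletal", "Golgi"]
)


def vote_percentages(labels: list[str]) -> dict[str, float]:
    """Return each label's share of the k-NN vote, in percent."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {
        label: round(100.0 * n / len(labels), 1)
        for label, n in counts.most_common()
    }


for label, pct in vote_percentages(neighbor_labels).items():
    print(f"{label}: {pct}%")
```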

Structure

Predicted tertiary structure of C1orf27.

Alpha helices predicted in the C1orf27 protein are colored blue in the picture above. Beta sheets are shown as red arrows. Random coils are the purple strands between these structures.

Expression

Expression of C1orf27 in human tissues.

Overall, expression of C1orf27 appears to be ubiquitous. [16] The body sites with the highest expression (>50 TPM) were bladder, bone marrow, kidney, liver, pancreas, parathyroid, and vascular tissue. The health states with the highest expression (>50 TPM) were adrenal tumors, cervical tumors, and liver tumors. Although both sets of observations had relatively high TPM scores, the number of occurrences was still relatively low, which is consistent with ubiquitous expression. There was moderate expression (>25 TPM) in the human fetus, and expression increased with age. [16] Expression was completely absent in the ear, esophagus, lymph, nerve, salivary gland, thyroid, tonsil, and umbilical cord. There was no expression in bladder carcinoma, even though expression was elevated in the bladder itself. Expression was high in endothelial cells and neuronal cells but undetectable in glial cells and neuropil. The protein was localized to the nucleoplasm and plasma membrane in humans but to the cytosol in mice.

Homology

Paralogs

There were no paralogs of C1orf27 identified in the human genome. [4]

Orthologs

Orthologs were identified in most animals for which complete genome data were available. [4] The most distant orthologs that were still relevant were found in invertebrates of the phylum Cnidaria.

Molecular Evolution

Rate of sequence divergence for C1orf27 in comparison to Fibrinogen and Cytochrome C in orthologs.

The m value, or number of corrected amino acid changes per 100 residues, for C1orf27 was graphed against species divergence time in millions of years. Compared with the divergence graphs of fibrinogen and cytochrome C, the gene closely follows the evolutionary pattern observed in fibrinogen, suggesting a relatively rapid rate of evolution. The m values for C1orf27 were calculated from the percent identity of each ortholog's mRNA sequence to the human sequence, using the formula derived from the Molecular Clock Hypothesis.
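One commonly used form of this correction is m = -100 ln(1 - n/100), where n is the observed percentage of differing positions (100 minus the percent identity). The source does not state the exact formula, so the sketch below assumes this form, and the identity values are invented for illustration.

```python
import math


def corrected_changes(percent_identity: float) -> float:
    """Return m, the corrected changes per 100 residues.

    Assumes m = -100 * ln(1 - n/100), with n = 100 - percent identity.
    """
    if not 0.0 < percent_identity <= 100.0:
        raise ValueError("percent_identity must be in (0, 100]")
    n = 100.0 - percent_identity
    return -100.0 * math.log(1.0 - n / 100.0)


# Invented identity values, not measurements from the orthologs.
for identity in (90.0, 70.0, 50.0):
    print(f"{identity:.0f}% identity -> m = {corrected_changes(identity):.1f}")
```

The correction grows faster than the raw difference as identity falls, reflecting multiple substitutions at the same position that a simple comparison cannot see.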

References

  1. "Homo sapiens odr-4 GPCR localization factor homolog (ODR4), transcript - Nucleotide - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-05.
  2. "Tissue expression of C1orf27 - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2018-02-05.
  3. "ODR4 odr-4 GPCR localization factor homolog [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2018-02-19.
  4. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2018-02-25.
  5. EMBL-EBI. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2018-04-22.
  6. Wright MH, Heal WP, Mann DJ, Tate EW (March 2010). "Protein myristoylation in health and disease". Journal of Chemical Biology. 3 (1): 19–35. doi:10.1007/s12154-009-0032-8. PMC 2816741. PMID 19898886.
  7. "Proteomics/Post-translational Modification/Glycosylation - Wikibooks, open books for an open world". en.wikibooks.org. Retrieved 2018-05-06.
  8. Robles-Flores M, Meléndez L, García W, Mendoza-Hernández G, Lam TT, Castañeda-Patlán C, González-Aguilar H (May 2008). "Posttranslational modifications on protein kinase c isozymes. Effects of epinephrine and phorbol esters". Biochimica et Biophysica Acta (BBA) - Molecular Cell Research. 1783 (5): 695–712. doi:10.1016/j.bbamcr.2007.07.011. PMID 18295358.
  9. Siomi MC, Higashijima K, Ishizuka A, Siomi H (December 2002). "Casein kinase II phosphorylates the fragile X mental retardation protein and modulates its biological properties". Molecular and Cellular Biology. 22 (24): 8438–47. doi:10.1128/MCB.22.24.8438-8447.2002. PMC 139871. PMID 12446764.
  10. Rahimi N, Costello CE (January 2015). "Emerging roles of post-translational modifications in signal transduction and angiogenesis". Proteomics. 15 (2–3): 300–9. doi:10.1002/pmic.201400183. PMC 4297243. PMID 25161153.
  11. "ExPASy - PROSITE". prosite.expasy.org. Retrieved 2018-04-22.
  12. "mentha: the interactome browser". mentha.uniroma2.it. Retrieved 2018-04-22.
  13. "PSORT II Prediction". psort.hgc.jp. Retrieved 2018-04-22.
  14. Kelley, Lawrence. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2018-05-06.
  15. "GDS1402 / NM_017847.1_PROBE1". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.
  16. Group, Schuler. "EST Profile - Hs.371210". www.ncbi.nlm.nih.gov. Retrieved 2018-05-06.