C15orf54

LINC02915
Identifiers
Aliases: LINC02915, chromosome 15 open reading frame 54, chromosome 15 open reading frame 54 (putative), chromosome 15 putative open reading frame 54, long intergenic non-protein coding RNA 2915, C15orf54
External IDs: HomoloGene: 131352; GeneCards: LINC02915; OMA: LINC02915 - orthologs
Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_207445, NM_001302797 | n/a
RefSeq (protein) | n/a | n/a
Location (UCSC) | Chr 15: 39.25 – 39.25 Mb | n/a
PubMed search | [2] | n/a

C15orf54 (Chromosome 15 Open Reading Frame 54) is a protein that in humans is encoded by the C15orf54 gene. The gene is conserved mostly in mammals, primarily primates. While its function is currently unknown, the gene is highly expressed in the prostate, thymus, appendix, bone marrow, and lungs. [3]


Gene

C15orf54 is located on chromosome 15 from 39542870 to 39547048 on the direct strand, spanning about 4.18 kb. The gene is also known as LOC400360 or FLJ39531. It contains 2 distinct gt-ag introns and two exons, with two alternatively spliced mRNAs that both encode the same protein. [3] The NCBI accession number is NC_000015.10. [4]

Location of C15orf54 on chromosome 15

mRNA

Isoforms

C15orf54 has a total of 2 isoforms: variant 1 and variant 2. Variant 1 represents the longer transcript and variant 2 uses an alternate splice site in the 3' exon compared to variant 1. [3]

Variant 1

The complete mRNA is 3095 bp long and contains 2 exons. The 5' UTR is 383 bp long, with an in-frame stop codon 48 bp upstream of the initiating Met. The 3' UTR is 2160 bp long and is followed by the poly(A) tail. The standard AATAAA polyadenylation signal lies about 23 bp upstream of the poly(A) tail. The predicted protein product has 183 aa. [3]
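The segment lengths quoted for variant 1 can be checked against the stated transcript length. The following is a minimal sketch, assuming that the coding sequence consists of one codon per residue plus a single stop codon and that the poly(A) tail is not counted in the 3095 bp; the variable names are illustrative only.

```python
# Segment lengths for variant 1 as quoted in the text (bp and aa).
FIVE_PRIME_UTR_BP = 383
THREE_PRIME_UTR_BP = 2160
PROTEIN_LENGTH_AA = 183

# Assumption: coding sequence = one codon per residue plus one stop codon.
cds_bp = (PROTEIN_LENGTH_AA + 1) * 3

# Assumption: the poly(A) tail is excluded from the transcript length.
mrna_bp = FIVE_PRIME_UTR_BP + cds_bp + THREE_PRIME_UTR_BP

print(cds_bp, mrna_bp)  # 552 3095
```

Under these assumptions the three segments add up to the reported 3095 bp.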

Protein

General properties

The sequence for the C15orf54 protein is as follows: [3]

MEVKFITGKHGGRRPQRAEPQRICRALWLTPWPSLILKLLSWIILSNLFLHLRATHHMTE

LPLRFLYIALSEMTFREQTSHQIIQQMSLSNKLEQNQLYGEVINKETDNPVISSGLTLLF

AQKPQSPGWKNMSSTKRVCTILADSCRAQAHAADRGERGHFGVQILHHFIEVFNVMAVRS

NPF

The dominant protein product is 183 amino acids long and has a predicted molecular weight of 21 kDa. The isoelectric point is 9.87. [3] C15orf54 has a relatively high frequency of leucine at 12.0% and a relatively low frequency of tyrosine at 1.1%. [5] The number of multiplets in this sequence is 12. There are no unusual spacings in this protein. [5]
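The length and residue frequencies quoted above can be reproduced directly from the listed sequence. The sketch below is not the SAPS tool itself; it is a plain residue count, and the function name `residue_percentages` is hypothetical.

```python
from collections import Counter

# C15orf54 protein sequence as listed in this section.
SEQUENCE = (
    "MEVKFITGKHGGRRPQRAEPQRICRALWLTPWPSLILKLLSWIILSNLFLHLRATHHMTE"
    "LPLRFLYIALSEMTFREQTSHQIIQQMSLSNKLEQNQLYGEVINKETDNPVISSGLTLLF"
    "AQKPQSPGWKNMSSTKRVCTILADSCRAQAHAADRGERGHFGVQILHHFIEVFNVMAVRS"
    "NPF"
)


def residue_percentages(seq: str) -> dict[str, float]:
    """Return each residue's share of the sequence, in percent (1 decimal)."""
    counts = Counter(seq)
    return {aa: round(100 * n / len(seq), 1) for aa, n in counts.items()}


composition = residue_percentages(SEQUENCE)
print(len(SEQUENCE), composition["L"], composition["Y"])  # 183 12.0 1.1
```

The count gives 22 leucines and 2 tyrosines out of 183 residues, matching the 12.0% and 1.1% figures.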

Domains and Motifs

Analysis of C15orf54 predicted a globular domain containing multiple functional motif sites. One is a MAPK-docking motif, which consists of one or more basic residues and two to four hydrophobic residues in adjacent groups. These motifs regulate specific interactions in the MAPK cascade. Another is an LIR motif, which is found in ligands of the Atg8 protein family and plays a role in selective autophagy by recruiting specific adaptors bound to ubiquitylated proteins, organelles, or pathogens for degradation. [6]

Post-translational modification

C15orf54 is predicted to be non-myristoylated, and no sulfation sites were found in the protein. One motif with a high probability of sumoylation was found. Sumoylation is involved in nuclear-cytosolic transport, transcriptional regulation, and protein stability. [7]

Secondary structure

C15orf54 is composed of both alpha helices and beta sheets, as well as turns and some coils. Alpha helices constituted the majority of the protein. [8]

Sub-cellular Localization

The membrane topology was determined to be type 1b with a cytoplasmic tail from 34 to 183, indicating that the C-terminal side will be inside. There was a transmembrane region located from 34 to 50. There were dileucine motifs found in the tail at 39 and 118. [9]
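The positions quoted above can be located in the listed protein sequence. The following is a minimal sketch rather than a reimplementation of PSORT II: it extracts residues 34 to 50 and searches for Leu-Leu pairs using 1-based positions. The helper name `one_based_slice` is hypothetical.

```python
import re

# C15orf54 protein sequence as listed in the Protein section.
SEQUENCE = (
    "MEVKFITGKHGGRRPQRAEPQRICRALWLTPWPSLILKLLSWIILSNLFLHLRATHHMTE"
    "LPLRFLYIALSEMTFREQTSHQIIQQMSLSNKLEQNQLYGEVINKETDNPVISSGLTLLF"
    "AQKPQSPGWKNMSSTKRVCTILADSCRAQAHAADRGERGHFGVQILHHFIEVFNVMAVRS"
    "NPF"
)


def one_based_slice(seq: str, start: int, end: int) -> str:
    """Return residues start..end inclusive, using 1-based positions."""
    return seq[start - 1:end]


# Lookahead pattern so overlapping Leu-Leu pairs would also be reported.
dileucine_positions = [m.start() + 1 for m in re.finditer(r"(?=LL)", SEQUENCE)]

tm_segment = one_based_slice(SEQUENCE, 34, 50)
print(tm_segment)           # SLILKLLSWIILSNLFL
print(dileucine_positions)  # [39, 118]
```

The segment from 34 to 50 is rich in hydrophobic residues, and the only two Leu-Leu pairs in the sequence start at positions 39 and 118, as reported.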

Interacting Proteins

Two interacting proteins were found, lsd2_drome and npfr_drome. Lsd2_drome is a lipid storage droplet surface binding protein and npfr_drome is a neuropeptide F receptor.

Regulation

Gene regulation

Promoter

C15orf54 has one predicted promoter sequence. GXP_6084 is located from 39249718 to 39250757 on the plus strand of chromosome 15 and is composed of 1040 bp. [10]

Transcription factor binding sites

The following table displays the transcription factors most likely to bind to the GXP_6084 promoter for C15orf54. [10]

Matrix Family | Detailed Family Information
TALE | TG-interacting factor belonging to TALE class of homeodomain factors
CART | Binding site for S8 type homeodomains
HAND | T-cell acute lymphocytic leukemia 1, SCL
ZFHX | AREB6 (Atp1a1 regulatory element binding factor 6)
TZAP | Zinc finger and BTB domain containing 48
SAL4 | Spalt like 4, DRRS, HSAL4, ZNF797
TEAF | TEA domain family member 4, TEF-3
RUSH | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 3
EGRF | Wilms Tumor Suppressor

Expression pattern

The gene has shown high expression in the prostate, thymus, appendix, bone marrow, and lungs. NCBI AceView shows that the gene is moderately expressed. [3]

Diagram depicting the expression of C15orf54 in tissues throughout the body.

Transcription regulation

miRNA targeting

TargetScan showed that miRNA hsa-miR-375 was highly conserved across various organisms. This miRNA is specifically expressed in the pancreatic islets, brain, and spinal cord. This miRNA has also been shown to be associated with different cancers, including breast and gastric cancer. [11]

Homology

Rate of evolution

Relative mutation rate of C15orf54 (blue) compared to fibrinogen alpha (grey) and cytochrome C (orange)

Paralogs

No paralogs of C15orf54 have been detected in the human genome.

Orthologs

Orthologs were found primarily in primates, although many other mammals also showed sizeable sequence similarity to the human C15orf54 sequence. The table below lists selected orthologs of the C15orf54 gene, both closely and distantly related, sorted by date of divergence. [12] [13] C15orf54 was shown to evolve relatively quickly and evenly over time, at a faster rate than both cytochrome C and fibrinogen alpha.

Genus and species | Common name | Taxonomic group | Date of divergence (est., MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%) | n | m
Homo sapiens | Humans | Primates | 0 | NP_001027544.1 | 803 | 100 | 100 | 0 | 0.0
Macaca mulatta | Rhesus macaque | Primates | 29.44 | AFE75666.1 (extended) | 767 | 91.9 | 93.9 | 8.1 | 8.4
Fukomys damarensis | Damara mole rat | Rodentia | 90 | XP_010621546.2 | 753 | 73.6 | 79.3 | 26.4 | 30.7
Camelus ferus | Wild Bactrian camel | Artiodactyla | 94 | XP_006175095.2 | 802 | 81.8 | 88.6 | 18.2 | 20.1
Odobenus rosmarus divergens | Pacific walrus | Carnivora | 96 | XP_012418040.1 | 806 | 84.5 | 90.3 | 15.5 | 16.8
Mirounga leonina | Southern elephant seal | Carnivora | 96 | XP_034842573.1 | 806 | 83.7 | 89.8 | 16.3 | 17.8
Manis javanica | Malayan pangolin | Pholidota | 96 | XP_017502667.1 | 584 | 49.4 | 53.0 | 50.6 | 70.5
Echinops telfairi | Lesser hedgehog tenrec | Afrosoricida | 102 | XP_030742207.1 | 419 | 31.9 | 38.9 | 68.1 | 114.3
Denticeps clupeoides | Denticle herring | Actinopterygii/Clupeiformes | 435 | XP_028809248.1 | 3037 | 12.4 | 16.7 | 87.6 | 208.7
Beroe forskalii | Cigar comb jellies | Ctenophora/Beroida | 540 | AHA51259.1 | 212 | 12.0 | 18.2 | 88.0 | 212.0
Araneus ventricosus | Orb weaving spider | Araneae | 736 | GBN07005.1 | 543 | 30.2 | 38.9 | 69.8 | 119.7
Capitella teleta | Segmented annelid worm | Annelida | 797 | ELT92884.1 | 537 | 31.3 | 43.3 | 68.7 | 116.2
Thelazia callipaeda | Parasitic nematode | Nematoda/Rhabditida | 797 | VDN04867.1 | 418 | 25.2 | 32.9 | 74.8 | 137.8
Drosophila melanogaster | Fruit flies | Diptera | 797 | NP_650197.1 | 501 | 24.4 | 34.8 | 75.6 | 141.1
Octopus sinensis | Common octopus | Octopoda/Mollusca | 797 | XP_029652221.1 | 5045 | 6.8 | 9.2 | 93.2 | 268.8
Nematostella vectensis | Starlet sea anemone | Cnidaria/Anthozoa | 824 | EDO31838.1 | 482 | 30.7 | 42.6 | 69.3 | 118.1
Macrostomum lignano | Flatworm | Platyhelminthes/Macrostomida | 824 | PAA81016.1 | 477 | 27.0 | 35.6 | 73.0 | 130.9
Salpingoeca rosetta | Choanoflagellates | Choanoflagellates | 1023 | XP_004989424.1 | 480 | 20.5 | 28.3 | 79.5 | 158.5
Rhizophagus clarus | Arbuscular mycorrhizal fungi | Fungi/Glomerales | 1105 | GBB86324.1 | 717 | 30.1 | 41.8 | 69.9 | 120.1
Salmonella enterica | Gram-negative bacteria | Salmonella/Enterobacterales | 4290 | EDQ2188565.1 | 310 | 12.8 | 18.2 | 87.2 | 205.6
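The article does not define the final two columns of the table above, n and m. Their values are consistent with a common molecular-clock convention, which is assumed here: n is the percent sequence difference (100 minus identity), and m is the distance corrected for multiple substitutions, m = -100 × ln(1 - n/100). The sketch below applies that assumed convention; the function name `divergence_columns` is hypothetical.

```python
import math


def divergence_columns(identity_percent: float) -> tuple[float, float]:
    """Return (n, m) for a given percent identity, rounded to 1 decimal.

    Assumed convention (not stated in the article):
      n = 100 - identity
      m = -100 * ln(1 - n / 100)
    """
    n = 100.0 - identity_percent
    m = -100.0 * math.log(1.0 - n / 100.0)
    return round(n, 1), round(m, 1)


# Identity values taken from three rows of the ortholog table.
for species, identity in [
    ("Macaca mulatta", 91.9),
    ("Manis javanica", 49.4),
    ("Salmonella enterica", 12.8),
]:
    print(species, divergence_columns(identity))
```

For these rows the assumed formula yields values that agree with the tabulated n and m to one decimal place (for example, 8.1 and 8.4 for the rhesus macaque).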

Clinical significance

C15orf54 was associated with hypertrophy-associated polymorphisms in heart failure risk [14] and with atherosclerosis risk. [15] C15orf54 was also positively correlated with higher survival rates in patients with gastric cancer. [16] It was also identified as a locus of interest in determining the glomerular filtration rate in a pool of individuals with Mongolian ancestry. [17]


References

  1. GRCh38: Ensembl release 89: ENSG00000175746. Ensembl, May 2017.
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "AceView: Gene:C15orf54, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". AceView. www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  4. "C15orf54 chromosome 15 putative open reading frame 54 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  5. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-12-19.
  6. "Motif Scan". myhits.sib.swiss. Retrieved 2020-12-19.
  7. "SIB Swiss Institute of Bioinformatics | Expasy". www.expasy.org. Retrieved 2020-12-19.
  8. Prof. T. Ashok Kumar. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-12-19.
  9. "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-12-19.
  10. "Genomatix" (in German). Archived from the original on 2001-02-24. Retrieved 2020-12-19.
  11. "TargetScanHuman 7.2". www.targetscan.org. Retrieved 2020-12-19.
  12. "BLAST: Basic Local Alignment Search Tool". blast.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  13. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2020-12-19.
  14. Chamaria, Surbhi; Johnson, Kipp W.; Vengrenyuk, Yuliya; Baber, Usman; Shameer, Khader; Divaraniya, Aparna A.; Glicksberg, Benjamin S.; Li, Li; Bhatheja, Samit; Moreno, Pedro; Maehara, Akiko (2017-08-01). "Intracoronary Imaging, Cholesterol Efflux, and Transcriptomics after Intensive Statin Treatment in Diabetes". Scientific Reports. 7 (1): 7001. Bibcode:2017NatSR...7.7001C. doi:10.1038/s41598-017-07029-7. ISSN 2045-2322. PMC 5539108. PMID 28765529.
  15. Yu, Bing (2013). Human metabolome and common complex diseases: A genetic and epidemiological study among African-Americans in the atherosclerosis risk in communities study (Thesis). ProQuest 1503137765.
  16. Zhou, Li-Li; Jiao, Yan; Chen, Hong-Mei; Kang, Li-Hua; Yang, Qi; Li, Jing; Guan, Meng; Zhu, Ge; Liu, Fei-Qi; Wang, Shuang; Bai, Xue (2019-10-21). "Differentially expressed long noncoding RNAs and regulatory mechanism of LINC02407 in human gastric adenocarcinoma". World Journal of Gastroenterology. 25 (39): 5973–5990. doi:10.3748/wjg.v25.i39.5973. ISSN 1007-9327. PMC 6815795. PMID 31660034.
  17. Park, Hansoo; Kim, Hyun-Jin; Lee, Seungbok; Yoo, Yun Joo; Ju, Young Seok; Lee, Jung Eun; Cho, Sung-Il; Sung, Joohon; Kim, Jong-Il; Seo, Jeong-Sun (2013-02-01). "A family-based association study after genome-wide linkage analysis identified two genetic loci for renal function in a Mongolian population". Kidney International. 83 (2): 285–292. doi:10.1038/ki.2012.389. hdl:10371/91067. ISSN 0085-2538. PMID 23254893.