C1orf94

Identifiers
Aliases C1orf94, chromosome 1 open reading frame 94
External IDs MGI: 3616080 HomoloGene: 57187 GeneCards: C1orf94
Gene location (Human)
Chr. Chromosome 1 (human) [1]
Band 1p35.1
Start 34,166,883 bp [1]
End 34,219,131 bp [1]
Orthologs
Species Human Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA)

NM_032884
NM_001134734

NM_001134733

RefSeq (protein)

NP_001128206
NP_116273

n/a

Location (UCSC) Chr 1: 34.17 – 34.22 Mb Chr 4: 127.93 – 127.97 Mb
PubMed search [3] [4]

Chromosome 1 open reading frame 94 (C1orf94) is a protein that in humans is encoded by the C1orf94 gene. [5] The function of this protein is still poorly understood.


Gene

The C1orf94 gene is also known by the identifiers Q6P1W5, B3KVT1, D3DPR3, E9PJ76, Q96IC8 and MGC15882. FLJ20508 is a further alias. [5]

Locus

Location of C1orf94 on Chromosome 1

C1orf94 is located on the short arm of chromosome 1 at 1p34.3 (chr1:34,166,883-34,219,131), near the HSPD1P14 gene. It is encoded on the sense strand. [6]

The gene has 7 exons, of which only 6 are coding. [7]

Exon                            Start        End          Size (bp)
ENSE00001207243 (non-coding)    34,166,883   34,167,171   289
ENSE00003530680                 34,197,225   34,197,913   689
ENSE00002095077                 34,200,772   34,201,032   261
ENSE00002136629                 34,202,084   34,202,259   176
ENSE00002136447                 34,208,157   34,208,234   78
ENSE00002125161                 34,212,210   34,212,406   197
ENSE00001460399                 34,218,686   34,219,131   446
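
The sizes in the exon table follow from the coordinates, because genomic start and end positions are both inclusive: size = end - start + 1. The following is a minimal Python sketch of that arithmetic, using only the coordinates listed above. The names `EXONS`, `exon_size` and `summarize` are illustrative and not part of any database API.

```python
# Exon coordinates (chromosome 1) copied from the exon table above.
EXONS = [
    ("ENSE00001207243", 34_166_883, 34_167_171),
    ("ENSE00003530680", 34_197_225, 34_197_913),
    ("ENSE00002095077", 34_200_772, 34_201_032),
    ("ENSE00002136629", 34_202_084, 34_202_259),
    ("ENSE00002136447", 34_208_157, 34_208_234),
    ("ENSE00002125161", 34_212_210, 34_212_406),
    ("ENSE00001460399", 34_218_686, 34_219_131),
]


def exon_size(start: int, end: int) -> int:
    """Length of a closed, 1-based interval [start, end]."""
    return end - start + 1


def summarize(exons):
    """Return per-exon sizes, the genomic span, and the summed exon length."""
    sizes = {exon_id: exon_size(start, end) for exon_id, start, end in exons}
    gene_start = min(start for _, start, _ in exons)
    gene_end = max(end for _, _, end in exons)
    return sizes, exon_size(gene_start, gene_end), sum(sizes.values())


sizes, gene_span, exonic_total = summarize(EXONS)
for exon_id, size in sizes.items():
    print(exon_id, size)
print("gene span (bp):", gene_span)                # 52249
print("total exonic length (bp):", exonic_total)   # 2136
```

The seven exon sizes sum to 2136 bp, which equals the length listed for transcript C1orf94-201 in the mRNA section.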

mRNA

This protein has two isoforms, a and b; isoform a is the longer of the two (598 aa). [8]

Name          Transcript ID        Base pairs   Protein type     Protein length
C1orf94-202   ENST00000488417.2    3050         Protein coding   598 aa
C1orf94-201   ENST00000373374.7    2136         Protein coding   408 aa

Transcription

C1orf94 conceptual translation

Two promoters are predicted for C1orf94, only one of which is associated with the transcript used for the analysis. The following transcription factors and factor families are predicted to bind sites in this promoter: [9]

ZF02 (C2H2 zinc finger transcription factors 2)

Cart1 Sequence-specific DNA-binding transcription factor

HTLV-I U5 repressive element-binding protein 1

NKX homeodomain factors

AARE binding factors PREB core-binding element

Protein

DUF4688, a domain of unknown function, is a large region found within the C1orf94 protein sequence in both isoforms a and b. [10] This sequence is conserved in eukaryotes. [11]

C1orf94 is a protein tissue co-expression partner of RBBP8NL. [12] Its isoelectric point is 8.56 and its molecular weight is around 65,353 Da (about 65.4 kDa). Proline is the most abundant amino acid in the protein sequence (11.7%), followed closely by leucine (10.4%). [13]
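
Composition percentages of this kind are simple residue counts divided by the sequence length. The sketch below illustrates the calculation in Python on a short, made-up sequence (`toy` is hypothetical and is not the C1orf94 sequence). It also includes a rough mass estimate based on the common rule of thumb of about 110 Da per residue, which is an assumption introduced here and not the method used by SAPS.

```python
from collections import Counter

# Rule-of-thumb average residue mass; an assumption for rough estimates only.
AVERAGE_RESIDUE_MASS_DA = 110.0


def composition(sequence: str) -> dict[str, float]:
    """Percentage of each residue (one-letter codes), most abundant first."""
    seq = sequence.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.most_common()}


def rough_mass_da(length: int) -> float:
    """Crude molecular-weight estimate from the number of residues."""
    return length * AVERAGE_RESIDUE_MASS_DA


# Hypothetical toy sequence, NOT the real C1orf94 sequence.
toy = "MPPLSPEPLLTPSA"
comp = composition(toy)
for aa, pct in comp.items():
    print(f"{aa}: {pct:.1f}%")

# A 598-residue protein comes out at roughly 65.8 kDa by this rule of thumb.
print(rough_mass_da(598) / 1000, "kDa")
```

For a 598-residue protein the rule of thumb gives roughly 65.8 kDa, the same order of magnitude as the molecular weight reported above.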

Seven PEST motifs, sequence signatures rich in proline (P), glutamic acid (E), serine (S), and threonine (T), were identified between positions 1 and 598. Only one of them was predicted to be a potential PEST motif, with 21 amino acids between positions 133 and 155. PEST sequences are associated with proteins that have a short intracellular half-life. [14]
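
The idea of a region "rich in P, E, S and T" can be illustrated with a sliding-window enrichment scan. This Python sketch is deliberately crude: it only measures the fraction of P/E/S/T residues per window and is not the scoring scheme of any published PEST prediction tool. The sequence `toy`, the window size and the threshold are all hypothetical values chosen for illustration.

```python
PEST_RESIDUES = frozenset("PEST")


def pest_fraction(segment: str) -> float:
    """Fraction of residues in the segment that are P, E, S or T."""
    return sum(res in PEST_RESIDUES for res in segment.upper()) / len(segment)


def enriched_windows(sequence: str, window: int = 21, threshold: float = 0.5):
    """Yield (start, end, fraction) for 1-based windows at or above threshold."""
    seq = sequence.upper()
    for i in range(len(seq) - window + 1):
        frac = pest_fraction(seq[i:i + window])
        if frac >= threshold:
            yield i + 1, i + window, frac


# Hypothetical toy sequence with a P/E/S/T-only stretch in the middle.
toy = "MKAAGLV" + "PESTPSEPTSPE" + "KLLAAGV"
hits = list(enriched_windows(toy, window=12, threshold=1.0))
print(hits)  # [(8, 19, 1.0)]
```

Lowering the threshold returns overlapping windows around the same stretch, so a real analysis would merge neighbouring hits before reporting motifs.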

Post-translational modifications

3D structure of C1orf94

C1orf94 is predicted to undergo palmitoylation, [16] phosphorylation [17] and glycation, [18] mainly near its N-terminus. A mitochondrial processing peptidase cleavage site is also predicted at the first methionine.

Structure

According to CFSSP, [19] the secondary structure of C1orf94 contains alpha helices, extended strands, beta turns, and random coils.

The tertiary structures predicted by both Phyre2 [20] and SWISS-MODEL [15] show C1orf94 as a monomer.

According to I-TASSER, [21] the closest structural analogs of C1orf94 are 3IXZ (pig gastric H+/K+-ATPase complexed with aluminum fluoride) and 3B8E (crystal structure of the sodium-potassium pump).

Protein-protein interactions

Mentha [22] reports a strong physical interaction with ATXN1, a chromatin-binding factor that represses Notch signaling in the absence of the Notch intracellular domain.

According to PSICQUIC, [23] C1orf94 and MMADHC interact physically, as demonstrated by affinity chromatography. MMADHC encodes a mitochondrial protein that is involved in the early steps of vitamin B12 metabolism. [24]

According to STRING, [25] with C1orf94 as the query protein, RFX2 is a possible functional partner in the first shell of interactors. RFX2 is a transcription factor that acts as a key regulator of spermatogenesis.

Expression

According to AceView, this gene is well expressed, at 0.5 times the level of the average gene in this release. [26]

According to PSORT II, [27] C1orf94 is predicted to be nuclear with a probability of 69.6%.

NCBI data show that C1orf94 is primarily expressed in testis tissue. [28]

According to the Human Protein Atlas, [29] C1orf94 is expressed at low levels in brain tissue.

According to GEO Profiles, [30] increased C1orf94 expression is highly correlated with morbid obesity. C1orf94 expression also increased after related coactivator depletion.

Function

The function of C1orf94 is not yet understood, and no experiments have yet established it. However, HPA RNA-seq data show higher C1orf94 expression in normal tissues than in tissues during fetal development. [28]

Association with diseases

According to GWAS data, [31] C1orf94 was identified as an OncoORF (oncogenic open reading frame). According to the Colorectal Cancer Atlas, [32] C1orf94 takes part in a protein-protein interaction network of 50 nodes associated with colorectal cancer. This network includes an interaction with the A-kinase anchor protein AKAP9, which promotes colorectal cancer development by regulating Cdc42-interacting protein 4. [33]

Sequence homology

C1orf94 list of orthologs

C1orf94 evolved faster than cytochrome c but more slowly than fibrinopeptides.

C1orf94 has no paralogs. Orthologs were identified using NCBI BLASTp. [34] Mammals showed the most conservation, and the most distant orthologs were found in fish.

After running SAPS on a group of orthologs (gorilla, rat, dog, and bat), the protein's composition shows only minor variations compared to the human sequence: proline is still the most abundant amino acid, followed by leucine, and tryptophan remains the least abundant. [13]


References

  1. GRCh38: Ensembl release 89: ENSG00000142698 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000028813 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "C1orf94 - Uncharacterized protein C1orf94 - Homo sapiens (Human) - C1orf94 gene & protein". www.uniprot.org. Retrieved 2020-05-01.
  6. "C1orf94 Gene - GeneCards | CA094 Protein | CA094 Antibody". www.genecards.org. Retrieved 2020-05-01.
  7. "GeneLoc Integrated Map for Chromosome 1: Exon structure for C1orf94". genecards.weizmann.ac.il. Retrieved 2020-05-01.
  8. "Transcript: C1orf94-201 (ENST00000373374.7) - Summary - Homo sapiens - Ensembl genome browser 100". uswest.ensembl.org. Retrieved 2020-05-01.
  9. "Genomatix - NGS Data Analysis & Personalized Medicine". www.genomatix.de. Retrieved 2020-05-01.
  10. "uncharacterized protein C1orf94 isoform b [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  11. "InterPro". www.ebi.ac.uk. Retrieved 2020-05-01.
  12. "RBBP8NL Gene - GeneCards | RB8NL Protein | RB8NL Antibody". www.genecards.org. Retrieved 2020-05-01.
  13. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-05-01.
  14. "PEST sequence", Wikipedia, 2020-04-15, retrieved 2020-05-01
  15. "SWISS-MODEL". swissmodel.expasy.org. Retrieved 2020-05-01.
  16. "CSS-Palm - Palmitoylation Site Prediction". csspalm.biocuckoo.org. Retrieved 2020-05-01.
  17. "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2020-05-01.
  18. "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.cn. Retrieved 2020-05-01.
  19. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2020-05-01.
  20. "PHYRE2 Protein Fold Recognition Server". www.sbg.bio.ic.ac.uk. Retrieved 2020-05-01.
  21. "I-TASSER server for protein structure and function prediction". zhanglab.ccmb.med.umich.edu. Retrieved 2020-05-01.
  22. "mentha: the interactome browser". mentha.uniroma2.it. Retrieved 2020-05-01.
  23. "PSICQUIC View". www.ebi.ac.uk. Retrieved 2020-05-01.
  24. "MMADHC Gene - GeneCards | MMAD Protein | MMAD Antibody". www.genecards.org. Retrieved 2020-05-01.
  25. "STRING: functional protein association networks". string-db.org. Retrieved 2020-05-01.
  26. "AceView: Gene:C1orf94, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTs". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  27. "PSORT II Prediction". psort.hgc.jp. Retrieved 2020-05-01.
  28. "C1orf94 chromosome 1 open reading frame 94 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  29. "The Human Protein Atlas". www.proteinatlas.org. Retrieved 2020-05-01.
  30. "Home - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-05-01.
  31. Delgado AP, Brandao P, Chapado MJ, Hamid S, Narayanan R (2014-07-01). "Open reading frames associated with cancer in the dark matter of the human genome". Cancer Genomics & Proteomics. 11 (4): 201–13. PMID 25048349.
  32. "Colorectal Cancer Atlas | C1orf94 Gene summary::Mutations:: Proteomics :: Domains :: Protein Interactions :: PTMs :: Cell lines :: Colon Atlas :: Colorectal Cancer Database::Bowel cancer::Mutations::Proteomics::Genomics::Cancer Atlas". colonatlas.org. Retrieved 2020-05-01.
  33. Hu ZY, Liu YP, Xie LY, Wang XY, Yang F, Chen SY, Li ZG (June 2016). "AKAP-9 promotes colorectal cancer development by regulating Cdc42 interacting protein 4". Biochimica et Biophysica Acta. 1862 (6): 1172–81. doi:10.1016/j.bbadis.2016.03.012. PMC 4846471. PMID 27039663.
  34. "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2020-05-01.