C9orf135

CFAP95
Identifiers
Aliases: CFAP95, chromosome 9 open reading frame 135, C9orf135, cilia and flagella associated protein 95
External IDs: MGI: 1914733; HomoloGene: 49850; GeneCards: CFAP95
Orthologs
Species: Human, Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA)
Human: NM_001010940, NM_001308084, NM_001308085, NM_001308086
Mouse: NM_026188
RefSeq (protein)
Human: NP_001010940, NP_001295013, NP_001295014, NP_001295015
Mouse: NP_080464
Location (UCSC)
Human: Chr 9: 69.82 – 69.91 Mb
Mouse: Chr 19: 23.54 – 23.63 Mb
PubMed search [3] [4]

C9orf135 is a gene that encodes a 229-amino-acid protein. It is located on chromosome 9 of the Homo sapiens genome at 9q12.21. [5] The protein has a transmembrane domain spanning amino acids 124-140 and a glycosylation site at amino acid 75. C9orf135 is annotated in the GRCh37 assembly of chromosome 9 and belongs to the domain of unknown function superfamily DUF4572. [6] c9orf135 is also known as LOC138255, a locus identifier for the gene on chromosome 9. [7]

c9orf135 location on chromosome 9 and the neighboring genes.

There is some evidence associating the c9orf135 gene with premature ovarian failure. [8] In affected women, an autosomal recessive microduplication occurs that may be linked to the condition. A single nucleotide polymorphism (SNP) in the c9orf135 gene has been linked to Parkinson's disease; a statistically significant association was seen on a Manhattan plot. [9] Further research is required to establish whether c9orf135 is related to Parkinson's disease. [9]

mRNA

Splice variants of c9orf135 based on NCBI AceView.

The mRNA of c9orf135 is 906 nucleotides in length. [10] The 5' and 3' untranslated regions (UTRs) contain hairpin loops. [11] The 5' UTR comprises 18 nucleotides and the 3' UTR comprises 123 nucleotides. The mRNA encodes a protein with a secondary structure composed of both beta-sheets and alpha-helices. [12]
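As a quick arithmetic check on the lengths quoted above, the sketch below subtracts the two UTRs from the 906-nucleotide mRNA and converts the remainder into codons. The function name `coding_region` is illustrative only, and the calculation assumes that everything between the UTRs is coding sequence ending in a single stop codon.

```python
def coding_region(mrna_len, utr5_len, utr3_len):
    """Return (coding nucleotides, codons) left after removing both UTRs.

    Assumes the whole region between the UTRs is coding sequence.
    """
    cds_len = mrna_len - utr5_len - utr3_len
    if cds_len < 0 or cds_len % 3 != 0:
        raise ValueError("UTR lengths do not leave a whole number of codons")
    return cds_len, cds_len // 3


# Lengths quoted in the text: 906 nt mRNA, 18 nt 5' UTR, 123 nt 3' UTR.
cds_len, codons = coding_region(906, 18, 123)
print(cds_len, codons)  # 765 255
```

With these figures the coding region is 765 nucleotides, or 255 codons including the stop codon. A 229-residue protein needs only 230 codons, so the quoted mRNA length and the protein length may describe different transcript variants.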

Protein

Properties of c9orf135

c9orf135 is likely a nuclear protein because its properties match those of nuclear proteins rather than those of proteins in the secretory pathway. [13] [14] Furthermore, c9orf135 has a nuclear localization signal (PEKVKKL) from amino acid 67 to 73. [15] C9orf135 is soluble, with an average hydrophobicity of -0.772. The negative hydrophobicity value is due to its slightly acidic properties. [16]
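An average hydrophobicity of this kind is typically a mean of per-residue hydropathy values. The sketch below shows one common way to compute it, assuming the Kyte-Doolittle scale; the text does not say which scale the cited tool used, so the scale and the function name `average_hydropathy` are assumptions. The only sequence taken from the article is the seven-residue nuclear localization signal, since the full protein sequence is not reproduced here.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(sequence):
    """Mean Kyte-Doolittle value over all residues of a protein sequence."""
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


# The nuclear localization signal quoted in the text (residues 67 to 73).
nls = "PEKVKKL"
print(round(average_hydropathy(nls), 3))  # -1.257
```

A negative mean indicates that hydrophilic residues outweigh hydrophobic ones, which is the sense in which a whole-protein value of -0.772 points to a soluble protein.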

Post-translational modifications

Serine phosphorylation sites occur at amino acid positions 7, 50, 86, 98, and 194. Threonine phosphorylation sites occur at 34, 129, 155, and 201, and tyrosine phosphorylation sites at 78, 160, 177, and 209. An N-terminal acetylation site is present at amino acid 3, and a signal cleavage site is present between amino acids 11 and 12. [17]

Protein Interaction

A yeast two-hybrid assay found that c9orf135 interacts with PB2 (polymerase basic protein 2), a viral protein of the influenza A virus. PB2 is primarily involved in cap snatching, in which it binds the cap of host pre-mRNA and ultimately cleaves off 10-13 nucleotides. PB2 is also important for starting the replication of viral genomes and is known to inhibit type I interferon by inhibiting the mitochondrial antiviral signalling protein MAVS. [18]

Mutations

Eleven common DNA variants of the human c9orf135 gene have been identified, and their mutations are compiled in the following table. [19] Mutations with a frequency of 0.01 or higher are included; synonymous mutations are excluded.

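| Location | Amino Acid Position | Mutation | Frequency |
|---|---|---|---|
| 5' UTR | 57 | N/A | 0.386 |
| 5' UTR | 93 | N/A | 0.047 |
| 5' UTR | 138 | N/A | 0.018 |
| Exon | 169 | Missense K30T | 0.018 |
| Exon | 237 | Missense R53K | 0.047 |
| Exon | 456 | Missense E126K | 0.01 |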
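The inclusion rule for the table (frequency of 0.01 or higher) and the missense labels can be expressed as a short sketch. The rows are copied from the table, reading each exon row as the location "Exon" followed by a position; the names `parse_missense` and `common_variants` are hypothetical helpers, not part of any cited tool.

```python
import re

# Rows from the table above: (location, position, mutation, frequency).
VARIANTS = [
    ("5' UTR", 57, None, 0.386),
    ("5' UTR", 93, None, 0.047),
    ("5' UTR", 138, None, 0.018),
    ("Exon", 169, "K30T", 0.018),
    ("Exon", 237, "R53K", 0.047),
    ("Exon", 456, "E126K", 0.01),
]


def parse_missense(mutation):
    """Split a label such as 'K30T' into (original residue, position, new residue)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"not a missense label: {mutation!r}")
    original, position, replacement = match.groups()
    return original, int(position), replacement


def common_variants(variants, min_frequency=0.01):
    """Keep variants at or above the frequency cut-off stated in the text."""
    return [v for v in variants if v[3] >= min_frequency]


for location, position, mutation, frequency in common_variants(VARIANTS):
    change = parse_missense(mutation) if mutation else None
    print(location, position, change, frequency)
```

Read this way, K30T denotes lysine at residue 30 replaced by threonine. Raising `min_frequency` narrows the list to the more frequent variants.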

Gene Expression

c9orf135 is expressed at high levels in connective tissue and testicular tissue. [20] It is likely expressed at low levels throughout human cells. c9orf135 is also found at significantly higher levels in the adult human umbilical cord than in the foetal human umbilical cord. [21] Furthermore, in women with ovarian adenocarcinoma, expression of c9orf135 is much higher in the epithelial cells of the ovaries. [22] Women with polycystic ovarian syndrome have lower expression of c9orf135 than women without the condition. [23]

Amino Acid Quantity

The amino acid composition of c9orf135 from Mus musculus (house mouse) and Pteropus alecto (black flying fox) is compared here. In the mouse, no amino acid in c9orf135 differed significantly in abundance from the mouse protein average. The black flying fox ortholog, however, was valine poor and tryptophan rich; of the human results, it shared only the tryptophan surplus. The house mouse and black flying fox were chosen because their c9orf135 orthologs share 64% and 79% sequence identity with the human protein, respectively. The analysis suggests that alanine and tyrosine may mark points of interest, because the abundance of each differs from the human protein average. [16]

| Amino Acid of Interest | Compositional Percentage | Compared to normal protein amounts in H. sapiens |
|---|---|---|
| Alanine | 3.1% | Less than average |
| Tyrosine | 5.2% | More than average |
| Tryptophan | 3.1% | More than average |
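A compositional percentage is the count of one residue divided by the protein length. The sketch below shows that calculation and the reverse step, turning a reported percentage into an approximate residue count. It assumes the percentages refer to the 229-residue human protein. The function names are illustrative, and the toy sequence is hypothetical, not the real protein.

```python
from collections import Counter


def composition_percent(sequence):
    """Percentage of each amino acid (one-letter code) in a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return {residue: 100 * n / total for residue, n in counts.items()}


def implied_count(percent, protein_length):
    """Residue count that a reported percentage corresponds to, rounded."""
    return round(percent / 100 * protein_length)


# Percentages from the table, applied to the 229-residue human protein.
for name, percent in [("Alanine", 3.1), ("Tyrosine", 5.2), ("Tryptophan", 3.1)]:
    print(name, implied_count(percent, 229))

# Toy sequence (hypothetical, not the real protein) to show the calculation.
print(composition_percent("AWWY"))
```

Under that assumption, 3.1% corresponds to about 7 residues and 5.2% to about 12 residues.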

Homology

c9orf135 is conserved among eukaryotes, in groups ranging from mammals and reptiles to Annelida.

Orthologs

Orthologs of c9orf135 were identified with BLAST, and 20 were selected. All of the orthologs came from multicellular organisms and were limited to aquatic animals, reptiles, amphibians, and warm-blooded animals; protists, bacteria, archaea, and fungi had no orthologs. The BLAST search found no paralogs of c9orf135. The selected orthologs are listed in the table below. TimeTree was used to estimate the evolutionary divergence times, shown in millions of years ago (MYA). [24]

| Genus/Species | Common name | Divergence from Humans (MYA) | Accession number | Amino Acid Length | Sequence identity | Sequence similarity |
|---|---|---|---|---|---|---|
| Homo sapiens | Human | -- | Q5VTT2 | 229 | -- | -- |
| Pongo abelii | Sumatran Orangutan | 15.8 | XP_002819904 | 206 | 86% | 87% |
| Rhinopithecus roxellana | Golden Snub-Nosed Monkey | 29.1 | XP_010361250 | 229 | 93% | 95% |
| Mus musculus | House Mouse | 90.9 | EDL41604 | 228 | 64% | 73% |
| Pteropus alecto | Black Flying Fox | 97.5 | XP_785964 | 230 | 79% | 86% |
| Equus przewalskii | Przewalski's horse | 97.5 | XP_008504806 | 183 | 77% | 86% |
| Panthera tigris altaica | Siberian Tiger | 97.5 | XP_007077537 | 187 | 73% | 83% |
| Ovis aries | Sheep | 97.5 | XP_014948670 | 207 | 69% | 77% |
| Elephantulus edwardii | Cape Elephant Shrew | 105 | XP_006894485 | 254 | 72% | 82% |
| Pelodiscus sinensis | Chinese Softshell Turtle | 320.5 | XP_006137902 | 217 | 55% | 68% |
| Gekko japonicus | Gekko | 320.5 | XP_015275999 | 221 | 52% | 64% |
| Alligator mississippiensis | American Alligator | 320.5 | XP_014464144 | 212 | 51% | 64% |
| Ophiophagus hannah | King Cobra | 320.5 | ETE61720 | 215 | 43% | 59% |
| Salmo salar | Atlantic Salmon | 429.6 | XP_013998840 | 99 | 34% | 55% |
| Esox lucius | Northern Pike | 429.6 | XP_010901691 | 154 | 30% | 47% |
| Branchiostoma floridae | Lancelet | 733 | XP_002591786 | 221 | 45% | 59% |
| Strongylocentrotus purpuratus | Sea Urchin | 747.8 | XP_785964 | 241 | 47% | 62% |
| Saccoglossus kowalevskii | Acorn Worm | 747.8 | XP_002733410 | 153 | 38% | 58% |
| Lingula anatina | Ocean Clam | 847 | XP_013398605 | 220 | 43% | 59% |
| Crassostrea gigas | Pacific Oyster | 847 | XP_011426944 | 215 | 40% | 57% |
| Helobdella robusta | Leech | 847 | XP_009019861 | 256 | 29% | 44% |

Divergence of c9orf135

The chart compares the divergence of c9orf135 with that of fast-diverging cytochrome C and slow-diverging fibrinogen. Overall, c9orf135 has diverged significantly more quickly than fibrinogen and slightly more slowly than cytochrome C.
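One way to put sequence divergence on a scale that can be plotted against divergence time is a Poisson-corrected distance computed from the fraction of identical residues. The sketch below applies it to a few identity values from the ortholog table. The article does not state how the chart's divergence values were calculated, so the choice of correction and the name `poisson_distance` are assumptions made for illustration.

```python
import math

# (species, divergence time in MYA, sequence identity) from the ortholog table.
ORTHOLOGS = [
    ("Pongo abelii", 15.8, 0.86),
    ("Mus musculus", 90.9, 0.64),
    ("Pelodiscus sinensis", 320.5, 0.55),
    ("Salmo salar", 429.6, 0.34),
    ("Helobdella robusta", 847.0, 0.29),
]


def poisson_distance(identity):
    """Poisson-corrected distance, d = -ln(identity), in substitutions per site."""
    if not 0 < identity <= 1:
        raise ValueError("identity must be in (0, 1]")
    return -math.log(identity)


for species, mya, identity in ORTHOLOGS:
    print(f"{species:22s} {mya:6.1f} MYA  d = {poisson_distance(identity):.3f}")
```

Plotting `d` against divergence time for c9orf135 and for the two reference proteins would give a comparison of the kind described above, with a steeper slope indicating faster divergence.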



References

  1. GRCh38: Ensembl release 89: ENSG00000204711 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000033053 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "Q5VTT2.1". NCBI, Protein.
  6. "BLAST protein sequence, c9orf135".
  7. Result Filters. (n.d.). Retrieved February 07, 2016, from https://www.ncbi.nlm.nih.gov/protein/Q5VTT2.1 Archived 2018-06-23 at the Wayback Machine
  8. McGuire MM, Bowden W, Engel NJ, Ahn HW, Kovanci E, Rajkovic A (April 2011). "Genomic analysis using high-resolution single-nucleotide polymorphism arrays reveals novel microdeletions associated with premature ovarian failure". Fertility and Sterility. 95 (5): 1595–600. doi:10.1016/j.fertnstert.2010.12.052. PMC 3062633. PMID 21256485.
  9. Chung SJ, Armasu SM, Biernacka JM, Anderson KJ, Lesnick TG, Rider DN, Cunningham JM, Eric Ahlskog J, Frigerio R, Maraganore DM (August 2012). "Genomic determinants of motor and cognitive outcomes in Parkinson's disease". Parkinsonism & Related Disorders. 18 (7): 881–6. doi:10.1016/j.parkreldis.2012.04.025. PMC 3606821. PMID 22658654.
  10. NCBI, Nucleotide, https://www.ncbi.nlm.nih.gov/nuccore/809279636?report=fasta Archived 2016-09-15 at the Wayback Machine
  11. Sfold, Srna, http://sfold.wadsworth.org/cgi-bin/showcentroid.pl Archived 2016-05-06 at the Wayback Machine
  12. I-Tasser, http://zhanglab.ccmb.med.umich.edu/I-TASSER/ Archived 2020-02-24 at the Wayback Machine
  13. YLoc http://abi.inf.uni-tuebingen.de/Services/YLoc/webloc.cgi?id=86ac5008fd25bd20a065f49a970fd19c Archived 2016-06-03 at the Wayback Machine
  14. SOSUI Classification and Secondary Structure Prediction of Membrane Proteins, http://harrier.nagahama-i-bio.ac.jp/sosui/ Archived 2016-05-05 at the Wayback Machine
  15. PSORT II, PSORT II Prediction, http://psort.hgc.jp/cgi-bin/runpsort.pl%5B%5D
  16. SDSC Biology Workbench, SAPS, http://seqtool.sdsc.edu/CGI/BW.cgi#%5B%5D!
  17. ExPasy, Sibs bioinformatics analysis, http://www.expasy.org/ Archived 2018-01-30 at the Wayback Machine
  18. "1 binary interaction found for search term C9orf135". IntAct Molecular Interaction Database. EMBL-EBI. Archived from the original on 2018-08-25. Retrieved 2018-08-25.
  19. NCBI SNP Geneview https://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?geneId=138255&ctg=NT_008470.20&mrna=XM_011518232.1&prot=XP_011516534.1&orien=forward Archived 2018-06-23 at the Wayback Machine
  20. EST profile, NCBI, https://www.ncbi.nlm.nih.gov/nuccore/NM_001308086.1 Archived 2016-09-16 at the Wayback Machine
  21. Geo Profile, NCBI https://www.ncbi.nlm.nih.gov/geoprofiles/78193260 Archived 2016-09-15 at the Wayback Machine
  22. Geo Profile, Ovarian normal surface epithelia and the ovarian cancer epithelial cells, https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS3592:236519_at Archived 2018-08-19 at the Wayback Machine
  23. Geo Profile, Obese women with polycystic ovary syndrome and obese, healthy women: skeletal muscle, https://www.ncbi.nlm.nih.gov/geo/tools/profileGraph.cgi?ID=GDS4133:243610_at Archived 2018-08-19 at the Wayback Machine
  24. Hedges SB, Marin J, Suleski M, Paymer M, Kumar S (April 2015). "Tree of life reveals clock-like speciation and diversification". Molecular Biology and Evolution. 32 (4): 835–45. doi:10.1093/molbev/msv037. PMC 4379413. PMID 25739733.

Further reading

Chromosome 9