C11orf42


C11orf42 is an uncharacterized protein in Homo sapiens that is encoded by the C11orf42 gene. [1] It is also known as chromosome 11 open reading frame 42 and uncharacterized protein C11orf42, with no other aliases. [2] The gene is mostly conserved in mammals, but orthologs have also been found in reptiles, fish and worms.


Gene

Location

The gene is located at 11p15.4 and has three exons. [1] C11orf42 starts at 6205568 bp and ends at 6211319 bp. [3] It spans 5752 base pairs and is encoded on the negative strand of chromosome 11. [3]
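The span quoted above follows from the two coordinates if both endpoints are counted. A minimal sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the constant and function names are illustrative, not taken from any genome browser API):

```python
# Coordinates quoted in the text; assumed to be 1-based and inclusive.
GENE_START = 6_205_568
GENE_END = 6_211_319


def span_length(start: int, end: int) -> int:
    """Return the number of base pairs in the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_span = span_length(GENE_START, GENE_END)
print(gene_span)  # 5752
```

With half-open coordinates the result would instead be 5751, so the quoted 5752 bp is consistent with inclusive endpoints.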


Neighborhood

On chromosome 11, the genes FAM160A2 and OR52W1 are neighbors of C11orf42. FAM160A2 is encoded on the positive strand of chromosome 11, and OR52W1 is encoded on the negative strand.

Expression

C11orf42 is expressed in a total of seventy-four organs. [4] In a study of ninety-five individuals, RNA sequencing of twenty-seven different tissues was performed to determine the tissue specificity of protein-coding genes. In that study, C11orf42 was expressed in a variety of tissues, most strongly in the skin and the testes. [1] According to the EST profile of C11orf42, the transcript is abundant in the bladder, brain, and testis. It was also associated with bladder carcinoma and with the adult developmental stage. [5]

Microarray Expression

This image shows the expression of C11orf42 for different levels of brain aneurysm. The red columns represent the expression levels and the blue squares represent the percentage ranking of expression for C11orf42 among other genes expressed. The light green represents ruptured aneurysm, the medium green represents unruptured aneurysm, and the dark green represents the superficial temporal artery.

C11orf42 was observed in an RNA microarray study that examined intracranial aneurysms in order to characterize their biological heterogeneity. Samples were split into groups with similar gene expression profiles. The study found that members of the Kruppel-like family of transcription factors (KLF2, KLF12, and KLF15), which act as anti-inflammatory regulators, were down-regulated. [6] This family of transcription factors is found in C11orf42 and appears to affect the development of aneurysm walls as a mechanical strengthener. [6] In ruptured aneurysms, C11orf42 ranked at 55.5% among other genes and had an expression level of 8.16. In unruptured aneurysms, it ranked at 50.4% with an expression level of 9.09. In the superficial temporal artery, it ranked at 35% with an expression level of 7.75. These values suggest that C11orf42 may play a role in aneurysms and may be involved, through transcription, in strengthening their walls.

Promoter

According to the Genomatix program El Dorado, the 5' UTR is predicted to be 48 base pairs long and the 3' UTR 93 base pairs long. [7] Multiple sequence alignments were built for each UTR, and several positions were found to be conserved across many orthologs. In the 5' UTR, a predicted stem-loop is found at nucleotide positions 9-25; in the 3' UTR, predicted stem-loops are found at nucleotide positions 1070-1084 and 1096-1112.

Transcript

Isoform

C11orf42 has two isoforms, known as uncharacterized protein C11orf42 variant X1 and variant X2. Variant X1 is 5629 bp long and encodes 303 amino acids. [8] Variant X2 is 4705 bp long and encodes 298 amino acids. [9] No exons were found for either isoform.

Protein

Length

3D image of C11orf42 with alpha helices in pink and beta sheets in yellow.

C11orf42 encodes a protein of 333 amino acids. The protein has a molecular weight of 36.7 kilodaltons and is proline rich. [10] Its predicted secondary structure is 13% alpha helices and 27% beta sheets. [11]
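As a rough plausibility check on the quoted weight, the length can be multiplied by an average residue mass. The sketch below assumes the common rule of thumb of about 110 Da per amino acid; this is an approximation introduced here for illustration and is not the method used by the cited SAPS tool, which works from the actual sequence.

```python
# Rule-of-thumb assumption: an average amino acid residue weighs about 110 Da.
AVERAGE_RESIDUE_MASS_DA = 110.0
PROTEIN_LENGTH_AA = 333  # length quoted in the text


def approximate_mass_kda(n_residues: int,
                         avg_residue_mass_da: float = AVERAGE_RESIDUE_MASS_DA) -> float:
    """Estimate a protein's mass in kilodaltons from its length alone."""
    return n_residues * avg_residue_mass_da / 1000.0


estimate_kda = approximate_mass_kda(PROTEIN_LENGTH_AA)
print(round(estimate_kda, 2))  # 36.63
```

The estimate of roughly 36.6 kDa is close to the reported 36.7 kDa, which is what one would expect for a protein of this length.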

Paralogs

No paralogs of the C11orf42 protein were found using NCBI BLAST.

Orthologs

| Species | Common Name | NCBI Accession ID | Query Cover | E Value | Identity | Date of Divergence (MYA) |
|---|---|---|---|---|---|---|
| Homo sapiens | Human | NP_775796.2 | 100% | 0.0 | 100% | N/A |
| Pan paniscus | Bonobo | XP_003819114.1 | 100% | 0.0 | 99.10% | 6.4 |
| Gorilla gorilla gorilla | Western Lowland Gorilla | XP_004050626.2 | 100% | 0.0 | 97.30% | 8.61 |
| Macaca mulatta | Rhesus Monkey | NP_001181413.1 | 100% | 0.0 | 95.20% | 28.10 |
| Cercocebus atys | Sooty Mangabey | XP_011891914.1 | 100% | 0.0 | 94.29% | 28.10 |
| Ochotona princeps | American Pika | XP_004590032.1 | 100% | 0.0 | 91.29% | 88 |
| Ictidomys tridecemlineatus | Thirteen-Lined Ground Squirrel | XP_005341148.1 | 100% | 0.0 | 90.09% | 88 |
| Callorhinus ursinus | Northern Fur Seal | XP_025704267.1 | 100% | 0.0 | 89.79% | 94 |
| Pteropus vampyrus | Large Flying Fox | XP_011383491.1 | 100% | 0.0 | 89.19% | 94 |
| Trichechus manatus latirostris | Florida Manatee | XP_004389128.1 | 100% | 0.0 | 88.29% | 102 |
| Carlito syrichta | Philippine Tarsier | XP_008062834.1 | 100% | 0.0 | 87.09% | 66.7 |
| Hipposideros armiger | Great Roundleaf Bat | XP_019523989.1 | 100% | 0.0 | 86.49% | 94 |
| Cricetulus griseus | Chinese Hamster | RLQ71833.1 | 100% | 0.0 | 85.33% | 88 |
| Manis javanica | Sunda Pangolin | XP_017522375.1 | 100% | 0.0 | 78.38% | 94 |
| Gekko japonicus | Schlegel's Japanese Gecko | XP_015277243.1 | 63% | 1e-61 | 53.08% | 320 |
| Protobothrops mucrosquamatus | Brown Spotted Pitviper | XP_015672776.1 | 99% | 1e-67 | 44.71% | 320 |
| Terrapene mexicana triunguis | Three Toed Box Turtle | XP_026514492.1 | 99% | 3e-53 | 40.30% | 320 |
| Callorhinchus milii | Australian Ghostshark | XP_007899855.1 | 63% | 7e-24 | 35.27% | 465 |
| Oncorhynchus tshawytscha | Chinook Salmon | XP_024270155.1 | 51% | 1e-07 | 32.57% | 432 |
| Saccoglossus kowalevskii | Acorn Worm | XP_006813712.1 | 59% | 2e-04 | 24.53% | 627 |

This image shows the multiple sequence alignment of closely and distantly related orthologs of C11orf42.

Conserved Domains

Gray positions represent the start and stop codons. Red positions show important sites, including predicted post-translational modifications (glycation, amidation, N-myristoylation and O-glycosylation) and the nuclear export signal. The blue rectangles represent the domain of unknown function and the black region represents the proline-rich region of the protein.

A search of the NCBI Conserved Domain Database found that C11orf42 contains only one domain, a domain of unknown function called DUF4463, which is conserved from humans to worms. [12] DUF4463 covers amino acid positions 4-209 and 313-333. The proline-rich region (PRR) lies between them, at amino acid positions 210-312. [13]
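The positions given above can be checked for consistency against the 333-residue protein length. The following is a small sketch under the assumption that all positions are 1-based and inclusive; the region list simply restates the numbers from the text, and the helper names are illustrative.

```python
PROTEIN_LENGTH = 333  # amino acids, as stated for the C11orf42 protein

# (label, first residue, last residue), restating the positions in the text.
REGIONS = [
    ("DUF4463", 4, 209),
    ("PRR", 210, 312),
    ("DUF4463", 313, 333),
]


def region_lengths(regions):
    """Return (label, length) pairs for closed residue intervals."""
    return [(label, end - start + 1) for label, start, end in regions]


def has_overlap(regions):
    """True if any two regions share at least one residue."""
    ordered = sorted(regions, key=lambda r: r[1])
    return any(prev[2] >= cur[1] for prev, cur in zip(ordered, ordered[1:]))


def uncovered_positions(regions, length):
    """List residues in 1..length that fall in none of the regions."""
    covered = set()
    for _, start, end in regions:
        covered.update(range(start, end + 1))
    return [pos for pos in range(1, length + 1) if pos not in covered]


print(region_lengths(REGIONS))
print(has_overlap(REGIONS))
print(uncovered_positions(REGIONS, PROTEIN_LENGTH))
```

Under these assumptions the three regions do not overlap and together account for every residue except the first three.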

Post Translational Modifications

C11orf42 post-translational modifications

C11orf42 is predicted to undergo several types of post-translational modification, including glycation, O-GlcNAc modification, O-glycosylation, and phosphorylation. A leucine-rich nuclear export signal was found at amino acid position 125. [14]

Subcellular Localization

C11orf42 is predicted to be cytoplasmic in non-mammalian orthologs (Schlegel's Japanese Gecko, Acorn Worm) and nuclear in mammals, suggesting that its localization changed as the protein evolved in Mammalia. [15] No signal peptides, mitochondrial targeting sequences or chloroplast transit peptides were predicted for the protein, so it is not predicted to localize to the secretory pathway, mitochondria, or chloroplast. [14]

Interacting Proteins

IGLL1, SNX2, and SNX5 have all been reported to interact with C11orf42. The IGLL1 interaction is supported by "textmining" evidence, [16] while SNX2 and SNX5 have physical interactions with C11orf42. [17]

Clinical Significance

As seen above in the microarray and tissue expression data, the C11orf42 protein may have a role in bladder carcinoma as well as brain aneurysms. Another study, which examined response to treatment for rheumatoid arthritis, also showed an influence from C11orf42, indicating that it could affect the response to treatments such as anti-TNF therapy. [18] More research is needed to confirm whether C11orf42 has an effect in other diseases or in the treatment of disease.

References

  1. "C11orf42 chromosome 11 open reading frame 42 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2019-02-26.
  2. "Gene Cards: C11orf42". www.genecards.org. Retrieved 2019-02-26.
  3. "Human BLAT Search". genome.ucsc.edu. Retrieved 2019-02-26.
  4. "Gene: C11orf42 - ENSG00000180878". bgee.org. Retrieved 2019-02-26.
  5. "EST Profile - Hs.278221". www.ncbi.nlm.nih.gov.
  6. Nakaoka, Hirofumi; Tajima, Atsushi; Yoneyama, Taku; Hosomichi, Kazuyoshi; Kasuya, Hidetoshi; Mizutani, Tohru; Inoue, Ituro (2014). "Gene Expression Profiling Reveals Distinct Molecular Signatures Associated With the Rupture of Intracranial Aneurysm". Stroke. 45 (8): 2239–2245. doi:10.1161/strokeaha.114.005851. ISSN 0039-2499. PMID 24938844.
  7. "Genomatix: TranscriptInfo". www.genomatix.de.
  8. "PREDICTED: Homo sapiens chromosome 11 open reading frame 42 (C11orf42), transcript variant X1, mRNA". 2018-03-26.
  9. "PREDICTED: Homo sapiens chromosome 11 open reading frame 42 (C11orf42), transcript variant X2, mRNA". 2018-03-26.
  10. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk.
  11. Kelley, Lawrence A.; Mezulis, Stefans; Yates, Christopher M.; Wass, Mark N.; Sternberg, Michael J. E. (2015). "The Phyre2 web portal for protein modeling, prediction and analysis". Nature Protocols. 10 (6): 845–858. doi:10.1038/nprot.2015.053. OCLC 922582332. PMC 5298202. PMID 25950237.
  12. "NCBI Conserved Domain Search". www.ncbi.nlm.nih.gov.
  13. "PROSITE". prosite.expasy.org. Retrieved 2019-05-04.
  14. "ExPASy: SIB Bioinformatics Resource Portal - Proteomics Tools". www.expasy.org.
  15. "PSORT II Prediction". psort.hgc.jp.
  16. "C11orf42 protein (human) - STRING interaction network". string-db.org.
  17. "C11orf42 Result Summary | BioGRID". thebiogrid.org.
  18. Julià, Antonio; Erra, Alba; Palacio, Carles; Tomas, Carlos; Sans, Xavier; Barceló, Pere; Marsal, Sara (2009-10-22). "An Eight-Gene Blood Expression Profile Predicts the Response to Infliximab in Rheumatoid Arthritis". PLOS ONE. 4 (10): e7556. Bibcode:2009PLoSO...4.7556J. doi:10.1371/journal.pone.0007556. ISSN 1932-6203. PMC 2762038. PMID 19847310.