PROB1

Identifiers
Aliases: PROB1, C5orf65, proline-rich basic protein 1, proline rich basic protein 1
External IDs: MGI: 2686460; HomoloGene: 83773; GeneCards: PROB1
Orthologs (human; mouse)
RefSeq (mRNA): NM_001161546; NM_001270646
RefSeq (protein): NP_001155018; n/a
Location (UCSC): Chr 5: 139.39 – 139.4 Mb; Chr 18: 35.78 – 35.79 Mb
PubMed search: [3]; [4]

Proline-rich basic protein 1 (PROB1) is a protein encoded by the PROB1 gene, which is located on human chromosome 5. PROB1 is also known as C5orf65 (chromosome 5 open reading frame 65) and as "weakly similar to basic proline-rich protein". [5] [6]

Gene

Characteristics

The PROB1 gene is 3251 bp long and contains a single exon. [6]
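As a rough consistency check, the gene length can be compared with the size of the encoded protein. The sketch below assumes that the 3251 bp figure is the length of the single-exon transcript and that the 1015-residue protein described under Protein is translated from one continuous open reading frame ending in a stop codon. The function name is illustrative only and does not come from any annotation tool.

```python
def coding_and_untranslated_bp(transcript_bp: int, protein_aa: int) -> tuple[int, int]:
    """Return (coding_bp, untranslated_bp) for a single-exon transcript."""
    # Each amino acid is encoded by one 3-nucleotide codon; one more codon is the stop.
    coding_bp = (protein_aa + 1) * 3
    if coding_bp > transcript_bp:
        raise ValueError("protein is too long for this transcript")
    return coding_bp, transcript_bp - coding_bp


# Figures taken from this article: 3251 bp gene, 1015 amino acids.
coding, untranslated = coding_and_untranslated_bp(3251, 1015)
print(coding, untranslated)  # 3048 203
```

Under these assumptions roughly 94% of the sequence would be coding, with about 203 bp left for untranslated regions.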

Location

The PROB1 gene is located on human chromosome 5, cytogenetic band 5q31.2. [7]

mRNA

Expression

PROB1 is expressed in 89 human tissue types, [8] with the highest expression in skeletal muscle of the leg and in cardiac muscle. [9] Although PROB1 mRNA is detected broadly and is also elevated in the spinal cord, cerebrum, and lymphocytes, measurable protein expression has been recorded only in cardiac and skeletal muscle. [10]

Protein

PROB1 is composed of 1015 amino acids. It contains two proline-rich regions, which compose the majority of the protein, and a domain of unknown function (DUF). [7]
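One simple way to locate proline-rich regions in a protein sequence is to slide a fixed-size window along it and flag windows in which the proportion of proline (P) exceeds a threshold. The sketch below illustrates that idea only; it is not the method used by the cited source. The window size, the threshold, and the toy sequence are all hypothetical, and the toy sequence is not the PROB1 sequence.

```python
def proline_rich_windows(
    seq: str, window: int = 20, min_fraction: float = 0.25
) -> list[tuple[int, int]]:
    """Return merged 0-based, half-open (start, end) spans that are proline-rich."""
    seq = seq.upper()
    spans: list[tuple[int, int]] = []
    for start in range(0, len(seq) - window + 1):
        end = start + window
        # Fraction of residues in this window that are proline.
        if seq[start:end].count("P") / window >= min_fraction:
            if spans and start <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


# Hypothetical toy sequence with a proline-dense stretch in the middle.
toy = "MKTAYLL" * 3 + "PPAPSPKPPRPAPPSPPKPA" + "GSTEELLK" * 3
print(proline_rich_windows(toy))
```

Overlapping windows are merged so that each reported span corresponds to one contiguous region; running the function on the real sequence would require retrieving it from the UniProt or RefSeq entries listed in the references.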

Structure

Predicted tertiary structure of PROB1 generated by I-TASSER and rendered in PyMOL. The dark blue indicates prolines and yellow marks the DUF. The alpha helices are colored green, beta sheets are colored light blue, and random coils are colored pink.

Secondary structure predictions for PROB1 indicate that the protein consists mostly of random coils, with a small percentage of alpha helices and beta sheets. [13] This is likely due to the properties of proline: its rigid ring structure and confined phi angle disrupt secondary structure formation. The DUF, which lies in the second proline-rich region of the protein, is also predicted to consist entirely of random coils. A tertiary structure prediction for PROB1 was generated using I-TASSER [11] and rendered in PyMOL; [12] overall, the protein displays an elongated structure.

Sub-cellular Localization

Analysis of protein structure, post-translational modifications, and localization signals reveals that PROB1 has no transmembrane domains and is an intracellular protein. Immunohistochemistry indicates its localization to the nucleoplasm of the cell. [14]

Post-translational Modifications

A range of post-translational modifications has been predicted for PROB1, including an S-palmitoylation site [15] and many overlapping O-GlcNAcylation [16] and phosphorylation sites. [17] A representation containing a subset of the predicted modifications was generated using DOG 2.0 [18] and is shown below.

Schematic of the PROB1 protein annotated with predicted post-translational modifications, created using DOG 2.0.

Interactions

PROB1 has been found to be coexpressed with proteins SPATA24 and JADE2, but no notable functional protein interactions with PROB1 are known at this time. [19]

Homology

A subset of the mammalian orthologs of PROB1 predicted by BLAST.

Paralogs

There are no known human paralogs of PROB1 to date. [20] [21]

Orthologs

PROB1 has only mammalian orthologs. Its most distant ortholog is found in the marsupial Vombatus ursinus (common wombat), a lineage estimated by TimeTree to have diverged about 159 million years ago. [22] A subset of the many orthologs identified by BLAST [20] is shown in the accompanying table.

Clinical Significance

PROB1 is implicated in keratoconus, which causes collagen-related degeneration of the cornea. In a whole-exome sequencing study that used segregation analysis, variants of PROB1 in the 5q31.1-q35.3 linkage region segregated completely with the keratoconus phenotype. [23] Additionally, PROB1 expression has been reported to be significantly elevated in several disease states, including head and neck cancer [24] and prostate inflammation. [25]

References

  1. GRCh38: Ensembl release 89: ENSG00000228672 - Ensembl, May 2017
  2. GRCm38: Ensembl release 89: ENSMUSG00000073600 - Ensembl, May 2017
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. European Bioinformatics Institute; Protein Information Resource; SIB Swiss Institute of Bioinformatics. (2019, January 16). Proline-rich basic protein 1. Retrieved from https://www.uniprot.org/uniprot/E7EW31
  6. Proline-rich basic protein 1 [Homo sapiens]. (2018, June 24). Retrieved from https://www.ncbi.nlm.nih.gov/protein/NP_001155018.1
  7. "PROB1 Gene". GeneCardsSuite.
  8. CALIPHO Team - SIB - Swiss Institute of Bioinformatics. (2018, September 3). PROB1 - Proteomics. Retrieved from https://www.nextprot.org/entry/NX_E7EW31/proteomics
  9. The Human Protein Atlas. (2015). PROB1. Retrieved from https://www.proteinatlas.org/ENSG00000228672-PROB1/tissue
  10. "Gene expression for PROB1". GTEx Microarray Data. 2019.
  11. Zhang, Yang. "I-TASSER server for protein structure and function prediction".
  12. The PyMOL Molecular Graphics System, Version 2.0. Schrödinger, LLC.
  13. "Predicted secondary structure of PROB1". SOPMA SECONDARY STRUCTURE PREDICTION METHOD. 2016.
  14. Atlas Antibodies. Anti-PROB1 Antibody (HPA060103). Retrieved from https://atlasantibodies.com/products/PROB1-antibody-HPA060103
  15. The CUCKOO Workgroup. (2019, May 1). GPS-Lipid - Prediction of Lipid Modifications (S-Palmitoylation, N-Myristoylation, S-Farnesylation, S-Geranylgeranylation). Retrieved from http://lipid.biocuckoo.org/presult.php
  16. DTU Bioinformatics. (2017, January 2). YinOYang 1.2. Retrieved from http://www.cbs.dtu.dk/services/YinOYang/
  17. MyHits. Sigrist CJ, Cerutti L, de Castro E, Langendijk-Genevaux PS, Bulliard V, Bairoch A, Hulo N. PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res. 2010; 38(Database issue):D161-6.
  18. The CUCKOO Workgroup. (2014, February 26). DOG 2.0 - Protein Domain Structure Visualization. Retrieved from http://dog.biocuckoo.org/
  19. STRING Consortium. (2019). STRING. Retrieved from https://string-db.org/cgi/input.pl?sessionId=AQ4nteBhjSOX&input_page_show_search=on EMBL-EBI. (2019).
  20. BLAST, https://blast.ncbi.nlm.nih.gov/Blast.cgi
  21. Kent WJ. BLAT - the BLAST-like alignment tool. Genome Res. 2002 Apr;12(4):656-64.
  22. Institute for Genomics and Evolutionary Medicine. The Timescale of Life. Retrieved from http://www.timetree.org/
  23. Karolak, J. A., Gambin, T., Pitarque, J. A., Molinari, A., Jhangiani, S., Stankiewicz, P., . . . Gajecka, M. (2016). Variants in SKP1, PROB1 and IL17B genes at keratoconus 5q31.1–q35.3 susceptibility locus identified by whole-exome sequencing. European Journal of Human Genetics, 25(1), 73-78. doi:10.1038/ejhg.2016.130
  24. Lachmann, A. Retrieved from https://amp.pharm.mssm.edu/archs4/gene/PROB1#tissueexpression
  25. Prob1 - Model for inflammatory signaling in prostate epithelium based on expression of a constitutively active version of IκB kinase 2 (IKK2ca). Retrieved from https://www.ncbi.nlm.nih.gov/geoprofiles/79454671