PROB1 | Identifiers
---|---
Aliases | PROB1, C5orf65, proline-rich basic protein 1, proline rich basic protein 1
External IDs | MGI: 2686460; HomoloGene: 83773; GeneCards: PROB1
Proline-rich basic protein 1 (PROB1) is a protein encoded by the PROB1 gene on human chromosome 5. PROB1 is also known as C5orf65 (chromosome 5 open reading frame 65) and is weakly similar to basic proline-rich protein. [5] [6]
The PROB1 gene is 3251 bp long and contains a single exon. [6]
The PROB1 gene is located on human chromosome 5, cytogenetic band 5q31.2. [7]
PROB1 is expressed in 89 types of tissue in the human body, [8] with highest expression in the skeletal muscle of the leg and the cardiac muscle of the heart. [9] While mRNA expression is fairly widespread and is also elevated in the spinal cord, cerebrum, and lymphocytes, measurable protein expression has been recorded only in cardiac and skeletal muscle. [10]
PROB1 is composed of 1015 amino acids. It contains two proline-rich regions, which compose the majority of the protein, and a domain of unknown function (DUF). [7]
Predicted secondary structures for PROB1 indicate that the protein is mostly composed of random coils, with a small percentage of alpha helices and beta sheets. [13] This is likely due to the properties of proline; its large size, ring structure, and confined phi angle cause it to disrupt secondary structure formation. The DUF, which resides in the second proline-rich region of the protein, is also predicted to be composed entirely of random coils. A tertiary structure prediction for PROB1 was generated using I-TASSER [11] and rendered in PyMOL; [12] overall, the protein displays an elongated structure.
Analysis of protein structure, post-translational modifications, and localization signals reveals that PROB1 has no transmembrane domains and is an intracellular protein. Immunohistochemistry indicates its localization to the nucleoplasm of the cell. [14]
An array of post-translational modifications has been predicted for PROB1, including an S-palmitoylation site [15] and a multitude of overlapping O-GlcNAcylation [16] and phosphorylation sites. [17] A representation containing a subset of the predicted modifications was generated using DOG 2.0 [18] and is shown below.
PROB1 has been found to be coexpressed with proteins SPATA24 and JADE2, but no notable functional protein interactions with PROB1 are known at this time. [19]
There are no known human paralogs of PROB1 to date. [20] [21]
PROB1 has only mammalian orthologs. Its most distant ortholog is the marsupial Vombatus ursinus (common wombat), which is estimated to have diverged about 159 million years ago as dated by TimeTree. [22] A subset of the multitude of orthologs produced by BLAST [20] is shown in the accompanying table.
PROB1 is implicated in keratoconus, which causes collagen-related degeneration of the cornea. Variants of PROB1 in the 5q31.1-q35.3 linkage region completely segregated with the keratoconus phenotype in a study using segregation analysis. [23] Additionally, PROB1 expression has been shown to be significantly elevated in several disease states, including head and neck cancer [24] and prostate inflammation. [25]
DEP Domain Containing Protein 1B, also known as XTP1, XTP8, or HBV XAg-Transactivated Protein 8 (formerly referred to as BRCC3), is a human protein encoded by a gene of similar name located on chromosome 5.
PROSER2, also known as proline and serine rich 2, is a protein that in humans is encoded by the PROSER2 gene. PROSER2, or c10orf47 (chromosome 10 open reading frame 47), is found in band 14 of the short arm of chromosome 10 (10p14) and contains a highly conserved SARG domain. It is a fast-evolving gene with two paralogs, c1orf116 and specifically androgen-regulated gene protein isoform 1. The function of the PROSER2 protein is currently uncharacterized; however, in humans, it may play a role in cell cycle regulation and reproductive functioning, and it is a potential biomarker of cancer.
PRR29 is a protein encoded by the PRR29 gene located in humans on chromosome 17 at 17q23.
Glutamate rich protein 5 is a protein in humans encoded by the ERICH5 gene, also known as chromosome 8 open reading frame 47 (C8orf47).
Proline-rich protein 30 is a protein that in humans is encoded by the PRR30 gene. PRR30 is a member of the family of proline-rich proteins, which are characterized by their intrinsic lack of structure. Copy number variations in the PRR30 gene have been associated with an increased risk of neurofibromatosis.
Chromosome 21 Open Reading Frame 58 (C21orf58) is a protein that in humans is encoded by the C21orf58 gene.
C15orf39 is a protein that in humans is encoded by the chromosome 15 open reading frame 39 (C15orf39) gene.
Transmembrane protein 171 (TMEM171) is a protein that in humans is encoded by the TMEM171 gene.
Chromosome 4 open reading frame 51 (C4orf51) is a protein which in humans is encoded by the C4orf51 gene.
Chromosome 1 open reading frame 167 (C1orf167) is a protein which in humans is encoded by the C1orf167 gene. The NCBI accession number is NP_001010881. The protein is 1468 amino acids in length with a molecular weight of 162.42 kDa. The mRNA sequence was found to be 4689 base pairs in length.
Chromosome 1 open reading frame 94 (C1orf94) is a protein that in humans is encoded by the C1orf94 gene. The function of this protein is still poorly understood.
Protein FAM89A is a protein which in humans is encoded by the FAM89A gene. It is also known as chromosome 1 open reading frame 153 (C1orf153). The highest FAM89A gene expression is observed in the placenta and adipose tissue. Though its function is largely unknown, FAM89A is found to be differentially expressed in response to interleukin exposure, and it is implicated in immune response pathways and various pathologies such as atherosclerosis and glioma cell expression.
Leucine rich single-pass membrane protein 2 is a single-pass membrane protein rich in leucine, that in humans is encoded by the LSMEM2 gene. The LSMEM2 protein is conserved in mammals, birds, and reptiles. In humans, LSMEM2 is found to be highly expressed in the heart, skeletal muscle and tongue.
SH3 Domain Binding Kinase Family Member 3 is an enzyme that in humans is encoded by the SBK3 gene. SBK3 is a member of the serine/threonine protein kinase family. The SBK3 protein is known to exhibit transferase activity, especially phosphotransferase activity, and tyrosine kinase activity. It is well-conserved throughout mammalian organisms and has two paralogs: SBK1 and SBK2.
FAM214B, also known as protein family with sequence similarity 214, B, is a protein that in humans is encoded by the FAM214B gene located on chromosome 9. The protein has 538 amino acids. The gene contains 9 exons. Studies have reported low expression of this gene in patients with major depressive disorder. In most organisms, such as mammals, amphibians, reptiles, and birds, gene expression is high in the bone marrow and blood. During human fetal development, FAM214B is mostly expressed in the brain and bone marrow.
C6orf136 is a protein in humans encoded by the C6orf136 gene. The gene is conserved in mammals, mollusks, and some Porifera. While the function of the gene is currently unknown, C6orf136 has been shown to be hypermethylated in response to FOXM1 expression in head and neck squamous cell carcinoma (HNSCC) tissue cells. Additionally, elevated expression of C6orf136 has been associated with improved survival rates in patients with bladder cancer. C6orf136 has three known isoforms.
Protein FAM110A, also known as protein family with sequence similarity 110, A, C20orf55, or BA371L19.3, is encoded by the FAM110A gene. FAM110A is located on chromosome 20 and is part of the greater FAM110 gene family, consisting of FAM110A, FAM110B, and FAM110C.
Family with sequence similarity 98, member C (FAM98C) is a gene that encodes the FAM98C protein. It has two aliases, FLJ44669 and hypothetical protein LOC147965. FAM98C has two paralogs in humans, FAM98A and FAM98B. FAM98C can be characterized as a leucine-rich protein. The function of FAM98C is still not defined. FAM98C has orthologs in mammals, reptiles, and amphibians, with distant orthologs in Rhinatrema bivittatum and Nanorana parkeri.
C11orf98 is a protein-encoding gene of unknown function on human chromosome 11. It is otherwise known as c11orf48. The gene spans the chromosomal locus from position 62,662,817 to 62,665,210 and has 4 exons. It covers 2,394 base pairs of DNA and produces an mRNA that is 646 base pairs long.
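The 2,394 bp figure follows from the two coordinates if both endpoints are counted. The short Python sketch below illustrates that arithmetic; the helper name `inclusive_span_length` is hypothetical, and treating the interval as 1-based and inclusive is an assumption, not something the text states.

```python
def inclusive_span_length(start: int, end: int) -> int:
    """Return the number of positions in an interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates quoted in the text (assumed here to be 1-based and inclusive).
C11ORF98_START = 62_662_817
C11ORF98_END = 62_665_210

span = inclusive_span_length(C11ORF98_START, C11ORF98_END)
print(span)  # 2394
```

A plain subtraction of the two coordinates gives 2,393, so the stated length is consistent only when both the first and last base are included.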
Chromosome 13 Open Reading Frame 46 is a protein which in humans is encoded by the C13orf46 gene. In humans, C13orf46 is ubiquitously expressed at low levels in tissues including the lungs, stomach, prostate, spleen, and thymus. This gene encodes eight alternatively spliced mRNA transcripts, which produce five different protein isoforms.