PRR12 | Identifiers
---|---
Aliases | PRR12, KIAA1205, Proline-rich 12, proline rich 12, NOC
External IDs | OMIM: 616633, MGI: 2679002, HomoloGene: 18957, GeneCards: PRR12

Proline-rich 12 (PRR12) is a protein of unknown function encoded by the gene PRR12.
The Homo sapiens PRR12 gene is 34,785 base pairs long, contains 14 exons, and is located on chromosome 19 at 19q13.33. [5] Known aliases for PRR12 include "proline rich 12" and KIAA1205. [6] Within its gene neighborhood, PRR12 is flanked by PRRG2 and SCAF1 on the sense strand and by RRAS and NOSIP on the antisense strand. [5] Nitric oxide synthase interacting protein, NOSIP, regulates the activity and localization of endothelial and neuronal nitric oxide synthase, thereby controlling nitric oxide production. [7] Proline rich Gla, PRRG2, has a Gla domain that binds hyaluronan and is associated with extracellular matrix proteins involved in cell adhesion and cell migration. [8] [9] Ras-related protein R-Ras, RRAS, belongs to the Ras family and is involved in the organization of actin filaments within the cytoskeleton. [10] SR-Related CTD-Associated Factor 1, SCAF1, is thought to be involved in the splicing of precursor mRNA. [11]
Promoter
The promoter region of PRR12 was predicted using ElDorado at Genomatix. [12] The region starts at position 50094408 and ends at position 50095013 of chromosome 19. This promoter set is conserved in the macaque, mouse, rat, horse, cow, pig, dog and zebrafish. No recognizable TATA box, B recognition element (BRE), or CAAT box was found upstream of the predicted transcription start region. Because no clear TATA box was found, PRR12 may be regulated by a TATA-less promoter containing a downstream promoter element (DPE). However, the predicted DPE lies only 15 bp downstream of the transcription start region rather than at the typical +25 to +32 base pairs. Further work is needed to extend the 5' UTR of the PRR12 transcript and confirm where the correct promoter region is located.
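The coordinates quoted above can be checked with simple arithmetic. The following is a minimal sketch and is not part of the ElDorado analysis: the function names are hypothetical, the coordinates are assumed to be inclusive, and the +25 to +32 window is taken directly from the text.

```python
# Illustrative arithmetic only; function and variable names are hypothetical.

def region_length(start: int, end: int) -> int:
    """Length of a genomic interval, ASSUMING inclusive coordinates."""
    return end - start + 1

def in_window(offset: int, low: int, high: int) -> bool:
    """True if an offset (bp downstream of the start site) lies in [low, high]."""
    return low <= offset <= high

# Values quoted in the text.
PROMOTER_START, PROMOTER_END = 50_094_408, 50_095_013
TYPICAL_DPE_LOW, TYPICAL_DPE_HIGH = 25, 32
PREDICTED_DPE_OFFSET = 15

promoter_bp = region_length(PROMOTER_START, PROMOTER_END)
dpe_in_typical_window = in_window(
    PREDICTED_DPE_OFFSET, TYPICAL_DPE_LOW, TYPICAL_DPE_HIGH
)
print(promoter_bp, dpe_in_typical_window)
```

Under the inclusive-coordinate assumption the predicted region spans 606 bp, and the predicted DPE offset falls outside the typical window, which is the discrepancy the text points out.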
mRNA sequence
The PRR12 mRNA transcript is 6960 bases long and contains several short sequence repeats. Homo sapiens PRR12 has three isoforms; isoform 3 contains roughly one thousand more amino acid residues than the other isoforms. [13] No 5' UTR is given in the NCBI records for the Homo sapiens PRR12 transcript. However, 7 bases of the 5' UTR have been determined in the Papio anubis ortholog. The 3' UTR is 852 bases long. [14]
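The transcript figures above can be cross-checked by subtracting the untranslated regions from the total length. This is a minimal sketch using only the numbers quoted in the text; the variable and function names are hypothetical, and the 5' UTR is set to zero because none is annotated for the human transcript.

```python
# Illustrative bookkeeping only; names are hypothetical.

MRNA_LENGTH = 6960   # total transcript length quoted in the text
UTR3_LENGTH = 852    # 3' UTR length quoted in the text
UTR5_LENGTH = 0      # ASSUMPTION: no 5' UTR, as in the human record

def coding_length(mrna: int, utr5: int, utr3: int) -> int:
    """Nucleotides left over once both UTRs are subtracted."""
    return mrna - utr5 - utr3

cds_nt = coding_length(MRNA_LENGTH, UTR5_LENGTH, UTR3_LENGTH)
codons, remainder = divmod(cds_nt, 3)
print(cds_nt, codons, remainder)
```

With these figures the remainder is 6108 nucleotides, a whole number of codons (2036). Whether that count includes the stop codon depends on how the record delimits the 3' UTR, which the text does not state.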
The gene is moderately expressed at even levels in a wide variety of tissue types. [6]
The PRR12 transcript encodes a protein that is 2036 residues long. It has a molecular weight of 211.1 kDa and an isoelectric point of about 7.728. [15] [failed verification] Several bioinformatics databases also predict PRR12 to be a soluble protein with no transmembrane domains. [15] [16] [17]
Jianping Chen lists PRR12 as an "extremely vulnerable protein". [18] Such proteins have regions rich in amino acids that are "poor protectors" of hydrogen bonds along the protein backbone, which impairs proper folding and makes protein aggregation possible. Residues such as G, A, S, Y, and P are listed as poor protectors, and PRR12 is rich in both proline and glycine. [15] [18] Many of the proline residues are positioned consecutively in regions of low complexity. These regions may give the protein unusual secondary structure, since a cluster of prolines can form a polyproline helix. [19]
PRR12 contains a possible nuclear import signal starting at P1794. A typical nuclear localization sequence would have the following residues: P-P-K-K-K-R-K-V. [20] PRR12 contains a DUF4211 domain starting at V1836 that shows homology to the pfam13926 domain. [21] This domain is well conserved in PRR12 orthologs. PRR12 also contains well-conserved AT-hook regions at P1168 and G1202. These regions allow proteins to bind DNA, further supporting the localization of PRR12 to the nucleus.
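The composition argument above (richness in "poor protector" residues and runs of consecutive prolines) can be made concrete with a few lines of code. This is a minimal sketch, not the method used in the cited work: the helper names are hypothetical, and the toy sequence is made up for illustration and is not the PRR12 sequence.

```python
from collections import Counter

# Residues the text lists as "poor protectors" of backbone hydrogen bonds.
POOR_PROTECTORS = set("GASYP")

def residue_fractions(seq: str) -> dict:
    """Fraction of the sequence made up by each amino acid (one-letter codes)."""
    counts = Counter(seq)
    return {aa: n / len(seq) for aa, n in counts.items()}

def poor_protector_fraction(seq: str) -> float:
    """Fraction of residues that belong to the poor-protector set."""
    return sum(1 for aa in seq if aa in POOR_PROTECTORS) / len(seq)

def longest_run(seq: str, residue: str = "P") -> int:
    """Length of the longest stretch of one residue, e.g. consecutive prolines."""
    best = current = 0
    for aa in seq:
        current = current + 1 if aa == residue else 0
        best = max(best, current)
    return best

# Made-up 22-residue example; NOT taken from PRR12.
toy = "MPPPPPGGSAPPPKRGRPRKLE"
print(residue_fractions(toy)["P"], poor_protector_fraction(toy), longest_run(toy))
```

Applied to a real protein sequence, the same three numbers give a quick view of how proline-rich the protein is and how long its consecutive proline stretches are.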
The glutamine and serine-rich protein 1 (QSER1) is the only closely related paralog of PRR12 (NCBI accession: EAW68214). QSER1 has no known function and, like PRR12, contains DUF4211 and a nuclear localization signal. QSER1 does not contain the AT-hook regions or the region homologous to the Epstein–Barr virus antigen that are found in PRR12.
The most distant relative with significant similarity to PRR12 found through BLAST is from the fish Danio rerio. Orthologs were found in fish, amphibians, reptiles, and other mammals. While no PRR12 orthologs were found in birds, birds do have orthologs of QSER1, which is a close paralog of human PRR12. [13]
One study on the Epstein–Barr virus (EBV) found close homology between a proline-rich region in PRR12 and a 65-amino-acid region at the terminal end of EBNA-2, a nuclear antigen of the virus. [22] This Epstein–Barr virus antigen is associated with autoimmune systemic connective tissue diseases (CTDs), including systemic lupus erythematosus (SLE), primary Sjögren syndrome (SS), rheumatoid arthritis (RA), systemic sclerosis (SSc), and secondary SS. [22] PRR12 is rich not only in proline but also in glycine, suggesting a possible relationship to collagen, which is also rich in proline and glycine. Such a relationship might explain the appearance of autoimmune CTDs after EBV infection. However, glycine and proline residues in collagen generally follow a G-P-X or G-X-HydroxyP motif, which does not occur to a significant extent in PRR12.
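The collagen comparison above rests on whether a sequence contains tandem G-P-X or G-X-HydroxyP triplets. The following is a minimal sketch of one way to test that, assuming a plain one-letter amino acid string; the function name is hypothetical, and hydroxyproline is represented as P because it arises from post-translational modification and does not appear in the encoded sequence.

```python
import re

# Back-to-back triplets of the form G-P-X or G-X-P. The lookahead lets runs
# starting at every position be examined, so no reading frame is missed.
_TRIPLET_RUN = re.compile(r"(?=((?:GP.|G.P)+))")

def longest_collagen_like_run(seq: str) -> int:
    """Largest number of consecutive triplets matching G-P-X or G-X-P."""
    best = 0
    for match in _TRIPLET_RUN.finditer(seq):
        best = max(best, len(match.group(1)) // 3)
    return best

# Made-up examples; neither is taken from PRR12 or from collagen.
collagen_like = "GGPAGPAGPA"   # contains three tandem GPA triplets
proline_rich = "PPPPGGPPPGG"   # rich in P and G but without the repeat
print(longest_collagen_like_run(collagen_like),
      longest_collagen_like_run(proline_rich))
```

A sequence can be rich in both glycine and proline and still score low here, which is the distinction the text draws between PRR12 and collagen.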
Haploinsufficiency of PRR12 can result in anophthalmia, among other abnormalities. [23] [24]