PRR23A | |
---|---|
Aliases | PRR23A, proline rich 23A |
External IDs | MGI: 3645937; HomoloGene: 67036; GeneCards: PRR23A |
Proline-Rich Protein 23A is a protein that is encoded by the Proline-Rich 23A (PRR23A) gene.
The human PRR23A gene is on chromosome 3 at position 3q23 and is located on the antisense strand. [5] The gene extends from position 139,003,962 to 139,006,268. It consists of 1 exon and spans 2,307 base pairs. Other genes in the neighborhood include FOXL2NB, FOXL2, PRR23B, and PRR23C. Two of those genes, PRR23B and PRR23C, are paralogs of PRR23A. [9] The FOXL2NB gene has tissue-enriched expression in the ovaries, and PRR23A has demonstrated expression in the ovary as well. [6] [7] Aliases for PRR23A include Proline-Rich 23A, Proline-Rich Protein 23A, and UPF0572 Protein ENSP00000372650. [8]
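The stated span can be checked directly from the two coordinates. The short sketch below assumes the positions are 1-based and that both endpoints are counted, which is the convention that reproduces the 2,307 base pair figure; the function name is illustrative only.

```python
# Coordinates quoted in the text (assumed 1-based, both ends included).
GENE_START = 139_003_962
GENE_END = 139_006_268


def inclusive_span(start: int, end: int) -> int:
    """Length in base pairs of the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


span_bp = inclusive_span(GENE_START, GENE_END)
print(span_bp)  # 2307
```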
PRR23A is primarily expressed at low levels in the brain and testis. [7] [10] Very low levels of PRR23A expression were also detected in ovary and bone marrow tissue. [7] Many genes show high expression in the testis in RNA sequencing data because the testis is a highly transcriptionally active tissue owing to its role in sperm production. [11] However, some researchers have noted that the testis expression of PRR23A may be legitimate, since PRR23 family genes are thought to play a role in male reproduction. [9] [12] Furthermore, brain and testis tissue share biochemical characteristics and express a large number of common genes. [13] This may also explain why PRR23A expression has been found at similar levels within the brain and testis.
Tissue | Cerebellum | Choroid Plexus | White Matter | Cerebral Cortex | Medulla Oblongata | Pons | Thalamus | Amygdala | Hippocampal Formation | Midbrain | Basal Ganglia | Hypothalamus | Spinal Cord | Testis | Ovary | Bone Marrow |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
RNA Abundance (nTPM) | 2.4 | 1.7 | 1.7 | 1.6 | 1.6 | 1.6 | 1.5 | 1.4 | 1.3 | 1.3 | 1.2 | 1.2 | 1.2 | 1.5 | 0.1 | 0.1 |
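As a rough illustration of the "similar levels" observation, the table values can be averaged. This is only a sketch: it treats the thirteen brain and spinal cord columns as one central nervous system group, which is a grouping chosen here for illustration and not one made by the cited sources.

```python
from statistics import mean

# nTPM values copied from the table above.
cns_ntpm = {
    "Cerebellum": 2.4, "Choroid Plexus": 1.7, "White Matter": 1.7,
    "Cerebral Cortex": 1.6, "Medulla Oblongata": 1.6, "Pons": 1.6,
    "Thalamus": 1.5, "Amygdala": 1.4, "Hippocampal Formation": 1.3,
    "Midbrain": 1.3, "Basal Ganglia": 1.2, "Hypothalamus": 1.2,
    "Spinal Cord": 1.2,
}
other_ntpm = {"Testis": 1.5, "Ovary": 0.1, "Bone Marrow": 0.1}

# Unweighted mean across the central nervous system regions.
cns_mean = mean(cns_ntpm.values())

print(f"Mean CNS nTPM: {cns_mean:.2f}")
print(f"Testis nTPM:   {other_ntpm['Testis']:.2f}")
```

Under this grouping the mean comes out very close to the testis value, while the ovary and bone marrow values are more than tenfold lower.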
Since PRR23A consists of 1 exon, there are no alternative splicing products. [5] This also means that the 1 known isoform in humans has an mRNA sequence of 2,307 nucleotides, which matches the length of the PRR23A gene. [14] mRNAs typically contain a 5' UTR with a median length of 170 nucleotides in humans, but the human PRR23A mRNA sequence does not contain a 5' UTR. [14] [15] Instead, the FASTA sequence for human PRR23A begins with the start codon ATG. [14] Although the 5' UTR is not translated, it plays a major regulatory role in the translation of the coding sequence into the amino acids that go on to form the protein. [16] Therefore, it is unlikely that human PRR23A truly lacks this region; its upstream 5' UTR sequence could probably be obtained through further sequencing research.
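The transcript length can be related to the protein length with simple arithmetic. The sketch below assumes that the coding sequence starts at the first nucleotide (the ATG noted above), is uninterrupted, ends with a single stop codon, and encodes the 266-residue protein reported in this article; the split it prints is therefore an estimate, not an annotated feature.

```python
MRNA_LENGTH_NT = 2307     # transcript length given in the text
PROTEIN_LENGTH_AA = 266   # protein length reported in this article

# Each residue is encoded by one codon; add one codon for the stop signal.
cds_length_nt = (PROTEIN_LENGTH_AA + 1) * 3

# Whatever remains would lie downstream of the stop codon
# under the assumptions stated above.
downstream_nt = MRNA_LENGTH_NT - cds_length_nt

print(cds_length_nt)   # 801
print(downstream_nt)   # 1506
```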
Human PRR23A consists of 266 amino acids, has a predicted molecular weight of 28.2 kDa, and has a predicted basal isoelectric point of 4.57. [17] PRR23A, as its name implies, is enriched in the amino acid proline and therefore belongs to the category of proteins called proline-rich proteins. PRR23A contains less asparagine, threonine, and lysine than other human proteins. [18] [17] This amino acid composition is generally conserved across species. [18] [19]
Amino Acid | Pro (P) | Ala (A) | Leu (L) | Ser (S) | Glu (E) | Gly (G) | Val (V) | Arg (R) | Asp (D) | Phe (F) | Gln (Q) | Ile (I) | Cys (C) | His (H) | Thr (T) | Lys (K) | Met (M) | Tyr (Y) | Trp (W) | Asn (N) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Composition of Human PRR23A | 16.2% | 10.5% | 10.5% | 9.8% | 9.0% | 7.5% | 7.1% | 6.4% | 5.3% | 3.4% | 3.0% | 2.3% | 1.5% | 1.5% | 1.5% | 1.1% | 1.1% | 1.1% | 0.8% | 0.4% |
Composition of PRR23A Compared to Other Human Proteins | very rich | average | average | average | average | average | average | average | average | average | average | average | average | average | very poor | poor | average | average | average | very poor |
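The percentages in the table can be converted back into approximate residue counts for a 266-residue protein. The following is a minimal sketch, assuming each percentage is simply count / 266 rounded to one decimal place; the counts it derives are estimates reconstructed from the table, not values taken from a sequence record.

```python
PROTEIN_LENGTH = 266  # residues, as stated in the text

# Percent composition copied from the table above.
composition_pct = {
    "Pro": 16.2, "Ala": 10.5, "Leu": 10.5, "Ser": 9.8, "Glu": 9.0,
    "Gly": 7.5, "Val": 7.1, "Arg": 6.4, "Asp": 5.3, "Phe": 3.4,
    "Gln": 3.0, "Ile": 2.3, "Cys": 1.5, "His": 1.5, "Thr": 1.5,
    "Lys": 1.1, "Met": 1.1, "Tyr": 1.1, "Trp": 0.8, "Asn": 0.4,
}

# Estimate the integer residue count behind each rounded percentage.
estimated_counts = {
    aa: round(pct / 100 * PROTEIN_LENGTH)
    for aa, pct in composition_pct.items()
}

total_pct = sum(composition_pct.values())
total_residues = sum(estimated_counts.values())

print(f"Sum of percentages: {total_pct:.1f}%")
print(f"Sum of estimated residue counts: {total_residues}")
print(f"Estimated proline residues: {estimated_counts['Pro']}")
```

If the estimated counts add up to the full protein length, the table is internally consistent with the stated 266 residues.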
Human PRR23A is mainly a disordered protein with short stretches that form beta strands and alpha helices. [17] [20] [21] [22] [23] [24] There are 2 known disordered regions, at the beginning and the end of the protein. [17] Six regions in the first half of the protein sequence are predicted to form beta strands, and in the folded tertiary structure they lie in the middle of the predicted protein structure. One possible transmembrane domain is located in 1 of these beta strands. [18] [17] [20] [21] Some proteins create transmembrane beta barrels when a beta sheet curls on itself to make a tube that passes through a membrane, so PRR23A could exhibit this phenomenon. [25] Two regions towards the end of the protein sequence are predicted to form alpha helices, and in the folded tertiary structure they also lie in the middle of the predicted protein structure.
Antibody detection in human stomach cells has shown that PRR23A localizes in the membrane and cytoplasm. [26] Further investigation of the PRR23A protein sequence has also identified a small transmembrane region towards the beginning of the protein, and signal sequences for the ER membrane, nucleus, and mitochondria. [18] [21]
PRR23A has few known interactions. The most significant predicted protein interactions for human PRR23A are with DEFB106A and DEFB107A, which have been determined through co-expression data and textmining. [27] Co-expression data have also shown that DEFB106A and DEFB107A interact with one another. This means that PRR23A, DEFB106A, and DEFB107A have been observed to be correlated in expression across a large number of experiments. DEFB105A, DEFB106B, IQCJ, FAM90A10P, SPAG11B, PRSS22, USP17L4, and USP17L7 are also thought to interact with PRR23A. The basis of these interactions was textmining alone, so further experiments such as the yeast two-hybrid assay should be conducted to increase confidence in these protein interactions.
Name | Basis | Function |
---|---|---|
DEFB106A | Textmining and Co-expression | Belongs to the defensin family which are antimicrobial and cytotoxic peptides made by neutrophils. Associated with Diamond-Blackfan Anemia 15, Mandibulofacial with Dysostosis, and Keratomalacia. Enables lipopolysaccharide binding, protein binding, heparin binding and CCR2 chemokine receptor binding |
DEFB107A | Textmining and Co-expression | Belongs to the defensin family which are antimicrobial and cytotoxic peptides made by neutrophils. Enables lipid binding |
DEFB105A | Textmining | Belongs to the defensin family which are antimicrobial and cytotoxic peptides made by neutrophils. Associated with Familial Hypertrophic Cardiomyopathy 12, and Familial Hypertrophic Cardiomyopathy 6 |
DEFB106B | Textmining | Belongs to the defensin family which are antimicrobial and cytotoxic peptides made by neutrophils. Associated with Diamond-Blackfan Anemia 15 and Mandibulofacial with Dysostosis. Enables lipopolysaccharide binding, protein binding, heparin binding and CCR2 chemokine receptor binding |
IQCJ | Textmining | NA |
FAM90A10P | Textmining | Belongs to the FAM90 family |
SPAG11B | Textmining | Encodes several androgen-dependent, epididymis-specific secretory proteins. Thought to be involved in sperm maturation. Associated with Small Intestine Lymphoma and Herpes Zoster Oticus |
PRSS22 | Textmining | Gene encodes a member of the trypsin family of serine-proteases. Preferentially cleaves the synthetic substrate H-D-Leu-Thr-Arg-pNA compared to tosyl-Gly-Pro-Arg-pNA. Enables serine-type endopeptidase activity and peptidase activator activity |
USP17L4 | Textmining | Predicted to enable cysteine-type endopeptidase activity and thiol-dependent deubiquitinase |
USP17L7 | Textmining | Predicted to enable cysteine-type endopeptidase activity and thiol-dependent deubiquitinase |
Post-translational modifications for human PRR23A include: phosphorylation, [29] [30] [31] [32] acetylation, [33] myristoylation, [32] sulfonation, [34] SUMOylation, [35] and glycosylation. [36] [37] The glycosylation site supports the identified transmembrane region and ER membrane subcellular localization of PRR23A since proteins that are glycosylated are typically membrane bound and expressed in the ER. [18] [21] [38]
There are 2 known paralogs of PRR23A: PRR23B and PRR23C. [9] [17] [39]
Protein | Accession Number | Sequence Length (Amino Acids) | E-value | Sequence Identity to Human PRR23A Protein (%) | Sequence Similarity to Human PRR23A Protein (%) |
---|---|---|---|---|---|
PRR23A | NP_001128131 | 266 | 0 | 100 | 100 |
PRR23B | NP_001013672 | 265 | 5e-146 | 89.8 | 91.4 |
PRR23C | NP_001128129 | 262 | 1e-116 | 84.6 | 86.8 |
PRR23A orthologs are only found in placental mammals. [17] [39] No PRR23A orthologs have been identified in marsupials, monotremes, birds, reptiles, amphibians, fish, or invertebrates.
Genus and Species | Common Name | Taxonomic Group | Median Date of Divergence (Millions of Years Ago) | Accession Number | Sequence Length (Amino Acids) | Sequence Identity to Human Protein (%) | Sequence Similarity to Human Protein (%) |
---|---|---|---|---|---|---|---|
Homo sapiens | Human | Primates | 0 | NP_001128131 | 266 | 100 | 100 |
Macaca mulatta | Rhesus Monkey | Primates | 28.8 | XP_001113833 | 282 | 83.3 | 85.5 |
Mus musculus | Mouse | Rodentia | 87 | NP_001128132 | 258 | 41.3 | 53.1 |
Ochotona princeps | American Pika | Lagomorpha | 87 | XP_058513009 | 276 | 39.7 | 50.8 |
Pteropus alecto | Black Flying Fox | Chiroptera | 94 | XP_006907257 | 273 | 63 | 69.2 |
Sturnira hondurensis | Honduran Yellow-Shouldered Bat | Chiroptera | 94 | XP_036917059 | 270 | 56.9 | 64.1 |
Leptonychotes weddellii | Weddell Seal | Carnivora | 94 | XP_006734260 | 260 | 53.1 | 61.7 |
Canis lupus familiaris | Dog | Carnivora | 94 | XP_038288414 | 243 | 53 | 59.8 |
Neophocaena asiaeorientalis asiaeorientalis | Yangtze Finless Porpoise | Cetacea | 94 | XP_024606498 | 261 | 53 | 61.9 |
Hippopotamus amphibius kiboko | East African Hippopotamus | Hippopotamidae | 94 | XP_057593770 | 263 | 51.9 | 60.5 |
Equus caballus | Horse | Perissodactyla | 94 | XP_023477063 | 279 | 59.6 | 68.1 |
Camelus ferus | Wild Bactrian Camel | Tylopoda | 94 | XP_032343448 | 258 | 56.4 | 63.5 |
Phacochoerus africanus | Common Warthog | Suiformes | 94 | XP_047608137 | 262 | 47.3 | 57.4 |
Moschus berezovskii | Dwarf Musk Deer | Ruminantia | 94 | XP_055276284 | 291 | 45.2 | 54.2 |
Talpa occidentalis | Spanish Mole | Talpidae | 94 | XP_037382562 | 264 | 54.1 | 64.4 |
Sorex araneus | Common Shrew | Soricidae | 94 | XP_054980762 | 217 | 41.6 | 46.8 |
Erinaceus europaeus | European Hedgehog | Erinaceidae | 94 | XP_060039908 | 274 | 27.8 | 40.6 |
Elephas maximus indicus | Indian Elephant | Proboscidea | 99 | XP_049726169 | 235 | 53.4 | 58.6 |
Loxodonta africana | African Savanna Elephant | Proboscidea | 99 | XP_003420004 | 235 | 53.4 | 58.6 |
Orycteropus afer afer | Aardvark | Tubulidentata | 99 | XP_007945733 | 243 | 52.9 | 58 |
PRR23A first appeared within placental mammals, which evolved 78-129 million years ago. [39] [44] Placental mammals then began to diversify into the two major lineages of Atlantogenata and Boreoeutheria, which emerged 90-100 million years ago. [45] PRR23A orthologs can be found within both of these major lineages and within several of the subgroups that evolved from them. [39] [46] Despite PRR23A's recent emergence relative to the long span of evolutionary history, it is evolving at a very rapid rate. [47] [48] [19]
PRR23A has demonstrated gene expression within the testis through increased mRNA levels, and so have the other PRR23 family genes. [9] This expression indicates that PRR23A may have a role within the male reproductive system. The larger family of proline-rich proteins has a wide range of functions, including energy provision, antistress responses, calcium binding in saliva, structural support, and many others. [49] [50] One subgroup, the small proline-rich proteins (SPRRs), consists of antimicrobial proteins that directly disrupt bacterial membranes. [51]
Epigenetic modifications of PRR23A have been shown to impact maternal early-pregnancy serum ferritin concentrations. [12] Two CpG sites within human PRR23A have been identified: cg02806645 and cg06322988. When these sites are methylated, a decrease in serum ferritin concentrations during early pregnancy was observed. Low levels of ferritin are a sign of iron deficiency, which is especially important to monitor during pregnancy. [52] Therefore, decreased expression of PRR23A through methylation silencing is associated with iron deficiency.