Hp53int1


Human protein 53 intron 1 (Hp53int1) is a protein encoded by the Hp53int1 gene in humans.

Gene

Figure 1: The location of the Hp53int1 gene on chromosome 17p13, region 7,685,260 bp-7,686,371 bp.

The Hp53int1 gene is located on chromosome 17p13 and is encoded by a DNA sequence 1125 base pairs in length, covering region 7,685,260 bp-7,686,371 bp. [1] The gene has two aliases, WRAP53int1 and TP53int1, which reflect its overlap with the WRAP53 and TP53 genes on chromosome 17. [2] Hp53int1 is located downstream of the p53p2 start site. [3]
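As a quick arithmetic check on the coordinates quoted above, the following minimal sketch computes the span of the region. It assumes the coordinates are 1-based and inclusive, which the source does not state, and the helper name `span_length` is purely illustrative. Under that assumption the interval covers 1,112 bp, slightly less than the 1,125 bp quoted for the sequence; the source does not explain the difference.

```python
def span_length(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates and sequence length as quoted in the text.
region_start, region_end = 7_685_260, 7_686_371
quoted_length_bp = 1125

region_bp = span_length(region_start, region_end)
print(region_bp)                     # 1112
print(quoted_length_bp - region_bp)  # 13
```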

Hp53int1 consists of a single exon and therefore has no splice isoforms.

Relationship to TP53

The Hp53int1 gene is transcribed in the same direction as TP53 and is present in the TP53 protein-rich myeloid leukemia cell lines HL-60 and U937. [4] This suggests a close relationship to the TP53 gene and its protein, including shared transcription factors, promoter signals, tissue expression, and subcellular localization. While the two genes are not identical, this relationship may give clues to the function, structure, and expression of Hp53int1.

mRNA transcript

5' UTR analysis: The consensus folding of the 5' UTR of Hp53int1. Two loops are formed from nucleotides 1-9 and 12-21 respectively. The connecting nucleotides of both loops are rich in G's and C's.

Hp53int1 is encoded by a polyadenylated transcript 1125 base pairs in length. There is a repeat sequence between base pairs 633 and 926 and a regulatory poly(A) tail sequence between base pairs 496 and 1000. [1] The repeat sequence is noted to be similar to the Alu Sc subfamily repeat. Alu elements are the most abundant repeat sequences in humans and other primates, and this subfamily likely diverged from other Alu subfamilies around 32 million years ago. [6]
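The two annotated features can be compared with simple interval arithmetic. The sketch below is illustrative only: it assumes the positions are 1-based and inclusive on the 1125 bp transcript, and the function and variable names are hypothetical. Under that assumption the repeat is 294 bp long and lies entirely within the annotated poly(A) region.

```python
def feature_length(feature):
    """Length of a (start, end) feature, assuming 1-based inclusive positions."""
    start, end = feature
    return end - start + 1


def overlap(a, b):
    """Return the overlapping (start, end) of two features, or None if disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


# Positions as quoted in the text.
alu_like_repeat = (633, 926)
polya_region = (496, 1000)

print(feature_length(alu_like_repeat))         # 294
print(feature_length(polya_region))            # 505
print(overlap(alu_like_repeat, polya_region))  # (633, 926)
```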

Promoter sequence

The promoter sequence of the Hp53int1 gene was shown to be 540 base pairs in length. [7] The transcription factors SMAD2, SRY, and ETV7 had Basewise Conservation Scores of 2.99, 2.73, and 3.03 respectively, with a value of 4 indicating the highest conservation and -0.5 the lowest. KLF2 and KLF5 had scores of 3.73 each. The species compared were elephant, dog, rhesus macaque, and chicken.

Protein

The Hp53int1 protein has a molecular weight of 13.3 kDa and is 118 amino acids in length. [2] [8] It is a basic protein. [8] There is one highly conserved amino acid sequence between residues 102 and 110. [9] Serine sites are significantly more frequent than arginine and tyrosine sites. [10] There is a possible casein kinase II phosphorylation site at amino acids 22–25. [11] Casein kinase II phosphorylation is involved in cell proliferation. Three regions of disorder are predicted for the Hp53int1 protein. [10]
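A four-residue site such as the one at positions 22–25 is the kind of hit produced by scanning for the PROSITE casein kinase II consensus, [ST]-x(2)-[DE] (PS00006). The sketch below shows one way such a scan could be written; it is not the method used by the cited tool, the function name is illustrative, and the example sequence is a made-up toy peptide rather than the Hp53int1 sequence.

```python
import re

# PROSITE PS00006 consensus for casein kinase II sites: [ST]-x(2)-[DE].
# A lookahead is used so that overlapping candidate sites are all reported.
CK2_PATTERN = re.compile(r"(?=([ST]..[DE]))")


def find_ck2_sites(sequence):
    """Return (start, end, motif) tuples with 1-based inclusive positions."""
    return [
        (m.start() + 1, m.start() + 4, m.group(1))
        for m in CK2_PATTERN.finditer(sequence.upper())
    ]


# Hypothetical toy peptide, NOT the Hp53int1 protein sequence.
toy_peptide = "MKTAAEGSLLDPR"
print(find_ck2_sites(toy_peptide))
# [(3, 6, 'TAAE'), (8, 11, 'SLLD')]
```

Short consensus patterns like this match frequently by chance, so a hit is only a candidate site, consistent with the "possible" wording above.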

WebLogo results using a multiple sequence alignment between Hp53int1 and its orthologs.

Post-translational modifications

There is a possible O-β-GlcNAc site between amino acids 14–20. [13] There are no transmembrane domains or signal peptides.

There are conserved phosphorylation sites at S5, S15, and S20. [14]

Predictions for O-β-GlcNAc in the Hp53int1 protein given by YinOYang. The green bars represent the potential for O-β-GlcNAc attachment sites in Hp53int1, and must reach a value > 0.50 to be considered a likely site. There is one predicted attachment site for Hp53int1.

Structure

Secondary

Hp53int1 protein is predicted to contain one alpha helix and five beta sheets. [15] There are eight possible protein binding sites.

Tertiary

The predicted tertiary structure of the Hp53int1 protein as constructed by UniProt AlphaFold.

Subcellular localization

There is a 56.5% likelihood that the Hp53int1 protein is localized within the cytoplasm. [17]

Tissue expression

There is low tissue specificity for the Hp53int1 gene. [18] When comparing various samples of adult human tissues, there is high expression in the thymus, ovaries, lymph nodes, and white blood cells. Both RNA and protein expression scores were high in the proximal digestive tract, gastrointestinal tract, and male and female tissues. During fetal development between weeks 10 and 16, there is elevated expression in the lung and heart tissues.

Hp53int1 is differentially expressed under conditions that require cell proliferation or apoptosis, consistent with its association with the TP53 tumor suppressor gene. [19]

Homology and evolution

Paralogs

The Hp53int1 gene was found to have no paralogs. [1]

Orthologs

Using NCBI BLAST [1] sequence analysis and Clustal W [20], a multiple sequence alignment was constructed for Hp53int1 and the top twenty BLAST hits.

While the objective of this research was to find several orthologs across a variety of mammals, vertebrates, and invertebrates, results were found in only two groups: primates and bacteria. Furthermore, only one stretch of roughly 20 amino acids aligned directly between Hp53int1 and its orthologs. The primate genes are located on chromosome 13, while Hp53int1 is located on chromosome 17. [21] There is also evidence that the primate proteins localize to the nuclear envelope, while the evidence for Hp53int1 suggests it is in the cytoplasm. [17] Thus, it is reasonable to assume that the primate genes are not strict orthologs. A possible explanation for why strict Hp53int1 alignments are found only in bacteria is a crossing-over event shared between bacteria and a common ancestor of primates.
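How closely two rows of such an alignment agree is commonly summarised as percent identity. The sketch below shows one convention, counting identical residues over columns where neither sequence has a gap; other denominators are also in use, and nothing here is taken from the cited tools. The function name and the example strings are hypothetical and are not Hp53int1 data.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns in which neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, NOT taken from the Hp53int1 alignment.
print(percent_identity("MKT-AE", "MRTQAE"))  # 80.0
```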

The phylogenetic tree of the Hp53int1 distant orthologs in the radial format.

Function

The Hp53int1 protein is likely involved in the regulation of cellular proliferation and apoptosis. This is suggested by its interacting proteins, which include apoptosis and ubiquitination regulators; the transcription factors it shares with TP53 (especially SMAD2, SRY, and ETV7); its location downstream of the p53p2 start site; its casein kinase II phosphorylation site; and its ubiquitous expression across tissues. Hp53int1 gene expression is increased in environments where proteins that reduce cell proliferation are overexpressed, suggesting that it is required in cellular environments that need to arrest cell growth. [19] In addition, quantitative analysis shows that the general hp53 protein is absolutely required for the activation of the cellular response to DNA damage. [22]

Interacting proteins

Interacting proteins [23] identified for TP53 may also apply to Hp53int1 because of the chromosomal relationship between the two genes. The functions of these interacting proteins are directly involved in cell regulation (ubiquitination, transcriptional regulation, apoptosis, and protein-tyrosine kinase activity).

SNPs

There are two identified SNPs for Hp53int1. [7] SNP 15 falls at position 32 in a sequence of 43 residues located upstream of the largest ORF of the TP53 promoter. SNP 20 is located 45 bp downstream of the 3' end of the 1125 bp cDNA (Hp53int1).

Clinical significance

There is evidence that the Hp53int1 gene is involved in tumor suppression. Hp53int1 overlaps with TP53 exon 1 (TAD1). [24] Mutations within this region can result in alternative exon expression or incomplete splicing and a loss of tumor suppressor function. In osteosarcoma, rearrangements within TAD1 have been detected in ~20% of the cases involving incomplete splicing of TP53. These rearrangements are located across the full sequence of TP53 intron 1, but most of them cluster in a domain of the Hp53int1 transcript, suggesting that they may be facilitated by chromatin conformation within this locus. [25]

References

  1. "Hp53int1 human unknown protein". NCBI Protein BLAST. 17 January 1997. Retrieved 17 December 2022.
  2. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (October 1990). "Basic local alignment search tool". Journal of Molecular Biology. 215 (3): 403–410. doi:10.1016/s0022-2836(05)80360-2. PMID 2231712. S2CID 14441902.
  3. Reisman D, Loging WT, Rotter V, Almon E (December 1996). "A novel transcript encoded within the 10-kb first intron of the human p53 tumor suppressor gene (D17S2179E) is induced during differentiation of myeloid leukemia cells". Genomics. 38 (3): 364–370. doi:10.1006/geno.1996.0639. PMID 8975713.
  4. Ortiz-Cuaran S, Cox D, Villar S, Friesen MD, Durand G, Chabrier A, et al. (October 2013). "Association between TP53 R249S mutation and polymorphisms in TP53 intron 1 in hepatocellular carcinoma". Genes, Chromosomes & Cancer. 52 (10): 912–919. doi:10.1002/gcc.22086. PMID 23836507. S2CID 21587166.
  5. "UNAFold". www.unafold.org. Retrieved 2022-12-16.
  6. Price AL, Eskin E, Pevzner PA (November 2004). "Whole-genome analysis of Alu repeat elements reveals complex evolutionary history". Genome Research. 14 (11): 2245–2252. doi:10.1101/gr.2693004. PMC 525682. PMID 15520288.
  7. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D (June 2002). "The human genome browser at UCSC". Genome Research. 12 (6): 996–1006. doi:10.1101/gr.229102. PMC 186604. PMID 12045153.
  8. Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S (March 1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–2006. Bibcode:1992PNAS...89.2002B. doi:10.1073/pnas.89.6.2002. PMC 48584. PMID 1549558.
  9. Bioline (2016-10-09). "Clean-up using AMPure XP beads v1". doi:10.17504/protocols.io.f3ebqje. Retrieved 2022-12-16.
  10. "GPS 5.0 - Kinase-specific Phosphorylation Site Prediction". gps.biocuckoo.cn. Retrieved 2022-12-16.
  11. "Motif Scan". myhits.sib.swiss. Retrieved 2022-12-16.
  12. "WebLogo - Create Sequence Logos". weblogo.berkeley.edu. Retrieved 2022-12-16.
  13. "Services". healthtech.dtu.dk. Retrieved 2022-12-16.
  14. Hornbeck PV, Zhang B, Murray B, Kornhauser JM, Latham V, Skrzypek E (January 2015). "PhosphoSitePlus, 2014: mutations, PTMs and recalibrations". Nucleic Acids Research. 43 (Database issue): D512–D520. doi:10.1093/nar/gku1267. PMC 4383998. PMID 25514926.
  15. "Welcome to PredictProtein!". predictprotein.org. Retrieved 2022-12-16.
  16. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 2022-12-16.
  17. "PSORT Prediction Tool". PSORT II. Retrieved 2022-12-12.
  18. "TP53 protein expression summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2022-12-16.
  19. "Home - GEO Profiles - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2022-12-16.
  20. "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2022-12-16.
  21. Kent WJ (April 2002). "BLAT--the BLAST-like alignment tool". Genome Research. 12 (4): 656–664. doi:10.1101/gr.229202. PMC 187518. PMID 11932250.
  22. Gotoh T, Vila-Caballer M, Liu J, Schiffhauer S, Finkielstein CV (January 2015). Solomon MJ (ed.). "Association of the circadian factor Period 2 to p53 influences p53's function in DNA-damage signaling". Molecular Biology of the Cell. 26 (2): 359–372. doi:10.1091/mbc.E14-05-0994. PMC 4294682. PMID 25411341.
  23. "tp53 Interactants [Xenopus] - Xenbase Gene Catalog". www.xenbase.org. Retrieved 2022-12-16.
  24. Hainaut P, Pfeifer GP (November 2016). "Somatic TP53 Mutations in the Era of Genome Sequencing". Cold Spring Harbor Perspectives in Medicine. 6 (11): a026179. doi:10.1101/cshperspect.a026179. PMC 5088513. PMID 27503997.
  25. Chen X, Bahrami A, Pappo A, Easton J, Dalton J, Hedlund E, et al. (April 2014). "Recurrent somatic structural variations contribute to tumorigenesis in pediatric osteosarcoma". Cell Reports. 7 (1): 104–112. doi:10.1016/j.celrep.2014.03.003. PMC 4096827. PMID 24703847.