HSH2D

Hematopoietic SH2 domain-containing protein (HSH2D) is a protein that is encoded by the hematopoietic SH2 domain containing (HSH2D) gene.

Gene

HSH2D is located on the short arm of chromosome 19 at 19p13.11, spanning positions 16,134,028 to 16,158,575. Common aliases of the gene include HSH2 (Hematopoietic SH2 Protein) and ALX (Adaptor in Lymphocytes of Unknown Function X). The gene encodes two main isoforms; isoform 1, the longer of the two, contains seven exons.
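
As a quick check on the coordinates given above, the length of the gene span can be computed directly. This is a minimal sketch that assumes the two positions are 1-based, inclusive coordinates on chromosome 19, which the source does not state; the function and variable names are illustrative only.

```python
# Coordinates quoted in the text; assumed here to be 1-based and inclusive.
GENE_START = 16_134_028
GENE_END = 16_158_575


def span_length(start: int, end: int) -> int:
    """Return the number of bases covered by an inclusive interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_length = span_length(GENE_START, GENE_END)
print(f"HSH2D spans about {gene_length:,} bp ({gene_length / 1000:.1f} kb)")
```

Under that assumption the gene covers roughly 24.5 kb of genomic sequence.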

mRNA

Two main isoforms of HSH2D exist. Isoform 1 has seven exons and an mRNA 2,403 bp in length. Isoform 2 has six exons and an mRNA 2,936 bp in length. Although isoform 2 has the longer mRNA, it encodes the shorter protein: it has a variant 5' UTR and a different start codon, which gives it a shorter N-terminus. [1] The mRNA has a short 5' UTR and a long 3' UTR.

Protein

The protein has a molecular weight of 39.0 kilodaltons (kDa) and an isoelectric point (pI) of 6.678. [2] The main feature of the protein is the SH2 (Src homology 2) domain, a region that binds phosphotyrosine-containing sequences and is found in many signaling molecules. [3] This domain spans residues 26-127.
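
The figures above can be related with some simple arithmetic. The sketch below computes the length of the SH2 domain from its residue range and estimates the total number of residues from the molecular weight, using the common rule of thumb of about 110 Da per amino acid residue. That average mass, the inclusive residue numbering, and all names in the code are assumptions for illustration rather than values taken from the cited sources.

```python
# Rule-of-thumb average mass of one amino acid residue (an assumption).
AVG_RESIDUE_MASS_DA = 110.0

MOLECULAR_WEIGHT_DA = 39_000.0  # 39.0 kDa, from the text
SH2_START, SH2_END = 26, 127    # SH2 domain residue range, from the text


def domain_length(start: int, end: int) -> int:
    """Number of residues in an inclusive residue range."""
    return end - start + 1


def estimated_residue_count(mass_da: float, avg_mass_da: float) -> int:
    """Rough residue count implied by a molecular weight."""
    return round(mass_da / avg_mass_da)


sh2_length = domain_length(SH2_START, SH2_END)
approx_residues = estimated_residue_count(MOLECULAR_WEIGHT_DA, AVG_RESIDUE_MASS_DA)
sh2_fraction = sh2_length / approx_residues

print(f"SH2 domain length: {sh2_length} residues")
print(f"Estimated protein length: ~{approx_residues} residues")
print(f"SH2 domain share of the protein: ~{sh2_fraction:.0%}")
```

On this rough estimate, the SH2 domain would account for a little under a third of the protein.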

The secondary structure of the protein contains a helical section around residues 40-50, a sheet between residues 60-70, helices between residues 100-110, 135-145, 175-180, and 200-225, and additional sheets between residues 235-240 and 295-300. These elements are shown in the figure at the bottom of the section, where helices are purple arrows and sheets are red arrows. The protein has several sites of post-translational modification, particularly phosphorylation and GalNAc O-glycosylation; the latter has been shown to play a role in cancers. [4]

Predicted 3D structure of the HSH2D protein

The tertiary structure of the protein has not been confirmed experimentally; however, predictions made with the I-TASSER [5] software are useful for visualizing the protein.

Expression

Based on NCBI GEO [6] expression profiles and EST analyses, expression of the protein appears to be limited to a subset of human tissues. It is highly expressed in bone marrow, CD4+ and CD8+ T cells, lymph node, mammary gland, spleen, stomach, thyroid, and small intestine. Expression is elevated in early T-cell precursor acute lymphoblastic leukemia [6] and reduced in breast cancer cells treated with estrogen, suggesting an interaction between the protein and estrogen.

Function

The function of the HSH2D protein is still not fully understood; however, it has been shown to play a role in various cellular processes, including apoptosis, wound healing, processes involving vascular endothelial growth factors, membrane-associated intracellular trafficking, biogenesis of lipid droplets, and collagen remodeling. [7] It is also thought to play a role in T-cell activation. [8]

Interacting proteins

HSH2D interacts with several proto-oncogene products, including FES proto-oncogene (FES) and CRK proto-oncogene (CRK). It also has suspected interactions with other proteins, such as tyrosine kinase non-receptor 2 (TNK2), PTEN-induced putative kinase (PINK1), and interleukin 2 (IL2). [9] A summary of these proteins and their suspected functions is shown below.

| Name | NCBI Accession Number | Function |
| --- | --- | --- |
| FES proto-oncogene (FES) | NP_001996.1 | Hematopoiesis, growth factor and cytokine receptor signaling. |
| CRK proto-oncogene (CRK) | NP_058431.2 | Adaptor that binds to tyrosine-phosphorylated proteins. Has SH2 and SH3 domains. |
| Tyrosine kinase non-receptor 2 (TNK2) | NP_005772.3 | Tyrosine kinase which may be linked to tyrosine phosphorylation signal transduction pathways. [10] |
| PTEN-induced putative kinase (PINK1) | NP_115785.1 | Serine/threonine protein kinase. |
| Interleukin 2 (IL2) | NP_000577.2 | Cytokine important for T- and B-cell proliferation. [11] |

Clinical significance

The HSH2D protein has been studied alongside other human genes predicted to be involved in the immune system. HSH2D was found to be highly expressed in patients with ulcerative colitis. [12] The protein is also associated with alpha-interferon activity. [13]

Homology

HSH2D has four distant paralogs and several orthologs in other species that have high levels of conservation.

Paralogs

The four paralogs of HSH2D in humans are other proteins containing SH2 domains. They do not have a high level of conservation outside this domain. All paralogs were found through GeneCards. [14]

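| Name | NCBI Accession Number | Sequence Length (Amino Acids) | Sequence Similarity | Sequence Identity |
| --- | --- | --- | --- | --- |
| SH2D2A | NP_001154913.1 | 399 | 29% | 21.7% |
| SH2D7 | NP_001094874.1 | 451 | 33.1% | 25.9% |
| SH2D4A | NP_001167630.1 | 454 | 12% | 17.4% |
| SH2D4B | NP_997255.2 | 357 | 33.7% | 18.1% |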

Orthologs

HSH2D has orthologous proteins that span several orders of species. The protein is well conserved across mammals, as well as a few reptiles, amphibians, and invertebrates. The following list is not exhaustive; rather, it shows the wide range of organisms in which the protein may be found. All orthologous proteins were found with the BLAST [15] or BLAT [16] programs.

| Scientific Name | Common Name | Order | NCBI Accession Number | Sequence Length (Amino Acids) | Sequence Identity | Sequence Similarity |
| --- | --- | --- | --- | --- | --- | --- |
| Pan troglodytes | Chimpanzee | Primates | NP_001229302.1 | 352 | 99% | 99.70% |
| Heterocephalus glaber | Naked mole rat | Rodentia | EHB15865.1 | 324 | 54% | 63.40% |
| Hipposideros armiger | Great roundleaf bat | Chiroptera | XP_019497370.1 | 360 | 67% | 76.10% |
| Condylura cristata | Star-nosed mole | Soricomorpha | XP_004688256.1 | 355 | 62% | 72.10% |
| Camelus dromedarius | Dromedary camel | Cetartiodactyla | XP_010993355.1 | 360 | 66% | 75.70% |
| Panthera pardus | Leopard | Carnivora | XP_019271712.1 | 360 | 62% | 71.40% |
| Meleagris gallopavo | Wild turkey | Bird | XP_010723595.1 | 326 | 23% | 28.80% |
| Anolis carolinensis | Carolina anole | Reptile | XP_016854511.1 | 567 | 22% | 31.50% |
| Xenopus tropicalis | Western clawed frog | Amphibian | XP_012809627.1 | 363 | 31% | 42.80% |
| Callorhinchus milii | Australian ghostshark | Fish | XP_007899329.1 | 500 | 26% | 33.10% |
| Lingula anatina | Lingula | Invertebrate | XP_013404014.1 | 187 | 18% | 23.40% |
| Biomphalaria glabrata | N/A | Invertebrate | XP_013080865.1 | 818 | 12% | 10.10% |
| Salpingoeca rosetta | N/A | Protista | XP_004995081.1 | 481 | 17.70% | 10.90% |
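
Identity and similarity percentages like those tabulated above are typically derived from a pairwise sequence alignment. The sketch below shows one common way to compute them from two already-aligned sequences: identity counts alignment columns holding the same residue, and similarity also counts columns whose residues fall in the same physicochemical group. The residue grouping, the use of the full alignment length as the denominator, and the toy sequences are all assumptions for illustration; they are not the method or data behind the table, and alignment tools usually score similarity with a substitution matrix such as BLOSUM62 instead.

```python
# Hypothetical residue groups used to call two different amino acids "similar".
SIMILAR_GROUPS = [
    set("AVLIM"),  # aliphatic / hydrophobic
    set("FWY"),    # aromatic
    set("KRH"),    # basic
    set("DE"),     # acidic
    set("STNQ"),   # polar, uncharged
]


def residues_similar(a: str, b: str) -> bool:
    """True if two residues are identical or share a group above."""
    return a == b or any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (percent identity, percent similarity) over all alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")

    identical = 0
    similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns count toward the length but never match
        if a == b:
            identical += 1
        if residues_similar(a, b):
            similar += 1

    length = len(aligned_a)
    return 100 * identical / length, 100 * similar / length


# Toy aligned fragments, not real HSH2D sequence.
seq_a = "MKV-LDEF"
seq_b = "MRVALDDY"
identity, similarity = identity_and_similarity(seq_a, seq_b)
print(f"identity = {identity:.1f}%, similarity = {similarity:.1f}%")
```

With this definition every identical column also counts as similar, so similarity is never lower than identity. Other tools may use a different denominator, such as the length of the shorter sequence, which changes the reported percentages.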

References

  1. "hematopoietic SH2 domain containing [ Homo sapiens (human) ]". National Center for Biotechnology Information. 2017-04-22.
  2. Brendel, V.; Bucher, P.; Nourbakhsh, I. R.; Blaisdell, B. E.; Karlin, S. (1992). "Methods and algorithms for statistical analysis of protein sequences". Proceedings of the National Academy of Sciences of the United States of America. 89 (6): 2002–2006.
  3. Filippakopoulos, Panagis (December 2009). "SH2 domains: modulators of nonreceptor tyrosine kinase activity". Current Opinion in Structural Biology. 19 (6): 643–649. doi:10.1016/j.sbi.2009.10.001. PMC 2791838. PMID 19926274.
  4. Gill, D. J. (March 2011). "Location, location, location: New insights into O-GalNAc protein glycosylation". Trends in Cell Biology. 21 (3): 149–158. doi:10.1016/j.tcb.2010.11.004. PMID 21145746.
  5. "I-TASSER".
  6. "NCBI GEO".
  7. Mackintosh, C. G. (2016). "SOLiD SAGE sequencing shows differential gene expression in jejunal lymph node samples of resistant and susceptible red deer (Cervus elaphus) challenged with Mycobacterium avium subsp. paratuberculosis". Veterinary Immunology and Immunopathology. 169: 102–110. doi:10.1016/j.vetimm.2015.10.009. PMID 26620077.
  8. Tatebe, Ken (2010). "Response network analysis of differential gene expression in human epithelial lung cells during avian influenza infections". BMC Bioinformatics. 11: 170. doi:10.1186/1471-2105-11-170. PMC 2868837. PMID 20370926.
  9. "STRING: Protein-Protein Interaction Network".
  10. "TNK2 tyrosine kinase non receptor 2".
  11. "IL2 Interleukin 2".
  12. Clark, Peter (2012). "Bioinformatics analysis reveals transcriptome and microRNA signatures and drug repositioning targets for IBD and other autoimmune diseases". Inflammatory Bowel Diseases. 18 (12): 2315–2333. doi:10.1002/ibd.22958. PMID 22488912. S2CID 4629313.
  13. Schmeisser, H. (2010). "Identification of Alpha Interferon-Induced Genes Associated with Antiviral Activity in Daudi Cells and Characterization of IFIT3 as a Novel Antiviral Gene". Journal of Virology. 84 (20): 10671–10680. doi:10.1128/jvi.00818-10. PMC 2950578. PMID 20686046.
  14. "HSH2D Gene".
  15. "NCBI BLAST".
  16. "UCSC BLAT Genome Search".