ISLR | |
---|---|
Aliases | ISLR, HsT17563, Meflin, immunoglobulin superfamily containing leucine-rich repeat |
External IDs | OMIM: 602059; MGI: 1349645; HomoloGene: 4050; GeneCards: ISLR |
In humans, the immunoglobulin superfamily containing leucine-rich repeat (ISLR) protein is encoded by the ISLR gene. [5] RNA-seq studies show that ISLR is highly expressed in the endometrium and ovary and is also expressed in 25 other tissues. [6] The protein localizes to the cytoplasm, [7] plasma membrane, [8] extracellular exosome, [9] and platelet alpha granule lumen. [5] It is also known to play a role in platelet degranulation, [5] cell adhesion, [10] and the response to elevated platelet cytosolic Ca2+. [11]
The aliases for ISLR are Meflin, HsT17563, and mesenchymal stromal-cell and fibroblast-expressing Linx paralogue. [5] The gene is part of the I-set family. [5]
The most recent annotation places the gene at 74,173,710 to 74,176,871 bp (a span of 3,161 bp) on the plus strand of chromosome 15, at position 15q24.1. [5] The gene contains 3 exons and 4 distinct introns. [5] [12]
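As a quick check of the quoted span, the sketch below recomputes the length from the two coordinates given above. The function name and the `inclusive` flag are illustrative assumptions; the 3,161 bp figure corresponds to the plain difference between the coordinates, while counting both end positions would give one more.

```python
# Coordinates of ISLR on chromosome 15 as quoted in the text.
GENE_START = 74_173_710
GENE_END = 74_176_871


def span_length(start: int, end: int, inclusive: bool = False) -> int:
    """Distance between two coordinates on the same strand.

    With inclusive=True both end positions are counted
    (a 1-based closed interval), which adds one to the result.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


print(span_length(GENE_START, GENE_END))                  # 3161
print(span_length(GENE_START, GENE_END, inclusive=True))  # 3162
```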
The ISLR gene has two known transcript isoforms on the plus strand: ISLR transcript variant 1 [13] and ISLR transcript variant 2. [14] Both variants encode the same protein. [15]
Transcript variant 1 is the longer transcript, at 2,331 bp, and contains 2 exons. [13]
Transcript variant 2 is the shorter transcript, at 2,128 bp, and contains 2 exons. [14] It differs from variant 1 in the 5' UTR. [14]
The human ISLR gene has two alternatively spliced transcripts that encode identical protein isoforms. [11] The domains of the ISLR protein are as follows: LRR_8 (leucine-rich repeat), LRR_RI (ribonuclease inhibitor), PCC (polycystin cation channel protein) superfamily, and Ig (immunoglobulin). [16]
The predicted isoelectric point of the unmodified ISLR protein is 5.3. [12] The calculated molecular mass is 46.0 kDa. [10]
The human ISLR protein has 428 amino acids (aa). [17] According to the Statistical Analysis of Protein Sequences (SAPS) tool, most amino acid residues occur at about their average frequency among human proteins, except leucine, which is unusually abundant. [18] This is expected because the protein contains multiple leucine-rich repeat (LRR) structural motifs. [5] Methionine is significantly underrepresented (predicted to be 0.5%). [18] Overall, positively charged residues outnumber negatively charged residues. [18]
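A composition analysis of this kind can be sketched in a few lines. The example below is not the SAPS algorithm; it only counts residues. The 21-residue fragment is pieced together from the three overlapping peptides in Table 2 and is used purely for illustration, so its percentages say nothing about the full 428 aa protein. The function names are assumptions.

```python
from collections import Counter

# Fragment assembled from the overlapping peptides in Table 2
# (illustration only; NOT the full ISLR sequence).
FRAGMENT = "LLGLAQACPEPCDCGEKYGFQ"


def composition(seq: str) -> dict:
    """Percentage of each residue (one-letter code) in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}


def charged_counts(seq: str) -> tuple:
    """Return (number of K + R, number of D + E)."""
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive, negative


for aa, pct in sorted(composition(FRAGMENT).items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {pct:.1f}%")
print(charged_counts(FRAGMENT))  # (1, 3) for this short fragment
```

Running the same two functions on the full protein sequence would give the leucine and methionine percentages that a tool such as SAPS compares against its reference distribution.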
The SAPS tool predicted two identical four-residue repeats that fall within LRR structural motifs: LSHL at 97-100 aa and 172-175 aa. [18]
The SAPS tool predicted one high-scoring transmembrane segment from 411 to 428 aa (length 18) with a pocket from 417 to 418 aa. [18] The Phobius prediction for the ISLR protein sequence also indicated a potential transmembrane domain and a signal peptide (Figure 2). [19] SignalP-5.0 predicted a signal peptide with a likelihood of 0.9989 and a cleavage site between positions 18 and 19 with a probability of 0.9146. [20]
Phosphorylation Sites
NetPhos predicts 31 phosphorylation sites in the human ISLR protein sequence. [21] The results were filtered to show the best prediction for each residue and cover serine, threonine, and tyrosine.
Through Eukaryotic Linear Motif (ELM) resource predictions, eight distinct phosphorylation sites were identified for the protein: [22]
Name | Position (aa) | Cell Compartment(s) |
---|---|---|
N-degron [31] | 1-3 | Cytosol |
Endosome-Lysosome-Basolateral sorting signals [32] | 2-7 | Cytosol, Endocytic vesicle |
Nuclear Export Signal (NES) [33] | 3-17 | Nucleus, cytosol |
Phosphotyrosine ligands bound by Src Homology 2 (SH2) domains [34] | 238-241 | Cytosol |
Class IV WW domain ligands [35] | 343-348 | Cytosol, nucleus |
Cyclin-dependent kinase subunit 1 (Cks1) ligand [36] | 344-349 | Cytosol, nucleus |
TNFR-associated factors 6 | 345-353 | Cytosol |
Peptide Amidation Site [38] | 355-358 | Extracellular, secretory granule |
di-Arg retention/retrieving signal [39] | 357-359 | Endoplasmic reticulum membrane, ER-Golgi transport vesicle membrane, rough endoplasmic reticulum, endoplasmic reticulum cisterna, cytosol, integral protein |
N-arginine dibasic (NRD) cleavage site [40] | 357-359 | Extracellular, Golgi apparatus, cell surface |
Src Homology 3 (SH3) ligand [41] | 375-381 | Plasma membrane, focal adhesion, cytosol |
Glycosaminoglycan attachment site [42] | 379-381 | Extracellular, Golgi apparatus |
STAT5 SH2 domain binding motif [43] | 370-373, 389-392 | Cytosol |
PDZ domain ligands [44] | 423-428 | Cytosol, internal side of plasma membrane |
Pex14 ligand motif [45] | 424-428 | Cytosol, peroxisome, glycosome |
Table 1. Results of the ELM motif search after context, structural, and globular domain filtering, with an acceptable structural score (above the medium threshold score). [22] In total, 23 post-translational modifications, including phosphorylation sites, were identified in the human ISLR protein.
Glycosylation Sites
An annotation from the Simple Modular Architecture Research Tool (SMART) [46] predicted three N-linked glycosylation sites at 51 aa, 60 aa, and 309 aa (red circles in Figure 3, from left to right). The LRR structural motifs and the immunoglobulin C-2 type (IGc2) domain are also shown in Figure 3.
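Candidate N-linked glycosylation sites are commonly located by scanning for the sequon N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline). The sketch below applies that consensus pattern with a regular expression. It is not the method SMART uses, and the toy sequence is hypothetical rather than taken from ISLR.

```python
import re

# Consensus N-glycosylation sequon N-{P}-[ST]-{P}; the lookahead
# lets overlapping candidates be reported as well.
SEQUON = re.compile(r"(?=N[^P][ST][^P])")


def find_sequons(seq: str) -> list:
    """Return 1-based positions of the asparagine in each sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


# Hypothetical toy sequence, not the ISLR protein.
toy = "MNGTAANPSLNKSQ"
print(find_sequons(toy))  # [2, 11]; N7 is skipped because P follows it
```

A match only marks a candidate site; whether it is actually glycosylated depends on context such as membrane topology and accessibility.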
Amidation Sites
MyHits, which investigates relationships between protein sequences and motifs, [47] confidently predicted an amidation site motif at position 355-358 aa.
Palmitoylation Sites
Three palmitoylation sites were predicted (Table 2).
Position | Peptide | Score | Cutoff |
---|---|---|---|
19 | LLGLAQACPEPCDCG | 21.194 | 3.717 |
23 | AQACPEPCDCGEKYG | 13.22 | 10.722 |
25 | ACPEPCDCGEKYGFQ | 4.463 | 3.717 |
Table 2. CSS-Palm [48] results for the human ISLR protein.
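The rows of Table 2 can be read as a simple threshold test: a site is reported when its score exceeds the cutoff listed for it. The sketch below encodes the three rows and ranks them by how far each score clears its cutoff. The class name and the `margin` property are illustrative assumptions and are not part of CSS-Palm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PalmSite:
    position: int   # residue number of the candidate cysteine
    peptide: str    # 15-residue window from Table 2
    score: float
    cutoff: float

    @property
    def margin(self) -> float:
        """How far the score lies above its cutoff."""
        return self.score - self.cutoff


# Values copied from Table 2.
SITES = [
    PalmSite(19, "LLGLAQACPEPCDCG", 21.194, 3.717),
    PalmSite(23, "AQACPEPCDCGEKYG", 13.22, 10.722),
    PalmSite(25, "ACPEPCDCGEKYGFQ", 4.463, 3.717),
]

passing = [s for s in SITES if s.score > s.cutoff]
ranked = sorted(passing, key=lambda s: s.margin, reverse=True)

for site in ranked:
    # In these three rows the candidate cysteine sits at the
    # centre (index 7) of the 15-residue window.
    print(site.position, site.peptide[7], round(site.margin, 3))
```

All three sites pass their cutoffs; position 19 clears its cutoff by the widest margin and position 25 by the narrowest.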
GPI-Modifications
The big-PI predictor [49] found one glycosylphosphatidylinositol (GPI) modification site at position 401 aa (best site) with a P-value of 1.71e-03. [50]
Of all the predicted beta sheets, four stretches at 253-260 aa, 265-272 aa, 323-331 aa, and 335-346 aa were identified with high confidence using CFSSP [51] and Phyre2. [52] Three alpha helices, at 5-15 aa, 189-195 aa, and 214-216 aa, were likewise identified with high confidence. A tertiary model of the human ISLR protein predicted by I-TASSER [53] shows a combination of alpha helices and beta sheets (Figure 1). In the I-TASSER secondary structure prediction, the locations of the four beta sheets and three alpha helices confirm the high-confidence predictions made by CFSSP and Phyre2.
Based on PSORT II predictions, the human ISLR protein is expected to localize throughout the cell, including the extracellular region. [54] Reinhardt's method for cytoplasmic/nuclear discrimination predicted the protein to be cytoplasmic with a reliability of 76.7. Additionally, polyclonal antibody staining of immunohistochemically stained human tissues showed ISLR localized to the cytoplasm in myocytes, glandular cells, skin, and hepatocytes. [7] Immunofluorescent staining of the human fibroblast cell line BJ with an ISLR polyclonal antibody also showed localization to the plasma membrane. [55]
In humans, RNA-seq was conducted on tissue samples from 95 individuals representing 27 different tissues to determine the tissue specificity of all protein-coding genes. [56] ISLR was highly expressed in the endometrium and ovary and visibly expressed in 25 other tissues. Another study, which sequenced total RNA from 20 human tissues, demonstrated high expression of ISLR in the uterus. [57] A study of tissue-specific circular RNA induction during human fetal development showed steady expression of ISLR throughout development, with a sharp increase at 10 weeks in the stomach. [58] Expression in the stomach remained notably high through 20 weeks.
In situ hybridization on a sagittal section of a 56-day-old male mouse brain demonstrated expression in the olfactory areas and hippocampal formation (Figure 4). [59]
According to the Protein Abundance Database (PAXdb 4.1), the human ISLR protein has high abundance (ppm value > 1) relative to the whole organism. [60]
Expression profiling by microarray of ISLR in female human subjects demonstrated overexpression of ISLR in breast lipotransfer white adipose tissue CD34+ cells and significantly lower expression in leukapheresis CD34+ cells. [61]
Expression profiling by microarray of ISLR in human subjects demonstrated overexpression in non-union skeletal fractures compared to low expression in normal fractures. [62]
Expression profiling by microarray of ISLR in obese female human subjects demonstrated consistent low expression of ISLR in subjects that followed a short-term low-fat hypocaloric diet. [63]
There is one promoter region in the ISLR gene with a predicted length of 1,912 bp (Figure 5) extracted from Genomatix. [64] Additionally, there is a polyadenylation signal at 3,142 bp in the ISLR nucleotide sequence (humans). [65]
Genomatix predicts that six distinct transcription factor types bind to the ISLR promoter region: SMAD (two factors), sine oculis homeobox (SIX), heat shock factor (HSF), PRDM, Snail, and cell cycle genes homology region (CHR). [64]
Genomatix also predicted additional transcription factor binding sites in ISLR with the highest matrix similarity (0.97-0.99).
The human ISLR gene is predicted to be targeted by 85 miRNAs in miRDB. [66] The top-scoring (>88) miRNAs are hsa-miR-5197-3p, hsa-miR-4688, hsa-miR-3150a-3p, hsa-miR-16-5p, hsa-miR-195-5p, hsa-miR-15a-5p, and hsa-miR-6763-5p.
RBPmap, [67] which maps predicted binding sites of RNA-binding proteins, showed multiple evolutionarily conserved motifs in the human ISLR mRNA transcript variant 1 sequence. [13]
ISLR currently has one known paralog in humans, ISLR 2, [68] and two paralogous domains: LRRN4 (protein precursor 4) and LRRN4CL (protein precursor 4 C-terminal like). [11]
As of August 2020, there are more than 190 known orthologs of the human ISLR gene, [69] with the most distant ortholog and homolog found in Exaiptasia pallida (sea anemone). [70] The table below compares the human ISLR protein with selected orthologs, ranging from the most closely related to Homo sapiens to the most distant.
Species 1 | Species 2 | Common Name | Taxonomic Group | Accession Number | Date of Divergence (million years ago [MYA]) | Sequence Length (aa) | Protein Percent Identity | Protein Sequence Similarity |
---|---|---|---|---|---|---|---|---|
Human vs. | Homo sapiens | Human | Primates | BAA85970.1 | 0 | 428 | 100% | 100% |
Human vs. | Macaca fascicularis | Crab-eating macaque | Primates | XP_005560108.1 | 29 | 428 | 98.33% | 99.00% |
Human vs. | Tursiops truncatus | Bottlenose dolphin | Cetacea | XP_033707870 | 96 | 428 | 91.58% | 94.00% |
Human vs. | Mus musculus | House mouse | Rodentia | BAA85973.1 | 160 | 428 | 88.24% | 91.00% |
Human vs. | Myotis brandtii | Brandt's bat | Chiroptera | XP_005882850.1 | 96 | 422 | 85.68% | 89.00% |
Human vs. | Monodelphis domestica | Gray short-tailed opossum | Didelphimorphia | XP_007478205 | 180 | 418 | 75.66% | 83.00% |
Human vs. | Ornithorhynchus anatinus | Platypus | Monotremata | XP_007663289.2 | 177 | 417 | 61.29% | 72.00% |
Human vs. | Lacerta agilis | Sand lizard | Squamata | XP_033016939.1 | 312 | 418 | 48.90% | 58.00% |
Human vs. | Apteryx rowi | Ōkārito kiwi | Apterygiformes | XP_025924151.1 | 318 | 429 | 48.03% | 62.00% |
Human vs. | Haliaeetus leucocephalus | Bald eagle | Accipitriformes | XP_010569899 | 312 | 416 | 48.01% | 61.00% |
Human vs. | Exaiptasia pallida | Exaiptasia | Actiniaria | KXJ26782.1 | 824 | 304 | 29.89% | 46.00% |
Human vs. | Bactrocera dorsalis | Oriental fruit fly | Diptera | JAC38616.1 | 797 | 326 | 26.43% | 48.00% |
Human vs. | Fopius arisanus | Wasp (a parasitic type) | Hymenopterans | JAG75735.1 | 797 | 713 | 27.25% | 42.00% |
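The percent identity column is derived from a pairwise alignment of each ortholog against the human protein. The sketch below shows one common way to compute it from two already-aligned sequences. The choice of denominator (all aligned columns, gaps included) is an assumption; alignment tools differ on this point, so the values in the table may have been calculated differently. The two aligned strings are hypothetical.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Identical, non-gap columns are counted as matches and divided
    by the number of aligned columns (columns where both rows are
    gaps are ignored).
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y) for x, y in zip(aln_a, aln_b) if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100 * matches / len(columns)


# Hypothetical aligned fragments, not real ISLR orthologs.
human = "ACDE-FGH"
other = "ACDQKFGH"
print(percent_identity(human, other))  # 75.0
```

Sequence similarity, the last column of the table, would additionally count conservative substitutions as matches using a scoring matrix, which this sketch does not attempt.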
PSICQUIC View [71] returns a total of 284 results for human ISLR, showing that it binds numerous distinct proteins. iRefIndex showed 97 total results with multiple physical association interactions, such as ISLR with Rho GTP-family (RHOBTB3), BMP7, sphingosine-1-phosphate lyase (SGPL1), carnitine-acylcarnitine translocase (SLC25A20), canopy FGF signaling regulator 3 (CNPY3), and leishmanolysin-like peptidase (LMLN). [72] The physical associations were identified with a two-hybrid pooling approach, affinity chromatography technology, enzymatic study, or anti-tag coimmunoprecipitation. Overall, the iRefIndex results suggest that ISLR is involved in various mechanisms such as cell migration, transport of different complexes, and metabolism (enzymatic mechanisms). For example, RHOBTB3 is involved in transporting different complexes along pathways such as endosome to trans-Golgi network and Golgi to ER. [73] Furthermore, LMLN has been shown to play a role in cell migration and potentially in mitotic progression. [74] In terms of metabolism, SGPL1 is involved in the metabolism of sphingolipids. [75]
Among the iRefIndex results [72] that indicated physical associations between ISLR and other proteins, two interactions were identified with distinct strains of coronavirus. The orf1ab polyprotein of human coronavirus strain HKU1 (HCoV-HKU1) was shown in physical association with ISLR through the two-hybrid pooling approach, where ISLR is the "prey" and the orf1ab polyprotein is the "bait". [76]
A second interaction, between ISLR and human SARS coronavirus, was detected as direct contact using the two-hybrid pooling approach. [76]
Delivery of an ISLR-expressing lentivirus into the tumor stroma suppressed tumor growth in pancreatic ductal adenocarcinoma (PDAC). [77] In PDAC, low expression of ISLR (Meflin) was associated with aggressive tumors, characterized by straight collagen fibers in the stroma. [77] Regarding tumorigenesis in patients with inflammatory bowel disease (IBD), a study investigated the Hippo signaling pathway in the regeneration of intestinal epithelial cells. [78] ETS1, an oncogenic transcription factor in stromal cells, induced expression of the ISLR protein, which inhibited Hippo signaling and thereby promoted intestinal regeneration. [78] In mice, deletion of ISLR in stromal cells suppressed tumorigenesis in the intestine. [78] For the ISLR 2 paralog, a study of a multiplex consanguineous family found that congenital hydrocephalus, arthrogryposis, and abdominal distension are associated with an autosomal recessive knockout of ISLR 2. [79] ISLR 2 encodes a protein that plays a role in axon guidance during brain development, suggesting potential links to certain congenital neurological disorders. [79]