EVI5L

Identifiers
Aliases: EVI5L, ecotropic viral integration site 5 like
External IDs: HomoloGene: 71934; GeneCards: EVI5L
Orthologs (Human | Mouse)
RefSeq (mRNA): NM_001159944, NM_145245 | n/a
RefSeq (protein): NP_001153416, NP_660288 | n/a
Location (UCSC): Chr 19: 7.83 – 7.86 Mb | n/a
PubMed search: [2] | n/a

EVI5L (Ecotropic Viral Integration Site 5-Like) is a protein that in humans is encoded by the EVI5L gene. [3] EVI5L is a member of the Ras superfamily of monomeric guanine nucleotide-binding (G) proteins, and functions as a GTPase-activating protein (GAP) with broad specificity. [4] [5] In vitro measurements of Rab-GAP activity have shown that EVI5L has significant Rab2A- and Rab10-GAP activity. [6]

Gene

The EVI5L gene is 34,701 base pairs long and has an unprocessed mRNA that is 3,756 nucleotides in length. It consists of 19 exons that encode an 805-amino-acid protein. [7]

Relative location of EVI5L on the short arm of Chromosome 19

Locus

EVI5L is located on the short arm (p) of chromosome 19 in region 1, band 3, sub-band 2 (19p13.2), starting at 7,830,275 base pairs and ending at 7,864,976 base pairs. It is encoded on the plus strand. It lies near the CLEC4M (C-type lectin domain family 4, member M) gene, which is involved in peptide antigen transport. [8]
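The band notation above can be unpacked mechanically. The following is a minimal Python sketch, assuming the simple chromosome, arm, region, band, sub-band layout described in the text; the function name `parse_cytoband` and the regular expression are illustrative and do not come from any cited tool.

```python
import re

# Pattern for simple cytogenetic band labels such as "19p13.2":
# chromosome (1-22, X or Y), arm (p or q), one region digit, one band digit,
# and an optional sub-band after the dot. This is an illustrative assumption,
# not a full implementation of cytogenetic nomenclature.
CYTOBAND = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d)(\d)(?:\.(\d+))?$")


def parse_cytoband(label):
    """Split a band label into its named parts."""
    match = CYTOBAND.match(label)
    if match is None:
        raise ValueError(f"unrecognised band label: {label!r}")
    chromosome, arm, region, band, sub_band = match.groups()
    return {
        "chromosome": chromosome,
        "arm": "short" if arm == "p" else "long",
        "region": int(region),
        "band": int(band),
        "sub_band": int(sub_band) if sub_band is not None else None,
    }


print(parse_cytoband("19p13.2"))
```

For "19p13.2" this returns chromosome 19, short arm, region 1, band 3, sub-band 2, matching the description of the EVI5L locus.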

Homology and Evolution

Homologous domains

EVI5L contains a RAB-GAP TBC domain, which regulates membrane trafficking by controlling the cycling of Rab GTPases between inactive (GDP-bound) and active (GTP-bound) conformations. [9] It also has apolipophorin-III and tetratricopeptide repeat (TPR) domains. Apolipophorin-III plays vital roles in lipid transport and lipoprotein metabolism, [10] while TPR mediates protein-protein interactions and the assembly of multiprotein complexes. [11] These three domains are highly conserved in EVI5L orthologs.

Paralogs

There are seven moderately related proteins in humans that are paralogous to the RAB-GAP TBC domain of EVI5L. All of these proteins are in the guanosine nucleotide-binding protein family. [12]

Paralogs of EVI5L

| Paralogous Protein | Protein Name | Sequence Length (amino acids) | Amino Acid Identity |
|---|---|---|---|
| EVI5 | Ecotropic viral integration site 5 | 810 aa | 51% |
| TBC1D14 | TBC1 domain family, member 14 | 693 aa | 19% |
| RABGAP1 | RAB GTPase activating protein 1 | 1069 aa | 18% |
| RABGAP1L | RAB GTPase activating protein 1-like | 815 aa | 18% |
| TBC1D12 | TBC1 Domain Family Member 12 | 775 aa | 17% |
| TBC1D1 | TBC1 (Tre-2/USP6, BUB2, Cdc16) Domain Family, Member 1 | 1168 aa | 15% |
| TBC1D4 | TBC1 domain family, member 4 | 1298 aa | 13% |

Orthologs

Sixty-three [13] orthologs of EVI5L have been identified, in mammals, birds, reptiles, and fish. [14] EVI5L is highly conserved among its orthologs but is not present in insects, plants, bacteria, archaea or protists.

Homologs

Amino acid sequence identity vs. time since species divergence for orthologs of EVI5L. EVI5L evolves at a moderate-to-fast rate compared with cytochrome c and histone H3.

The following table lists the homologs of EVI5L:

| Genus and Species | Common Name | Accession Number | Seq. Length | Seq. Identity | Seq. Similarity | Time of Divergence |
|---|---|---|---|---|---|---|
| Homo sapiens | Humans | NM_001159944.2 | 3756 bp | - | - | - |
| Pan troglodytes | Common Chimpanzee | XM_003316056.2 | 3874 bp | 99% | 99% | 6.3 mya |
| Canis familiaris | Dog | XM_003432793.1 | 2430 bp | 98% | 99% | 94.2 mya |
| Sus scrofa | Wild Boar | XM_003123194 | 3673 bp | 95% | 99% | 94.2 mya |
| Chelonia mydas | Sea Turtle | EMP36617 | 3436 bp | 94% | 99% | 294.5 mya |
| Alligator sinensis | Chinese Alligator | XM_006036467.1 | 6780 bp | 82% | 91% | 296.4 mya |
| Ficedula albicollis | Collared Flycatcher | XM_005062373.1 | 2090 bp | 79% | 88% | 324.2 mya |
| Haplochromis burtoni | Cichlid | XM_005934450.1 | 6638 bp | 70% | 84% | 400.1 mya |
| Danio rerio | Zebrafish | XM_689590 | 2856 bp | 69% | 82% | 400.1 mya |
| Oreochromis niloticus | Nile Tilapia | XM_003447957 | 6757 bp | 68% | 84% | 400.1 mya |
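The identity column reports the share of aligned positions at which two sequences carry the same residue. The sketch below shows one way such a figure might be computed from a pairwise alignment. It assumes the two sequences are already aligned to equal length with "-" for gaps, and that columns containing a gap are left out of the denominator; the source does not state which tool or convention produced the tabulated values, and the toy fragments are made up rather than taken from EVI5L.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns whose residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (not real EVI5L sequence).
print(percent_identity("MKT-LLEA", "MRTALL-A"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, so identity values from different tools are not always directly comparable.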

Protein

The EVI5L protein consists of 805 [15] amino acid residues. The molecular weight of the mature protein is 92.5 kDa, with an isoelectric point of 5.05. EVI5L has an unusually large number of glutamic acid residues compared to similar proteins. Most of the protein is neutral, with no positive charge, negative charge, or mixed charge clusters. [16] Its average hydrophobicity is slightly negative (-0.597019). EVI5L is a soluble protein [17] that localizes in the nucleus. [18] It contains no signal peptide, no mitochondrial targeting motifs and no peroxisomal targeting signal in the C-terminus. There is no transmembrane domain in EVI5L. [19]
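A hydrophobicity figure of this kind is commonly an average of per-residue hydropathy values over the whole sequence. The sketch below illustrates that calculation under the assumption that the Kyte-Doolittle scale is used; the source does not name the scale or tool behind the quoted value, and the example peptides are hypothetical, not EVI5L sequence.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
# Assumption: the source does not say which scale produced its figure.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a protein sequence (negative means hydrophilic)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[res] for res in sequence.upper()) / len(sequence)


# Hypothetical peptides: a glutamate-rich one and a leucine-rich one.
print(mean_hydropathy("MEELEEKLEE"))
print(mean_hydropathy("MLLVLLAILL"))
```

A glutamate-rich sequence scores below zero on this scale, which is consistent with the negative value and the solubility prediction reported for EVI5L.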

Isoforms

EVI5L has two isoforms produced by alternative splicing. Isoform 2 is missing in-frame exon 11, making it shorter (794 amino acids). [20]
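The isoform lengths allow a quick check of what "in-frame" implies. The sketch below assumes that the whole length difference between the isoforms comes from the skipped exon, which the source does not state explicitly; the function names are illustrative.

```python
FULL_LENGTH_AA = 805   # isoform 1, from the text
SHORT_LENGTH_AA = 794  # isoform 2, from the text


def skipped_coding_nucleotides(full_aa, short_aa):
    """Coding nucleotides removed, assuming the skipped exon explains the whole difference."""
    return (full_aa - short_aa) * 3


def preserves_reading_frame(exon_length_nt):
    """An exon can be skipped without a frameshift only if its length is a multiple of 3."""
    return exon_length_nt % 3 == 0


removed = skipped_coding_nucleotides(FULL_LENGTH_AA, SHORT_LENGTH_AA)
print(removed, preserves_reading_frame(removed))
# A hypothetical exon one nucleotide longer would shift the frame instead.
print(preserves_reading_frame(removed + 1))
```

Under that assumption the missing segment is 11 residues, or 33 coding nucleotides, so the downstream sequence stays in the same reading frame.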

Important domains and post-translational modifications of EVI5L

Post-Translational Modifications

Post-translational modifications of EVI5L that are evolutionarily conserved in the majority of the orthologs include glycosylation (C-mannosylation), [21] glycation, [22] phosphorylation (non-kinase and kinase specific), [23] [24] and sumoylation. [25] There is also one leucine-rich nuclear export signal. [26]

Highly conserved regions of RAB-GAP1L correspond to the RAB-GAP TBC domain in EVI5L. The conserved regions are next to each other, and form a pocket for possible interactions with RAB2A and RAB10. There are no beta-sheets present, just alpha helices.

Secondary Structure

The entire secondary structure of EVI5L is predicted to consist of alpha helices, with no beta sheets present. [27] [28] This is also true for EVI5L's closest structural paralog, RABGAP1L. [29]

Expression

Promoter

The predicted promoter for the EVI5L gene spans 664 base pairs, from 7,910,867 to 7,911,530, with a predicted transcriptional start site region of 114 base pairs spanning 7,911,346 to 7,911,460. [30] The promoter region and the beginning of the EVI5L gene (7,910,997 to 7,911,843) are not conserved beyond primates. This region was used to predict transcription factor binding.
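The lengths quoted for the promoter and the transcriptional start site can be reproduced from their coordinates, though only under different counting conventions. The sketch below is an illustration with hypothetical helper names; it does not claim to show how the prediction tool itself counts positions.

```python
def inclusive_length(start, end):
    """Length when both the start and end positions are counted."""
    return end - start + 1


def difference_length(start, end):
    """Length taken as the plain difference between end and start."""
    return end - start


def contains(outer, inner):
    """True if the inner interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


promoter = (7_910_867, 7_911_530)  # coordinates from the text
tss = (7_911_346, 7_911_460)       # coordinates from the text

print(inclusive_length(*promoter))  # 664, the promoter length given in the text
print(difference_length(*tss))      # 114, the start-site length given in the text
print(contains(promoter, tss))      # True
```

The promoter figure matches inclusive counting, while the start-site figure matches a plain difference; counted inclusively, the start-site interval would be 115 base pairs.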

The main transcription factors predicted to bind the promoter include: the activator-, mediator- and TBP-dependent core promoter element for RNA polymerase II transcription from TATA-less promoters; the p53 tumor suppressor; brachyury gene, mesoderm developmental factor; EGR/nerve growth factor induced protein C and related factors; and the GLI zinc finger family. [31]

EVI5L has ubiquitously low expression, with slightly higher expression in the testis and fetal brain.

Expression

Expression data from expressed sequence tag mapping, microarray and in situ hybridization show that EVI5L has ubiquitously low expression, [32] [33] [34] with slightly higher expression in the testis and fetal brain.

Function and Biochemistry

The exact function of EVI5L is unknown. However, its paralogs are associated with starvation-induced autophagosome formation and with the trafficking and translocation of GLUT4-containing vesicles. [35] [36] On this basis, EVI5L is predicted to target endocytic vesicles.

Interacting Proteins

SRPK2 phosphorylates the serine residues of EVI5L

EVI5L has been shown to interact with NUDT18 (nudix (nucleoside diphosphate linked moiety X)-type motif 18) [37] and SRPK2 (serine/threonine-protein kinase 2). [38] NUDT18 is a member of the Nudix hydrolase family. Nudix hydrolases eliminate potentially toxic nucleotide metabolites from the cell and regulate the concentrations and availability of many different nucleotide substrates, cofactors, and signaling molecules. [39] SRPK2 is a serine/arginine-rich protein-specific kinase that phosphorylates its substrates at serine residues located in regions rich in arginine/serine dipeptides, known as RS domains. It is involved in the phosphorylation of SR splicing factors and the regulation of splicing. [40]
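The description of RS domains suggests a simple way to flag candidate serines in a sequence: look for runs of arginine/serine dipeptides. The sketch below is a rough illustration only. The threshold of two consecutive dipeptides is an arbitrary assumption, the example sequence is hypothetical rather than EVI5L, and real kinase-substrate prediction involves far more than pattern matching.

```python
import re

# Runs of at least two consecutive "RS" or "SR" dipeptides.
# The minimum run length of two is an arbitrary assumption for illustration.
RS_RUN = re.compile(r"(?:RS|SR){2,}")


def candidate_serines(sequence):
    """Return 1-based positions of serines that sit inside RS/SR dipeptide runs."""
    positions = []
    for match in RS_RUN.finditer(sequence.upper()):
        for offset, residue in enumerate(match.group()):
            if residue == "S":
                positions.append(match.start() + offset + 1)
    return positions


# Hypothetical peptide containing two short RS-rich stretches.
print(candidate_serines("MAGRSRSRSPEKSRSRDL"))
```

For this toy peptide the function reports serines at positions 5, 7, 9, 13 and 15, the ones embedded in the two dipeptide runs.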

Clinical significance

Zebrafish deficient in Rab23 or its GTPase-activating protein, EVI5L, exhibit abnormal heart formation. This is attributed to the requirement for Rab23 in the differentiation of cardiac progenitor cells. Rab23 is required for normal development of the brain, spinal cord and heart, and without EVI5L to regulate Rab23, these organs form abnormally. [41]

References

  1. GRCh38: Ensembl release 89: ENSG00000142459 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "Entrez Gene, EVI5L". NCBI. Retrieved 2014-04-22.
  4. "Uniprot: EVI5L".
  5. Faitar SL, Kilijanski JJ, Heassler KW, Davie JJ (April 2013). "The identification, characterization, and subcellular localization of a novel potential GTPase activating protein involved in cytokinesis" (PDF). The FASEB Journal. 27 (1046): 2. doi:10.1096/fj.1530-6860.
  6. Itoh T, Satoh M, Kanno E, Fukuda M (September 2006). "Screening for target Rabs of TBC (Tre-2/Bub2/Cdc16) domain-containing proteins based on their Rab-binding activity". Genes to Cells. 11 (9): 1023–37. doi:10.1111/j.1365-2443.2006.00997.x. PMID 16923123. S2CID 43042490.
  7. National Center for Biotechnology Information EntrezGene reference information for EVI5L ecotropic viral integration site 5-like (Homo sapiens)
  8. Chromosome-centric Human Proteome Project Chromosome-centric Human Proteome Project, EVI5L (Homo sapiens) Archived 2014-05-12 at the Wayback Machine
  9. Pan X, Eathiraj S, Munson M, Lambright DG (July 2006). "TBC-domain GAPs for Rab GTPases accelerate GTP hydrolysis by a dual-finger mechanism". Nature. 442 (7100): 303–6. Bibcode:2006Natur.442..303P. doi:10.1038/nature04847. PMID 16855591. S2CID 4407126.
  10. National Center for Biotechnology Information: Conserved Domains Database (CDD) and Resource Group Conserved domains for EVI5L ecotropic viral integration site 5-like (Homo sapiens)
  11. D'Andrea LD, Regan L (December 2003). "TPR proteins: the versatile helix". Trends in Biochemical Sciences. 28 (12): 655–62. doi:10.1016/j.tibs.2003.10.007. PMID 14659697.
  12. Ensembl Paralogs, EVI5L (Homo sapiens)
  13. European Bioinformatics Institute Orthologs, EVI5L (Homo sapiens)
  14. Ensembl Orthologs, EVI5L (Homo sapiens)
  15. National Center for Biotechnology Information: Protein EVI5-like protein isoform 1
  16. SDSC Biology Workbench: SAPS, EVI5L Statistical Analysis of Protein Sequences (workbench.sdsc.edu)
  17. "SOSUI: classification and secondary structure prediction". Archived from the original on 2021-05-27. Retrieved 2014-05-11.
  18. PSORTII: Localization of proteins in yeast and animal cells Localization of proteins in yeast and animal cells: EVI5L
  19. TMHMM:Prediction of transmembrane helices in proteins EVI5L Transmembrane Domain
  20. Origene Isoforms, EVI5L (Homo sapiens)
  21. NetCGlyc 1.0: Prediction of mammalian C-mannosylation sites.C-mannosylation in EVI5L
  22. NetGlycate: Predicts glycation of ε amino groups of lysines in mammalian proteins NetGlycate: Predicts glycation in EVI5L
  23. NetPhos: Produces neural network predictions for serine, threonine and tyrosine phosphorylation sites in eukaryotic proteins. Phosphorylation sites in EVI5L
  24. NetPhosK: Produces neural network predictions of kinase specific eukaryotic protein phosphorylation sites Kinase-specific Phosphorylation sites in EVI5L Archived 2021-07-09 at the Wayback Machine
  25. SUMOplot™ Analysis Program: Predicts and scores sumoylation sites Sumoylation sites in EVI5L
  26. NetNES: Predicts leucine-rich nuclear export signals (NES) in eukaryotic proteins Leucine-rich nuclear export signals (NES) in EVI5L
  27. CHOFAS: Secondary structure prediction program CHOFAS: secondary structure in EVI5L
  28. PELE: Secondary structure prediction program PELE: secondary structure in EVI5L
  29. National Center for Biotechnology Information: Structure Structure of RabGap1L
  30. "El Durado (Genomatix)".
  31. "El Durado-Genomatix".
  32. "Unigene NCBI". Archived from the original on 2013-07-12. Retrieved 2014-05-11.
  33. "GEO Profiles NCBI".
  34. "Bio GPS".
  35. GeneCards: TBC1 Domain Family, Member 14 TBC1 Domain Family, Member 14 Function
  36. GeneCards: TBC1 (Tre-2/USP6, BUB2, Cdc16) Domain Family, Member 1 Function
  37. STRING - Known and Predicted Protein-Protein Interactions NUDT18 and EVI5L interaction
  38. IntAct: molecular interaction data SRPK2 and EVI5L interaction
  39. GeneCards: NUDT18 (nucleoside diphosphate linked moiety X)-type motif 18 NUDT18 (nucleoside diphosphate linked moiety X)-type motif 18 Function
  40. SRPK2 (serine/threonine-protein kinase 2) SRPK2 (serine/threonine-protein kinase 2) Function
  41. Jenkins D, Beales PL, Wilkie AO (May 2012). "Rab23 is required for cardiac progenitor cell differentiation and positively-regulates Wnt11/AP-1 signalling in zebrafish". Cilia. 1 (Suppl 1): O6. doi:10.1186/2046-2530-1-S1-O6. PMC 3555715.
