ERICH2

Identifiers
Aliases: ERICH2, glutamate rich 2
External IDs: MGI: 1913998; HomoloGene: 75226; GeneCards: ERICH2

Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_001289947, NM_001290030, NM_001290031 | NM_025744
RefSeq (protein) | NP_001276876, NP_001276959, NP_001276960 | n/a
Location (UCSC) | n/a | Chr 2: 70.34 – 70.37 Mb
PubMed search | [2] | [3]

Glutamate-rich protein 2 is a protein that in humans is encoded by the ERICH2 gene. It is expressed most strongly in male tissues, specifically the testes, where it is found in the fibrillar centers of nucleoli and in vesicles. [4] Its known protein interactions suggest that it may play a role in histone modification and proper histone function. [5]

Gene

ERICH2 gene location as depicted by the National Center for Biotechnology Information (NCBI).

ERICH2 is located on human chromosome 2, at 2q31.1. [6] It contains 10 exons, spans 28,930 base pairs, and is flanked by the EIF2S2P4 and GAD1 genes. [6] There are no known paralogs of the ERICH2 gene.

mRNA

ERICH2 transcription produces three validated mRNA variants. The longest transcript variant is 1,388 base pairs long, 1,311 of which are coding. [6] The second variant differs from the first in its 5' UTR and in part of its coding sequence, which gives it a distinct N-terminus compared to variant 1. [6] Variant 3 lacks several exons and has a distinct 3' UTR and C-terminal coding region. At 1,063 base pairs, it is also shorter than the other two. [6]

Cartoon of the ERICH2 protein. The green box represents the PHA03247 domain, the orange box the amidation site, and the blue box the cAMP- and cGMP-dependent protein kinase phosphorylation site. The gray box labeled P marks a proline-rich region, and the gray box labeled "conserved" marks the region that is conserved in distant orthologs. Gray tags mark phosphorylation sites, and red flags mark glutamate residues. Green lines above the cartoon mark the pat4 nuclear localization signals, and gray brackets mark the pat7 nuclear localization signals.

Protein

The ERICH2 protein is 436 amino acids long, with a molecular weight of approximately 48 kDa [7] and an isoelectric point of approximately 5. [7] It is rich in proline and low in tyrosine and glycine.
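The figures quoted for ERICH2 can be cross-checked with simple arithmetic: a 1,311 bp coding region corresponds to 437 codons, i.e. 436 amino acids plus one stop codon, and 436 residues at a rule-of-thumb average of about 110 Da per residue gives roughly 48,000 Da (48 kDa). The sketch below only illustrates that arithmetic. The 110 Da average is an assumption, not a value from the cited sources, and the function names are hypothetical.

```python
# Back-of-the-envelope consistency checks on the lengths quoted in the text.
# Assumption: ~110 Da is a common rule-of-thumb average residue mass; a real
# estimate would sum the masses of the actual residues in the sequence.
AVG_RESIDUE_MASS_DA = 110.0


def protein_length_from_cds(cds_bp: int) -> int:
    """Return the number of amino acids encoded by a CDS that includes a stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return cds_bp // 3 - 1  # subtract the stop codon


def approx_mass_kda(n_residues: int) -> float:
    """Return a rough molecular weight in kDa from the residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    length = protein_length_from_cds(1311)
    print(length)                             # 436
    print(round(approx_mass_kda(length), 1))  # 48.0
```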

Motifs and domains

Conceptual translation of ERICH2. Intron-exon boundaries are highlighted in yellow. The PHA03247 domain is highlighted in light gray. The acetylation site is in orange font. The amidation site is in light blue font. The cAMP- and cGMP-dependent protein kinase phosphorylation site is highlighted in teal. Phosphorylation sites are in pink text. The most conserved region in distant orthologs is highlighted in green. The beta strand is represented by a black arrow. The alpha helix is represented by a purple arrow.

Two known motifs were found in the human ERICH2 protein. The KKNT motif is a cAMP- and cGMP-dependent protein kinase phosphorylation site; it was found only in primates. [8] There is also an FGRR motif, conserved in mammals, that is defined as an amidation site. [9] Finally, the ERICH2 protein contains the PHA03247 domain, which is 32 amino acids long. [6] This domain is not generally conserved among orthologs, and its function is unknown. It is present in the proteins that make up the herpes virion. [10]
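Motif hits of this kind are typically found by matching short sequence patterns against the protein. As a minimal sketch, the two PROSITE patterns that KKNT and FGRR correspond to, [RK](2)-x-[ST] for the cAMP- and cGMP-dependent protein kinase phosphorylation site and x-G-[RK]-[RK] for the amidation site, can be written as regular expressions. The demo sequence is a made-up toy string, not the ERICH2 sequence, and `scan_motifs` is a hypothetical helper name.

```python
import re

# PROSITE-style patterns rewritten as regular expressions.
# Lookaheads are used so that overlapping matches are all reported.
PATTERNS = {
    "cAMP/cGMP-dependent kinase phosphorylation site": r"(?=([RK]{2}.[ST]))",
    "amidation site": r"(?=(.G[RK][RK]))",
}


def scan_motifs(sequence):
    """Return (motif name, 1-based start position, matched residues), sorted by position."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in re.finditer(pattern, sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    demo = "MSEEKKNTPLAAFGRRE"  # hypothetical toy sequence, for illustration only
    for name, position, residues in scan_motifs(demo):
        print(position, residues, name)
```

Short patterns like these match by chance in many proteins, so a hit is only a prediction; conservation across orthologs, as described above, is one way to weigh it.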

Structure and localization

Secondary structure prediction shows one alpha helix and one beta strand. The alpha helix spans the entire conserved region shown in the cartoon of the ERICH2 protein. The beta strand is predicted to begin 12 amino acids downstream of the amidation site and to span 4 amino acids. [11] Four nuclear localization signals were found in the protein, two pat4 signals and two pat7 signals; their locations are shown in the cartoon. [12] The protein is predicted to reside in the nucleus with a probability of about 78%. [12]
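For illustration, the pat4 and pat7 signal types can be sketched as simple window scans. The rules below follow the commonly cited PSORT II descriptions: pat4 is four basic residues (K or R), or three basic residues plus H or P; pat7 is a P followed within three residues by a four-residue segment containing at least three K/R. They should be read as an assumption about how the tool defines these patterns, not as its actual implementation. The function names are hypothetical.

```python
BASIC = set("KR")


def find_pat4(seq):
    """Return 1-based start positions of 4-residue windows matching the assumed pat4 rule."""
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = sum(residue in BASIC for residue in window)
        others = [residue for residue in window if residue not in BASIC]
        # Either four basic residues, or three basic residues plus one H or P.
        if n_basic == 4 or (n_basic == 3 and others[0] in "HP"):
            hits.append(i + 1)
    return hits


def find_pat7(seq):
    """Return 1-based positions of each P that starts a match to the assumed pat7 rule."""
    hits = []
    for i, residue in enumerate(seq):
        if residue != "P":
            continue
        # The basic segment may start 1, 2 or 3 residues after the P.
        for start in range(i + 1, i + 4):
            segment = seq[start:start + 4]
            if len(segment) == 4 and sum(r in BASIC for r in segment) >= 3:
                hits.append(i + 1)
                break
    return hits


if __name__ == "__main__":
    print(find_pat4("AAPKRKAA"))   # toy sequence, not ERICH2
    print(find_pat7("APAAKKRAA"))  # toy sequence, not ERICH2
```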

Expression

Immunofluorescence image of human cells from the CACO-2 colorectal cancer cell line, stained with an antibody against ERICH2, with microtubules and DNA also highlighted. The figure shows that the ERICH2 protein is located mainly in the fibrillar centers of nucleoli and in vesicles.

ERICH2 is not ubiquitously expressed. Its expression is narrow: it has been shown in the choroid plexus of the developing fetus and in the testes of adults. [14] Expression was also detected in lung and in female tissues, but at much lower levels. [15] The protein is located in the fibrillar centers of nucleoli and in vesicles within cells. [4] [16]

Regulation of expression

Many phosphorylation sites are predicted for the ERICH2 protein, all on serines and threonines and none on tyrosines. [17] [18] An acetylation site is also predicted at the N-terminus of the protein, on the third amino acid. [19] Many transcription factors of the SOX/SRY sex/testis-determining and related HMG box family, as well as estrogen-related transcription factors, are predicted to bind and regulate transcription of ERICH2. [20]

Function

Interacting proteins

ERICH2 interacts with proteins in the H2A family. [5] [16] H2A proteins are components of the histone octamer. ERICH2 is specifically known to interact with the H2AFY protein, which plays a key role in stable X chromosome inactivation and can replace conventional H2A in certain nucleosomes, thereby repressing transcription. [21]

ERICH2 is also known to interact with the protein SDCBP (syntenin-1), which functions in vesicle trafficking and in regulating the growth and proliferation of certain cancer cells. [22]

The IWS1 protein also interacts with ERICH2. This protein functions as a transcription factor and plays a key role in defining the composition of the RNA polymerase II elongation complex. [23] This complex then plays a role in histone modification and proper splicing.

Two-hybrid assays and other protein interaction methods have shown an interaction with the PSORS1C2 protein, but the function of this protein remains unknown. [5]

Homology

No paralogs for the ERICH2 protein are known. ERICH2 has 124 known orthologs spanning multiple taxa. [6]

Genus and species | Common name | Date of divergence (MYA) [24] | Sequence length (aa) | Sequence identity | Sequence similarity
Homo sapiens | Human | 0 | 436 | -- | --
Rousettus aegyptiacus | Egyptian fruit bat | 94 | 430 | 58% | 63%
Propithecus coquereli | Coquerel's sifaka | 74 | 323 | 54% | 60%
Mus musculus | Mouse | 90 | 463 | 47% | 56%
Ursus maritimus | Polar bear | 94 | 296 | 50% | 53%
Alligator mississippiensis | American alligator | 320 | 370 | 28% | 38%
Thamnophis sirtalis | Common garter snake | 320 | 309 | 27% | 37%
Callorhinchus milii | Australian ghost shark | 465 | 319 | 22% | 33%
Danio rerio | Zebrafish | 432 | 310 | 24% | 30%
Strongylocentrotus purpuratus | Purple sea urchin | 627 | 470 | 23% | 25%
Crassostrea gigas | Pacific oyster | 758 | 293 | 17% | 22%
Bemisia tabaci | Silverleaf whitefly | 758 | 213 | 13% | 14%
Trichoplax adhaerens | Trichoplax | 930 | 164 | 12% | 15%

References

  1. GRCm38: Ensembl release 89: ENSMUSG00000075302 - Ensembl, May 2017
  2. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  3. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Cell atlas - ERICH2 - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2017-02-17.
  5. "Results - mentha: the interactome browser". mentha.uniroma2.it. Retrieved 2017-04-27.
  6. "ERICH2 glutamate rich 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2017-02-17.
  7. Kramer, Jack (1990). "Biology WorkBench 3.2". [permanent dead link]
  8. "PROSITE". prosite.expasy.org. Retrieved 2017-02-26.
  9. "PROSITE". prosite.expasy.org. Retrieved 2017-02-26.
  10. Ludwig A, Krieger MA (December 2016). "Genomic and phylogenetic evidence of VIPER retrotransposon domestication in trypanosomatids". Memórias do Instituto Oswaldo Cruz. 111 (12): 765–769. doi:10.1590/0074-02760160224. PMC 5146736. PMID 27849219.
  11. Pearson, William (1999). "Biology workbench". SDSC Biology workbench. Retrieved 2017-02-15. [permanent dead link]
  12. "PSORT II Prediction". psort.hgc.jp. Retrieved 2017-05-07.
  13. "Cell atlas - ERICH2 - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2017-04-27.
  14. European Molecular Biology Lab. "Expression Atlas".
  15. Schuler Group. "EST Profile - Hs.443729". www.ncbi.nlm.nih.gov. Retrieved 2017-04-27.
  16. GeneCards Human Gene Database. "ERICH2 Gene - GeneCards | ERIC2 Protein | ERIC2 Antibody". www.genecards.org. Retrieved 2017-05-07.
  17. "ExPASy: SIB Bioinformatics Resource Portal - Categories". www.expasy.org. Retrieved 2017-05-07.
  18. "NetPhos 3.1 Server". www.cbs.dtu.dk. Retrieved 2017-05-07.
  19. "NetAcet 1.0 Server". www.cbs.dtu.dk. Retrieved 2017-05-07.
  20. "Genomatix: Genome Annotation and Browser: Query Input". www.genomatix.de. Retrieved 2017-05-07.
  21. "H2AFY - Core histone macro-H2A.1 - Homo sapiens (Human) - H2AFY gene & protein". www.uniprot.org. Retrieved 2017-04-27.
  22. "SDCBP - Syntenin-1 - Homo sapiens (Human) - SDCBP gene & protein". www.uniprot.org. Retrieved 2017-04-27.
  23. GeneCards Human Gene Database. "IWS1 Gene - GeneCards | IWS1 Protein | IWS1 Antibody". www.genecards.org. Retrieved 2017-05-07.
  24. "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2017-05-07.