ERICH2 | |
---|---|
Identifiers | |
Aliases | ERICH2, glutamate rich 2 |
External IDs | MGI: 1913998 HomoloGene: 75226 GeneCards: ERICH2 |
Glutamate Rich Protein 2 is a protein that in humans is encoded by the ERICH2 gene. It is highly expressed in male tissues, specifically the testes, where the protein is found in the fibrillar center of the nucleoli and in vesicles. [4] Its known protein interactions suggest that it may play a role in histone modification and proper histone function. [5]
ERICH2 is located on human chromosome 2 at 2q31.1. [6] The gene is 28,930 base pairs long, contains 10 distinct exons, and is flanked by the EIF2S2P4 and GAD1 genes. [6] There are no known paralogs of the ERICH2 gene.
ERICH2 transcription produces three validated, distinct mRNA variants. The longest transcript variant is 1,388 base pairs in length, 1,311 of which are coding. [6] The second variant differs from the first in its 5' UTR; it also has coding sequence differences and a distinct N-terminus compared to variant 1. [6] Variant 3 lacks several exons and has a distinct 3' UTR and C-terminal coding region. At 1,063 base pairs, it is also shorter than the other two. [6]
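The coding length quoted for the longest variant can be checked against the 436-residue protein length reported for ERICH2. The following is a minimal sketch of that arithmetic, assuming the coding sequence ends in a single stop codon that is counted in the 1,311 bases; the function name is illustrative only.

```python
# Figures taken from the text; the stop-codon assumption is ours.
TRANSCRIPT_BP = 1388   # full length of transcript variant 1
CODING_BP = 1311       # coding portion of variant 1


def residues_from_coding_bp(coding_bp: int) -> int:
    """Return the number of amino acids encoded, assuming the coding
    sequence includes one terminal stop codon."""
    if coding_bp % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    codons = coding_bp // 3
    return codons - 1  # the stop codon does not encode a residue


protein_length = residues_from_coding_bp(CODING_BP)
untranslated_bp = TRANSCRIPT_BP - CODING_BP  # 5' UTR plus 3' UTR combined

print(protein_length, untranslated_bp)
```

Under that assumption, 1,311 coding bases correspond to 437 codons, that is 436 residues plus the stop codon, leaving 77 bases for the two UTRs combined.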
The ERICH2 protein is 436 amino acids in length and has a molecular weight of approximately 48 kDa [7] and an isoelectric point of approximately 5. [7] The protein is rich in the amino acid proline and low in tyrosine and glycine.
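As a rough sanity check on the molecular weight, a protein's mass can be approximated from its length. The sketch below assumes the common rule-of-thumb average of 110 Da per residue; this is an approximation and not the method used by the cited source, and the function name is hypothetical.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not
# a value taken from the article).
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kilodaltons from residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


erich2_mass_kda = estimate_mass_kda(436)
print(f"Estimated mass: {erich2_mass_kda:.1f} kDa")
```

For 436 residues this gives about 48 kDa, that is roughly 48,000 Da. An exact value would require summing the masses of the actual residues in the sequence.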
Two known motifs were found in the human ERICH2 protein. The KKNT motif is a cAMP- and cGMP-dependent protein kinase phosphorylation site; it was found only in primates. [8] The FGRR motif, which is conserved in mammals, is defined as an amidation site. [9] The ERICH2 protein also contains the PHA03247 domain, which is 32 amino acids long. [6] This domain is not generally conserved among orthologs, and its function is unknown. It is also present in the proteins that make up the herpes virion. [10]
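Both motifs fit short PROSITE-style consensus patterns: `[RK](2)-x-[ST]` for the cAMP- and cGMP-dependent protein kinase phosphorylation site and `x-G-[RK]-[RK]` for the amidation site. The sketch below translates these into regular expressions and scans a sequence for them. The toy sequence is invented for illustration and is not the ERICH2 sequence, and the function name is hypothetical.

```python
import re

# PROSITE-style consensus patterns written as regular expressions.
MOTIF_PATTERNS = {
    "cAMP/cGMP-dependent kinase phosphorylation site": r"[RK]{2}.[ST]",
    "amidation site": r".G[RK][RK]",
}


def scan_motifs(sequence: str) -> list[tuple[str, int, str]]:
    """Return (motif name, 1-based start position, matched residues)
    for every match, including overlapping ones."""
    hits = []
    for name, pattern in MOTIF_PATTERNS.items():
        # A lookahead lets overlapping matches be reported.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    return hits


# Hypothetical toy peptide containing both motifs (NOT the real protein).
toy_sequence = "MAKKNTPEPFGRRSE"
for hit in scan_motifs(toy_sequence):
    print(hit)
```

Patterns this short match by chance in many proteins, so a hit is only a candidate site, which is consistent with the article describing these as predicted motifs.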
Secondary structure prediction shows one alpha helix and one beta strand. The alpha helix encompasses the entire conserved section, as seen in the cartoon of the ERICH2 protein. The beta strand is predicted 12 amino acids downstream of the amidation site and encompasses 4 amino acids. [11] Four nuclear localization signals were found in the protein: two pat4 signals and two pat7 signals, whose locations are shown in the cartoon. [12] The protein is predicted to reside in the nucleus, with a prediction score of 78%. [12]
ERICH2 is not ubiquitously expressed. Instead, its expression is narrow, occurring in the choroid plexus of the developing fetus and in the testes of adults. [14] Expression was also detected in lung and female tissues, but at greatly reduced levels. [15] The protein is specifically located in the fibrillar center of the nucleoli and in vesicles within cells. [4] [16]
Many phosphorylation sites are predicted for the ERICH2 protein. None are predicted on tyrosines; all are on serines and threonines. [17] [18] There is also a predicted acetylation site near the N-terminus of the protein, on the third amino acid. [19] Many transcription factors are predicted to bind and regulate transcription of ERICH2, including SOX/SRY sex- and testis-determining and related HMG box factors and estrogen-related factors. [20]
ERICH2 interacts with proteins in the H2A family. [5] [16] The H2A proteins form part of the histone octamer. ERICH2 is specifically known to interact with the H2AFY protein, which plays a key role in stable X chromosome inactivation and can repress transcription by replacing a conventional H2A in certain nucleosomes. [21]
ERICH2 is also known to interact with the protein SDCB1, which functions in vesicle trafficking and in regulating the growth and proliferation of certain cancer cells. [22]
The IWS1 protein also interacts with ERICH2. This protein functions as a transcription factor and plays a key role in defining the composition of the RNA polymerase II elongation complex. [23] This complex then plays a role in histone modification and proper splicing.
Two-hybrid assays and other protein interaction methods have shown an interaction with the PSORS1C2 protein, but the function of this protein remains unknown. [5]
No paralogs for the ERICH2 protein are known. ERICH2 has 124 known orthologs spanning multiple taxa. [6]
Genus and Species | Common Name | Date of Divergence (MYA) [24] | Sequence Length (aa) | Sequence Identity | Sequence Similarity |
---|---|---|---|---|---|
Homo sapiens | Human | 0 | 436 | -- | -- |
Rousettus aegyptiacus | Egyptian Fruit Bat | 94 | 430 | 58% | 63% |
Propithecus coquereli | Coquerel's Sifaka | 74 | 323 | 54% | 60% |
Mus musculus | Mouse | 90 | 463 | 47% | 56% |
Ursus maritimus | Polar Bear | 94 | 296 | 50% | 53% |
Alligator mississippiensis | American Alligator | 320 | 370 | 28% | 38% |
Thamnophis sirtalis | Common Garter Snake | 320 | 309 | 27% | 37% |
Callorhinchus milii | Australian Ghost Shark | 465 | 319 | 22% | 33% |
Danio rerio | Zebrafish | 432 | 310 | 24% | 30% |
Strongylocentrotus purpuratus | Purple Sea Urchin | 627 | 470 | 23% | 25% |
Crassostrea gigas | Pacific Oyster | 758 | 293 | 17% | 22% |
Bemisia tabaci | Silverleaf Whitefly | 758 | 213 | 13% | 14% |
Trichoplax adhaerens | Trichoplax | 930 | 164 | 12% | 15% |
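The ortholog table suggests that sequence identity to human ERICH2 falls as divergence time grows. The sketch below quantifies that trend with a Pearson correlation over the twelve non-human rows, using the values exactly as listed in the table. The helper function is written out by hand for illustration and is not part of the cited analysis.

```python
from math import sqrt

# (divergence date in MYA, sequence identity in %) for each non-human
# ortholog, copied from the table above.
ORTHOLOGS = {
    "Rousettus aegyptiacus": (94, 58),
    "Propithecus coquereli": (74, 54),
    "Mus musculus": (90, 47),
    "Ursus maritimus": (94, 50),
    "Alligator mississippiensis": (320, 28),
    "Thamnophis sirtalis": (320, 27),
    "Callorhinchus milii": (465, 22),
    "Danio rerio": (432, 24),
    "Strongylocentrotus purpuratus": (627, 23),
    "Crassostrea gigas": (758, 17),
    "Bemisia tabaci": (758, 13),
    "Trichoplax adhaerens": (930, 12),
}


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


dates = [d for d, _ in ORTHOLOGS.values()]
identities = [i for _, i in ORTHOLOGS.values()]
r = pearson(dates, identities)
print(f"Pearson r between divergence date and identity: {r:.2f}")
```

A strongly negative coefficient is consistent with the pattern visible in the table: mammalian orthologs share roughly half of their residues with the human protein, while invertebrate orthologs share under a quarter.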