Endoplasmic reticulum aminopeptidase 1 (ERAP1) is an enzyme that in humans is encoded by the ERAP1 gene. This M1 zinc aminopeptidase is involved in the antigen processing and presentation pathway. ERAP1 is mainly located in the endoplasmic reticulum (ER), where it trims peptides at their N-terminus, adapting them for presentation by MHC class I molecules (MHC-I). [5]
Historical names of ERAP1 (HUGO Gene Nomenclature Committee):
Efficient presentation of antigenic peptides by MHC class I molecules provides the key signal for adaptive immune responses by cytotoxic (CD8+) T lymphocytes. In the "endogenous" antigen presentation pathway, proteins synthesized by cells undergo cytosolic degradation and some of their peptide fragments are transported to the ER, where peptides of suitable length are loaded onto MHC class I molecules. In the ER, ERAP1 shortens longer peptides to the optimal length for stable binding onto MHC class I molecules (Figure 1). ERAP1, like other components of the antigen processing and presentation (APP) pathway, is induced by interferon gamma, a cytokine that promotes antigen presentation. [6] ERAP1 preferentially trims N-extended substrates of 9-16 residues to optimally sized 8-10 residue peptides. [7] This “molecular ruler” mechanism is unique to ERAP1. ERAP1 also functions in the presentation of extracellular antigens in the cross-presentation pathway.
ERAP1 "trimming" modulates the peptide repertoire presented by MHC class I molecules and thereby shapes the adaptive immune response. [6] [8] [9] In murine models, ERAAP (the murine homologue of human ERAP1) deficiency results in a strong alteration and increased immunogenicity of the peptide repertoire presented by MHC-I. [10] Murine models with genetic deficiency for ERAAP have been instrumental for understanding the role of peptide trimming in the ER. [8] [11] [12]
Besides peptide trimming in the ER, ERAP1 has been proposed to perform additional functions depending on its location. ERAP1 can be secreted into the extracellular space in response to inflammatory stimuli, which can lead to the activation of immune cells, such as macrophages or natural killer cells, and enhanced expression of pro-inflammatory cytokines. [13]
The ERAP1 gene (HGNC: 18173) is located on the long arm of chromosome 5 (5q15). The gene is ~47 kb in length [14] and contains 20 exons and 19 introns, [15] which encode 9 different splice variants. The coding sequence shows a high degree of conservation among placental mammals (>80% identity). The sequences of 227 ERAP1 orthologs identified in approximately 200 species are available. [16]
ERAP1 is a polymorphic gene that has many single nucleotide variants (SNVs), including several common missense variants that alter the ERAP1 amino acid sequence. The various combinations of common SNVs in ERAP1 organize into distinct haplotypes that encode different protein isoforms, often referred to as “allotypes”. [17] The allotypes of ERAP1 can broadly be categorized by their enzymatic activity, ranging from “high” to “low”. [18] [19] The enzymatic activity of ERAP1 also depends on the substrate: for a given allotype, the trimming efficiency varies significantly between specific peptide substrates. For example, allotype 10, a poor trimming allotype, has been shown to be tenfold less active than the ancestral allotype in hydrolysis of the substrate l-leucine-7-amido-4-methylcoumarin (Leu-AMC). [19]
Genetic variants and haplotypes (i.e., allotypes) of ERAP1 have been associated with a wide variety of inflammatory conditions, infectious diseases, and cancer. In particular, ERAP1 is a major risk gene identified in genome-wide association studies of MHC-I associated inflammatory conditions (or “MHC-I-opathies”), including Ankylosing Spondylitis, [20] Behçet’s disease, [21] Birdshot Uveitis, [22] and Psoriasis. [23] In these conditions, ERAP1 is often in epistasis with the primary risk MHC-I allele. [24] Other disease associations include insulin-dependent Diabetes Mellitus and Multiple Sclerosis. [25] Historically, ERAP1 gene associations were first reported in Hypertension. [26] Emerging evidence links ERAP1 SNVs to cancer development and to susceptibility to infectious disease, [27] for example SNVs that modify resistance to influenza virus infection. [28]
ERAP1 belongs to the oxytocinase subfamily of the M1 family of zinc metalloproteases. It is composed of four structural domains. Domain I (residues 1-254) consists of an eight-stranded β-sheet and provides binding sites for the N-terminus of substrates. It fits against the catalytic domain II and engages with domain IV through an elongated loop. Domain II (residues 255-529) is the thermolysin-like catalytic domain, composed of an α-helix and a five-stranded β-sheet. This sheet contains the GAMEN motif, which is specific to exopeptidases and forms one part of the substrate-binding cleft. The catalytic Zn atom is coordinated by the residues His353, His357 and Glu376, found in the zinc-binding motif (H-E-X-X-H-X18-E) on helix 6a. Domain III (residues 530-614) is composed of two β-sheets forming a β-sandwich and acts as a linker between domains II and IV. Finally, domain IV (residues 615-941) consists mainly of α-helices and has a bowl-like shape. In the closed (active) state, it is juxtaposed with domain II, forming a large internal cavity that holds the C-terminal substrate binding site (Figure 2A). It is the most variable domain among this family of aminopeptidases. [29] [30]
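The two sequence motifs named above can be located in a protein sequence with a simple pattern search. The following is a minimal sketch, assuming the motif spacing exactly as written in the text (H-E-X-X-H-X18-E, with X standing for any residue); the function names and the toy sequence are hypothetical and are not taken from the ERAP1 sequence or from any published tool.

```python
import re

# Zinc-binding motif as written in the text, H-E-X-X-H-X18-E, where X is any
# residue. The lookahead lets overlapping matches be reported as well.
ZINC_MOTIF = re.compile(r"(?=(HE..H.{18}E))")
GAMEN_MOTIF = "GAMEN"


def find_zinc_motifs(sequence):
    """Return 1-based (first His, second His, Glu) positions for each match."""
    hits = []
    for match in ZINC_MOTIF.finditer(sequence):
        first_his = match.start() + 1
        # Second His is 4 residues after the first; the Glu follows 18
        # arbitrary residues later, i.e. 23 residues after the first His.
        hits.append((first_his, first_his + 4, first_his + 23))
    return hits


def find_gamen(sequence):
    """Return 1-based start positions of every GAMEN occurrence."""
    return [m.start() + 1 for m in re.finditer(GAMEN_MOTIF, sequence)]


# Hypothetical toy sequence, built only to exercise the two searches.
toy_sequence = "AAGAMENAA" + "HELAH" + "Q" * 18 + "E" + "AA"
print(find_gamen(toy_sequence))
print(find_zinc_motifs(toy_sequence))
```

A pattern hit only shows that the spacing of the motif is present; whether the matched residues actually coordinate zinc has to be established from structural data.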
ERAP1 uses a catalytic mechanism similar to the one proposed for LTA4 hydrolase. [29] ERAP1 adopts a thermolysin-like fold and has been crystallized in two distinct conformations, open and closed (Figure 2). In the open conformation, domain IV lies away from the active site, making the internal cavity more accessible to substrates. In the closed conformation, the internal cavity is occluded from the external solvent and is large enough to accommodate a 16-residue peptide. The catalytic residues, and in particular Tyr438, are optimally positioned for catalysis in the closed conformation. Consequently, substrate binding is hypothesized to take place in the open conformation, while N-terminal bond cleavage takes place in the closed one. [29] [30] It has been proposed that binding of substrate or small inhibitors induces conformational closing of ERAP1 in solution. [31]
ERAP1 prefers peptide substrates 9-16 amino acids long and is much less active for peptides 8-9 amino acids long. It is considered that ERAP1 uses a “molecular ruler” mechanism, according to which the substrate binds through its hydrophobic C-terminus in a hydrophobic pocket at the junction of domain III and domain IV and the N-terminus binds to the active site. When the length of the peptide is shorter than 8 or 9 amino acids, the peptide is too short to reach the active site, limiting rates of cleavage. [7]
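The length dependence described above can be illustrated with a toy model. This is a sketch under strong simplifying assumptions, not a kinetic model of the enzyme: trimming is treated as all-or-nothing, with one N-terminal residue removed per step while the peptide length lies in the preferred 9-16 residue window, and no trimming outside that window. The function name and the example peptide are hypothetical.

```python
# Preferred substrate length window taken from the text (9-16 residues).
MIN_SUBSTRATE_LEN = 9
MAX_SUBSTRATE_LEN = 16


def trim(peptide, min_len=MIN_SUBSTRATE_LEN, max_len=MAX_SUBSTRATE_LEN):
    """Return the peptide and every trimming intermediate, longest first.

    One N-terminal residue is removed per step for as long as the current
    length lies inside the preferred window. Real trimming rates fall off
    gradually rather than stopping at a hard cutoff.
    """
    steps = [peptide]
    while min_len <= len(peptide) <= max_len:
        peptide = peptide[1:]  # cleave the N-terminal residue
        steps.append(peptide)
    return steps


# Hypothetical 12-residue precursor with a 4-residue N-terminal extension.
for intermediate in trim("AKLMSIINFEKL"):
    print(len(intermediate), intermediate)
```

In this simplification every substrate inside the window ends as an 8-residue product, because trimming stops as soon as the peptide drops below 9 residues. The model also ignores the sequence preferences of the enzyme, which affect how quickly each residue is removed.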
ERAP1 has a wide substrate specificity with a preference for hydrophobic residues (e.g. leucine and methionine) at the N-terminus of the peptide substrate. [32] [33] Tryptophan, arginine, cysteine and charged amino acids, like aspartic and glutamic acid, are poorly removed. [30] [32] ERAP1’s trimming efficiency can also be influenced by the internal sequence of the peptide, with preferences for hydrophobic and positively charged residues. [32]
Hisatsune et al. demonstrated, by co-immunoprecipitation, an interaction between ERAP1 and the disulfide-shuffling chaperone ERp44, facilitated by disulfide bonds formed with cysteine residues in the exon 10 loop of ERAP1. [34] This interaction was proposed to be the main mechanism for ER retention.
Some experimental evidence has indicated the possibility of heterodimer formation between ERAP1 and ERAP2, another member of the oxytocinase subfamily of M1 aminopeptidases, which shares structural and functional similarities with ERAP1. Co-elution of ERAP1 and ERAP2 was detected in the 230 kDa fraction during microsome fractionation, suggesting the formation of heterodimers. [35] Proximity ligation assay analysis suggested a direct physical interaction between the two enzymes. [35] A leucine zipper-mediated ERAP1/ERAP2 complex exhibited enhanced trimming efficiency compared to a mixture of the two enzymes. [36] Computational dynamics showed that ERAP1/ERAP2 heterodimerization could be mediated by the exon 10 loop, [37] which is known to be involved in ERAP1-ERp44 interactions.
Chen et al. suggested that ERAP1 can trim N-terminally extended precursor antigenic peptides while they are bound to MHC-I. [38] However, a re-evaluation of this trimming model by kinetic and biochemical analyses suggested that most MHC-I bound peptides had limited to no access to the active site of ERAP1. [39]
Therapeutic approaches for ERAP1 regulation rely mostly on the development of small molecule inhibitors. The most explored classes are inhibitors that target the catalytic site and inhibitors that target the allosteric site.
The first generation of ERAP1 inhibitors is a series of phosphinic pseudopeptides, reported in 2013, that were derived from a rational design approach targeting the catalytic zinc-binding site. Notably, DG013A (Table 1, Figure 3) displayed high potency (ERAP1 IC50 = 33 nM) but poor selectivity against ERAP2 and IRAP, and reported SAR optimization studies demonstrated the importance of the side chains at positions P1’ and P2’. [40] [41] A high-resolution crystal structure of the phosphinic analogue DG046 bound in the active site of ERAP1 has been obtained (Figure 4). [42]
A novel family of zinc-targeting diaminobenzoic acid (DABA) compounds was rationally designed and developed in 2013, displaying micromolar potency for ERAP1 inhibition (compound 2, IC50 = 2 μM, Table 1). Moderate selectivity over ERAP2 and IRAP has been achieved through additional optimization efforts based on extensive investigation of SAR. [43] [44]
In 2020, urea derivative 3 (Table 1) was identified via high-throughput screening (HTS) as a competitive inhibitor of ERAP1 aminopeptidase activity (IC50 = 6.9 μM) with increased selectivity over ERAP2 and IRAP. [45] SAR exploration and docking studies showed that the N-acetylpiperazine carbonyl group, acting as a zinc-binding group, was critical for activity. [45]
A high-throughput library screen identified Clerodane acid, a natural product from Dodonaea viscosa, as a selective activator of ERAP1 for small substrates and an inhibitor of peptide trimming. This compound was found to block ERAP1-dependent antigen presentation (EC50 = 1 μM) and to bind to the regulatory site of ERAP1 that normally accommodates the C-termini of peptide substrates. [46] Clerodane acid was also found to induce strong immunopeptidome shifts in cancer cells. [47]
This family of compounds was identified in 2021 via fluorescence-based high-throughput screening as potential allosteric inhibitors that target the recognition site for peptide C-termini. [48] Compound 4 (Table 1) displayed high potency (ERAP1 IC50 = 34 nM) together with selectivity against ERAP2 and IRAP. [48]
Sulfonamide compound 5 (Table 1) was identified through high-throughput screening studies as a potential allosteric selective inhibitor, binding at the interface between domains II and IV of ERAP1. It activates hydrolysis of small peptides but effectively inhibits processing of long peptides of 8-13 residues (IC50 = 5.3 μM) and displays selectivity over ERAP2 and IRAP. [45]
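To put the reported IC50 values side by side, the sketch below converts them into the fraction of enzyme activity expected to remain at a given inhibitor concentration. It assumes a simple one-site dose-response relationship with a Hill slope of 1, which is an illustrative assumption and not a fit to the published data; the values also come from different assays and substrates, so the comparison is only indicative. The dictionary keys and the function name are hypothetical labels.

```python
# ERAP1 IC50 values quoted in the text, converted to nanomolar.
IC50_NM = {
    "DG013A (phosphinic pseudopeptide)": 33.0,
    "compound 2 (DABA)": 2000.0,
    "compound 3 (urea)": 6900.0,
    "compound 4": 34.0,
    "compound 5 (sulfonamide, long peptides)": 5300.0,
}


def remaining_activity(conc_nm, ic50_nm, hill=1.0):
    """Fraction of uninhibited activity left at inhibitor concentration conc_nm.

    Assumes a simple dose-response curve; by definition of the IC50 the
    result is 0.5 when conc_nm equals ic50_nm.
    """
    return 1.0 / (1.0 + (conc_nm / ic50_nm) ** hill)


# Rank the compounds from most to least potent and show the activity
# remaining at an arbitrary example concentration of 100 nM.
for name, ic50 in sorted(IC50_NM.items(), key=lambda item: item[1]):
    print(f"{name}: IC50 = {ic50:g} nM, "
          f"activity left at 100 nM = {remaining_activity(100.0, ic50):.2f}")
```

Under these assumptions the two nanomolar compounds leave roughly a quarter of the activity at 100 nM, while the micromolar compounds leave most of it. For compound 5 the model applies only to the processing of long peptides, since the text notes that it activates hydrolysis of small substrates.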
As of 2023, an ERAP1 inhibitor (GRWD5769) developed by Grey Wolf Therapeutics has entered a phase I/II clinical trial. Its safety, tolerability, efficacy, and pharmacokinetics are being evaluated, as monotherapy or in combination with the PD-1 immune checkpoint inhibitor Libtayo® (cemiplimab), in patients with virus-associated solid tumours (head and neck squamous cell carcinoma, cervical cancer, and hepatocellular carcinoma) that are particularly sensitive to ERAP1 inhibition. [49]
Calreticulin also known as calregulin, CRP55, CaBP3, calsequestrin-like protein, and endoplasmic reticulum resident protein 60 (ERp60) is a protein that in humans is encoded by the CALR gene.
Protein disulfide isomerase, or PDI, is an enzyme in the endoplasmic reticulum (ER) in eukaryotes and the periplasm of bacteria that catalyzes the formation and breakage of disulfide bonds between cysteine residues within proteins as they fold. This allows proteins to quickly find the correct arrangement of disulfide bonds in their fully folded state, and therefore the enzyme acts to catalyze protein folding.
MHC class I molecules are one of two primary classes of major histocompatibility complex (MHC) molecules and are found on the cell surface of all nucleated cells in the bodies of vertebrates. They also occur on platelets, but not on red blood cells. Their function is to display peptide fragments of proteins from within the cell to cytotoxic T cells; this will trigger an immediate response from the immune system against a particular non-self antigen displayed with the help of an MHC class I protein. Because MHC class I molecules present peptides derived from cytosolic proteins, the pathway of MHC class I presentation is often called cytosolic or endogenous pathway.
The T-cell receptor (TCR) is a protein complex found on the surface of T cells, or T lymphocytes, that is responsible for recognizing fragments of antigen as peptides bound to major histocompatibility complex (MHC) molecules. The binding between TCR and antigen peptides is of relatively low affinity and is degenerate: that is, many TCRs recognize the same antigen peptide and many antigen peptides are recognized by the same TCR.
Calnexin (CNX) is a 67 kDa integral protein of the endoplasmic reticulum (ER). It consists of a large N-terminal calcium-binding lumenal domain, a single transmembrane helix and a short, acidic cytoplasmic tail. In humans, calnexin is encoded by the gene CANX.
Aminopeptidases are enzymes that catalyze the cleavage of amino acids from the N-terminus (beginning) of proteins or peptides. They are found in many organisms; in the cell, they are found in many organelles, in the cytosol, and as membrane proteins. Aminopeptidases are used in essential cellular functions, and are often zinc metalloenzymes, containing a zinc cofactor.
Insulin regulated aminopeptidase (IRAP) is a protein that in humans is encoded by the leucyl and cystinyl aminopeptidase (LNPEP) gene. IRAP is a type II transmembrane protein which belongs to the oxytocinase subfamily of M1 aminopeptidases, alongside ERAP1 and ERAP2. It is also known as oxytocinase, leucyl and cystinyl aminopeptidase, placental leucine aminopeptidase (P-LAP), cystinyl aminopeptidase (CAP), and vasopressinase. IRAP is expressed in different cell types, mainly located in specialized regulated endosomes that can be recruited to the cell surface upon cell type-specific receptor activation.
MHC Class II molecules are a class of major histocompatibility complex (MHC) molecules normally found only on professional antigen-presenting cells such as dendritic cells, macrophages, some endothelial cells, thymic epithelial cells, and B cells. These cells are important in initiating immune responses.
Transporter associated with antigen processing (TAP) protein complex belongs to the ATP-binding-cassette transporter family. It delivers cytosolic peptides into the endoplasmic reticulum (ER), where they bind to nascent MHC class I molecules.
HLA class I histocompatibility antigen, alpha chain E (HLA-E) also known as MHC class I antigen E is a protein that in humans is encoded by the HLA-E gene. The human HLA-E is a non-classical MHC class I molecule that is characterized by a limited polymorphism and a lower cell surface expression than its classical paralogues. The functional homolog in mice is called Qa-1b, officially known as H2-T23.
The unfolded protein response (UPR) is a cellular stress response related to the endoplasmic reticulum (ER) stress. It has been found to be conserved between mammalian species, as well as yeast and worm organisms.
TAP-associated glycoprotein, also known as tapasin or TAPBP, is a protein that in humans is encoded by the TAPBP gene.
Protein disulfide-isomerase A3 (PDIA3), also known as glucose-regulated protein, 58-kD (GRP58), is an isomerase enzyme encoded by the autosomal gene PDIA3 in humans. This protein localizes to the endoplasmic reticulum (ER) and interacts with lectin chaperones calreticulin and calnexin (CNX) to modulate folding of newly synthesized glycoproteins. It is thought that complexes of lectins and this protein mediate protein folding by promoting formation of disulfide bonds in their glycoprotein substrates.
Leucyl aminopeptidases are enzymes that preferentially catalyze the hydrolysis of leucine residues at the N-terminus of peptides and proteins. Other N-terminal residues can also be cleaved, however. LAPs have been found across superkingdoms. Identified LAPs include human LAP, bovine lens LAP, porcine LAP, Escherichia coli LAP, and the solanaceous-specific acidic LAP (LAP-A) in tomato.
Binding immunoglobulin protein (BiPS) also known as 78 kDa glucose-regulated protein (GRP-78) or heat shock 70 kDa protein 5 (HSPA5) is a protein that in humans is encoded by the HSPA5 gene.
Minor histocompatibility antigen H13 is a protein that in humans is encoded by the HM13 gene.
PNGase, also known as N-glycanase 1 or peptide-N(4)-(N-acetyl-beta-glucosaminyl)asparagine amidase, is an enzyme that in humans is encoded by the NGLY1 gene. PNGase is a de-N-glycosylating enzyme that removes N-linked or asparagine-linked glycans (N-glycans) from glycoproteins. More specifically, NGLY1 catalyzes the hydrolysis of the amide bond between the innermost N-acetylglucosamine (GlcNAc) and an Asn residue on an N-glycoprotein, generating a de-N-glycosylated protein, in which the N-glycosylated Asn residue is converted to Asp, and a 1-amino-GlcNAc-containing free oligosaccharide. Ammonia is then spontaneously released from the 1-amino GlcNAc at physiological pH (<8), giving rise to a free oligosaccharide with an N,N’-diacetylchitobiose structure at the reducing end.
Immunoevasins are proteins expressed by some viruses that enable the virus to evade immune recognition by interfering with MHC I complexes in the infected cell, therefore blocking the recognition of viral protein fragments by CD8+ cytotoxic T lymphocytes. Less frequently, MHC II antigen presentation and induced-self molecules may also be targeted. Some viral immunoevasins block peptide entry into the endoplasmic reticulum (ER) by targeting the TAP transporters. Immunoevasins are particularly abundant in viruses that are capable of establishing long-term infections of the host, such as herpesviruses.
Endoplasmic reticulum aminopeptidase 2 (ERAP2) is a protein that in humans is encoded by the ERAP2 gene. ERAP2 is part of the M1 aminopeptidase family. It is expressed along with ERAP1 in the endoplasmic reticulum (ER). In the ER, both enzymes help process and present antigens by trimming the ends of precursor peptides. This creates the optimal pieces for display by Major Histocompatibility Complex (MHC) class I molecules.
The peptide-loading complex (PLC) is a short-lived, multisubunit membrane protein complex that is located in the endoplasmic reticulum (ER). It orchestrates peptide translocation and selection by major histocompatibility complex class I (MHC-I) molecules. Stable peptide-MHC I complexes are released to the cell surface to promote T-cell response against malignant or infected cells. In turn, T-cells recognize the activated peptides, which could be immunogenic or non-immunogenic.