Envelope protein | |
---|---|
Identifiers | |
Symbol | CoV_E |
Pfam | PF02723 |
InterPro | IPR003873 |
PROSITE | PS51926 |
The envelope (E) protein is the smallest and least well-characterized of the four major structural proteins found in coronavirus virions. [2] [3] [4] It is an integral membrane protein less than 110 amino acid residues long; [2] in SARS-CoV-2, the causative agent of COVID-19, the E protein is 75 residues long. [5] Although it is not necessarily essential for viral replication, absence of the E protein may produce abnormally assembled viral capsids or reduced replication. [2] [3] E is a multifunctional protein: [6] in addition to its role as a structural protein in the viral capsid, it is thought to be involved in viral assembly, likely functions as a viroporin, and is involved in viral pathogenesis. [2] [5]
The E protein consists of a short hydrophilic N-terminal region, a hydrophobic helical transmembrane domain, and a somewhat hydrophilic C-terminal region. In SARS-CoV and SARS-CoV-2, the C-terminal region contains a PDZ-binding motif (PBM). [2] [5] This feature appears to be conserved in the alpha and beta coronavirus groups, but not in the gamma group. [2] In the beta and gamma groups, the C-terminal region contains a conserved proline residue that is likely involved in targeting the protein to the Golgi. [2]
The transmembrane helices of the E proteins of SARS-CoV and SARS-CoV-2 can oligomerize and have been shown in vitro to form pentameric structures with central pores that serve as cation-selective ion channels. [5] Both viruses' E protein pentamers have been structurally characterized by nuclear magnetic resonance spectroscopy. [5] [7]
The membrane topology of the E protein has been studied in a number of coronaviruses with inconsistent results; the protein's orientation in the membrane may be variable. [3] The balance of evidence suggests the most common orientation has the C-terminus oriented toward the cytoplasm. [8] Studies of SARS-CoV-2 E protein are consistent with this orientation. [5] [9]
In some, but not all, coronaviruses, the E protein is post-translationally modified by palmitoylation on conserved cysteine residues. [2] [8] In the SARS-CoV E protein, one glycosylation site has been observed, which may influence membrane topology; [8] however, the functional significance of E glycosylation is unclear. [2] Ubiquitination of SARS-CoV E has also been described, though its functional significance is also not known. [2]
NCBI genome ID | 86693 |
---|---|
Genome size | 29,903 bases |
Year of completion | 2020 |
Genome browser (UCSC) |
The E protein is expressed at high abundance in infected cells. However, only a small amount of the total E protein produced is found in assembled virions. [2] [4] E protein is localized to the endoplasmic reticulum, Golgi apparatus, and endoplasmic-reticulum–Golgi intermediate compartment (ERGIC), the intracellular compartment that gives rise to the coronavirus viral envelope. [2] [5]
Studies in different coronaviruses have reached different conclusions about whether E is essential to viral replication. In some coronaviruses, including MERS-CoV, E has been reported to be essential. [10] In others, including mouse coronavirus [11] and SARS-CoV, E is not essential, though its absence reduces viral titer, [12] in some cases by introducing propagation defects or causing abnormal capsid morphology. [2]
The E protein is found in assembled virions where it forms protein-protein interactions with the coronavirus membrane protein (M), the most abundant of the four structural proteins contained in the viral capsid. [2] [4] The interaction between E and M occurs through their respective C-termini on the cytoplasmic side of the membrane. [2] In most coronaviruses, E and M are sufficient to form virus-like particles, [2] [4] though SARS-CoV has been reported to depend on N as well. [14] There is good evidence that E is involved in inducing membrane curvature to create the typical spherical coronavirus virion. [2] [15] It is likely that E is involved in viral budding or scission, although its role in this process has not been well characterized. [2] [4] [15]
In its pentameric state, E forms cation-selective ion channels and likely functions as a viroporin. [5] NMR studies show that the viroporin adopts an open conformation at low pH or in the presence of calcium ions, while the closed conformation is favored at basic pH. [16] The NMR structure shows a hydrophobic gate at leucine 28 in the middle of the pore. The passage of ions through the gate is thought to be facilitated by the polar residues at the C-terminus. [17]
The cation leakage may disrupt ion homeostasis, alter membrane permeability, and modulate pH in the host cell, which may facilitate viral release. [2] [4]
The E protein's role as a viroporin appears to be involved in pathogenesis and may be related to activation of the inflammasome. [3] [18] In SARS-CoV, mutations that disrupt E's ion channel function result in attenuated pathogenesis in animal models despite little effect on viral growth. [10]
Protein-protein interactions between E and host cell proteins are best described in SARS-CoV and occur via the C-terminal PDZ-binding motif (PBM). The SARS-CoV E protein has been reported to interact with five host cell proteins: Bcl-xL, PALS1, syntenin, sodium/potassium (Na+/K+) ATPase α-1 subunit, and stomatin. [2] The interaction with PALS1 may be related to pathogenesis via the resulting disruption in tight junctions. [3] [10] This interaction has also been identified in SARS-CoV-2. [19]
The sequence of the E protein is not well conserved across coronavirus genera, with sequence identities falling below 30%. [12] In laboratory experiments on mouse hepatitis virus, substitution of E proteins from different coronaviruses, even from different groups, could produce viable viruses, suggesting that significant sequence diversity can be tolerated in functional E proteins. [20] The SARS-CoV-2 E protein is very similar to that of SARS-CoV, differing by three substitutions and one deletion. [4] A study of SARS-CoV-2 sequences suggests that the E protein is evolving relatively slowly compared to other structural proteins. [21] Because the envelope protein is conserved among SARS-CoV and SARS-CoV-2 variants, it has been researched as a potential target for universal coronavirus vaccine development. [22] [23]