|Coronavirus spike glycoprotein S1|
|Coronavirus spike glycoprotein S2|
|Betacoronavirus spike glycoprotein S1, receptor binding|
|Betacoronavirus-like spike glycoprotein S1, N-terminal|
Spike (S) protein (sometimes also spike glycoprotein, formerly known as E2) is the largest of the four major structural proteins found in coronaviruses. The spike protein assembles into trimers that form large structures, called spikes or peplomers, that project from the surface of the virion. The distinctive appearance of these spikes when visualized using negative stain transmission electron microscopy, "recalling the solar corona", gives the virus family its name.
The function of the spike protein is to mediate viral entry into the host cell by first interacting with molecules on the exterior cell surface and then fusing the viral and cellular membranes. The spike protein is a class I fusion protein that contains two regions, known as S1 and S2, responsible for these two functions. The S1 region contains the receptor-binding domain that binds to receptors on the cell surface. Coronaviruses use a very diverse range of receptors; SARS-CoV (which causes SARS) and SARS-CoV-2 (which causes COVID-19) both interact with angiotensin-converting enzyme 2 (ACE2). The S2 region contains the fusion peptide and other fusion infrastructure necessary for membrane fusion with the host cell, a required step for infection and viral replication. The spike protein determines the virus' host range (which organisms it can infect) and cell tropism (which cells or tissues it can infect within an organism).
The spike protein is highly immunogenic. Antibodies against the spike protein are found in patients recovered from SARS and COVID-19. Neutralizing antibodies target epitopes on the receptor-binding domain. Most COVID-19 vaccine development efforts in response to the COVID-19 pandemic aim to activate the immune system against the spike protein.
The spike protein is very large, often 1200 to 1400 amino acid residues long; it is 1273 residues in SARS-CoV-2. It is a single-pass transmembrane protein with a short C-terminal tail on the interior of the virus, a transmembrane helix, and a large N-terminal ectodomain exposed on the virus exterior.
The spike protein forms homotrimers in which three copies of the protein interact through their ectodomains. The trimer structures have been described as club-, pear-, or petal-shaped. Each spike protein contains two regions known as S1 and S2, and in the assembled trimer the S1 regions at the N-terminal end form the portion of the protein furthest from the viral surface while the S2 regions form a flexible "stalk" containing most of the protein-protein interactions that hold the trimer in place.
The S1 region of the spike protein is responsible for interacting with receptor molecules on the surface of the host cell in the first step of viral entry. S1 contains two domains, called the N-terminal domain (NTD) and C-terminal domain (CTD), sometimes also known as the A and B domains. Depending on the coronavirus, either or both domains may be used as receptor-binding domains (RBD). Target receptors can be very diverse, including cell surface receptor proteins and sugars such as sialic acids, which can serve as receptors or coreceptors. In general, the NTD binds sugar molecules while the CTD binds proteins, with the exception of mouse hepatitis virus, which uses its NTD to interact with a protein receptor called CEACAM1. The NTD has a galectin-like protein fold, but binds sugar molecules somewhat differently than galectins.
The CTD is responsible for the interactions of MERS-CoV with its receptor dipeptidyl peptidase-4, and those of SARS-CoV and SARS-CoV-2 with their receptor angiotensin-converting enzyme 2 (ACE2). The CTD of these viruses can be further divided into two subdomains, known as the core and the extended loop or receptor-binding motif (RBM), where most of the residues that directly contact the target receptor are located. There are subtle differences, mainly in the RBM, between the SARS-CoV and SARS-CoV-2 spike proteins' interactions with ACE2. Comparisons of spike proteins from multiple coronaviruses suggest that divergence in the RBM region can account for differences in target receptors, even when the core of the S1 CTD is structurally very similar.
Within coronavirus lineages, as well as across the four major coronavirus subgroups, the S1 region is less well conserved than S2, as befits its role in interacting with virus-specific host cell receptors. Within the S1 region, the NTD is more highly conserved than the CTD.
The S2 region of the spike protein is responsible for membrane fusion between the viral envelope and the host cell, enabling entry of the virus' genome into the cell. The S2 region contains the fusion peptide, a stretch of mostly hydrophobic amino acids whose function is to enter and destabilize the host cell membrane. S2 also contains two heptad repeat subdomains known as HR1 and HR2, sometimes called the "fusion core" region. These subdomains undergo dramatic conformational changes during the fusion process to form a six-helix bundle, a characteristic feature of the class I fusion proteins. The S2 region is also considered to include the transmembrane helix and C-terminal tail located in the interior of the virion.
Relative to S1, the S2 region is very well conserved among coronaviruses.
The spike protein is a glycoprotein and is heavily glycosylated through N-linked glycosylation. Studies of the SARS-CoV-2 spike protein have also reported O-linked glycosylation in the S1 region. The C-terminal tail, located in the interior of the virion, is enriched in cysteine residues and is palmitoylated.
Spike proteins are activated through proteolytic cleavage. They are cleaved by host cell proteases at the S1-S2 boundary and later at what is known as the S2' site at the N-terminus of the fusion peptide.
Like other class I fusion proteins, the spike protein undergoes a very large conformational change during the fusion process. Both the pre-fusion and post-fusion states of several coronaviruses, especially SARS-CoV-2, have been studied by cryo-electron microscopy. Functionally important protein dynamics have also been observed within the pre-fusion state, in which the orientations of some of the S1 regions relative to S2 in a trimer can vary. In the closed state, all three S1 regions are packed closely and the region that makes contact with host cell receptors is sterically inaccessible, while the open states have one or two S1 RBDs more accessible for receptor binding, in an open or "up" conformation.
|Genome size||29,903 bases|
|Year of completion||2020|
The gene encoding the spike protein is located toward the 3' end of the virus's positive-sense RNA genome, along with the genes for the other three structural proteins and various virus-specific accessory proteins. Protein trafficking of spike proteins appears to depend on the coronavirus subgroup: when expressed in isolation without other viral proteins, spike proteins from betacoronaviruses are able to reach the cell surface, while those from alphacoronaviruses and gammacoronaviruses are retained intracellularly. In the presence of the M protein, spike protein trafficking is altered and the spike protein is instead retained at the ERGIC, the site at which viral assembly occurs. In SARS-CoV-2, both the M and the E protein modulate spike protein trafficking through different mechanisms.
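As a rough sense of scale, the size of the spike coding sequence can be estimated from the figures quoted in this article for SARS-CoV-2 (a 1273-residue protein and a genome of 29,903 bases). The sketch below assumes a single uninterrupted open reading frame with one codon per residue plus one stop codon; the function and variable names are illustrative only.

```python
# Figures quoted in this article for SARS-CoV-2.
SPIKE_RESIDUES = 1273   # amino acid residues in the spike protein
GENOME_BASES = 29_903   # genome size in bases


def coding_length(residues: int) -> int:
    """Bases needed to encode a protein, assuming one codon (3 bases)
    per residue plus a single stop codon (an assumption of this sketch)."""
    return 3 * (residues + 1)


spike_gene_bases = coding_length(SPIKE_RESIDUES)
spike_fraction = spike_gene_bases / GENOME_BASES

print(spike_gene_bases)         # 3822
print(f"{spike_fraction:.1%}")  # 12.8%
```

Under these assumptions the spike coding sequence accounts for roughly one eighth of the genome.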
The spike protein is not required for viral assembly or the formation of virus-like particles; however, presence of spike may influence the size of the envelope. Incorporation of the spike protein into virions during assembly and budding is dependent on protein-protein interactions with the M protein through the C-terminal tail. Examination of virions using cryo-electron microscopy suggests that there are approximately 25 to 100 spike trimers per virion.
The spike protein is responsible for viral entry into the host cell, a required early step in viral replication. It is essential for replication. It performs this function in two steps, first binding to a receptor on the surface of the host cell through interactions with the S1 region, and then fusing the viral and cellular membranes through the action of the S2 region. The location of fusion varies depending on the specific coronavirus, with some able to enter at the plasma membrane and others entering from endosomes after endocytosis.
The interaction of the receptor-binding domain in the S1 region with its target receptor on the cell surface initiates the process of viral entry. Coronaviruses vary widely in their target receptors: some use sugar molecules such as sialic acids, while others form protein-protein interactions with proteins exposed on the cell surface. The presence of a target receptor that S1 can bind is a determinant of host range and cell tropism.
|Species||Genus||Receptor|
|Human coronavirus 229E||Alphacoronavirus||Aminopeptidase N|
|Human coronavirus NL63||Alphacoronavirus||Angiotensin-converting enzyme 2|
|Human coronavirus HKU1||Betacoronavirus||N-acetyl-9-O-acetylneuraminic acid|
|Human coronavirus OC43||Betacoronavirus||N-acetyl-9-O-acetylneuraminic acid|
|Middle East respiratory syndrome–related coronavirus||Betacoronavirus||Dipeptidyl peptidase-4|
|Severe acute respiratory syndrome coronavirus||Betacoronavirus||Angiotensin-converting enzyme 2|
|Severe acute respiratory syndrome coronavirus 2||Betacoronavirus||Angiotensin-converting enzyme 2|
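The table above can be treated as a small lookup structure. The following is a minimal sketch, using only the rows listed in the table, that groups viruses by receptor and reports which receptors are used by viruses from more than one genus; the function names are illustrative and not taken from any existing tool.

```python
from collections import defaultdict

# (virus, genus, receptor) rows copied from the table above.
RECEPTORS = [
    ("Human coronavirus 229E", "Alphacoronavirus", "Aminopeptidase N"),
    ("Human coronavirus NL63", "Alphacoronavirus", "Angiotensin-converting enzyme 2"),
    ("Human coronavirus HKU1", "Betacoronavirus", "N-acetyl-9-O-acetylneuraminic acid"),
    ("Human coronavirus OC43", "Betacoronavirus", "N-acetyl-9-O-acetylneuraminic acid"),
    ("Middle East respiratory syndrome–related coronavirus", "Betacoronavirus", "Dipeptidyl peptidase-4"),
    ("Severe acute respiratory syndrome coronavirus", "Betacoronavirus", "Angiotensin-converting enzyme 2"),
    ("Severe acute respiratory syndrome coronavirus 2", "Betacoronavirus", "Angiotensin-converting enzyme 2"),
]


def group_by_receptor(rows):
    """Map each receptor to the list of (virus, genus) pairs that use it."""
    groups = defaultdict(list)
    for virus, genus, receptor in rows:
        groups[receptor].append((virus, genus))
    return dict(groups)


def receptors_shared_across_genera(rows):
    """Return receptors used by viruses belonging to more than one genus."""
    shared = []
    for receptor, users in group_by_receptor(rows).items():
        genera = {genus for _, genus in users}
        if len(genera) > 1:
            shared.append(receptor)
    return shared


print(receptors_shared_across_genera(RECEPTORS))
# ['Angiotensin-converting enzyme 2']
```

Among the viruses listed, only angiotensin-converting enzyme 2 is shared between an alphacoronavirus and betacoronaviruses.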
Proteolytic cleavage of the spike protein, sometimes known as "priming", is required for membrane fusion. Relative to other class I fusion proteins, this process is complex and requires two cleavages at different sites, one at the S1/S2 boundary and one at the S2' site to release the fusion peptide. Coronaviruses vary in which part of the viral life cycle these cleavages occur, particularly the S1/S2 cleavage. Many coronaviruses are cleaved at S1/S2 before viral exit from the virus-producing cell, by furin and other proprotein convertases; in SARS-CoV-2 a polybasic furin cleavage site is present at this position. Others may be cleaved by extracellular proteases such as elastase, by proteases located on the cell surface after receptor binding, or by proteases found in lysosomes after endocytosis. The specific proteases responsible for this cleavage depend on the virus, cell type, and local environment. In SARS-CoV, the serine protease TMPRSS2 is important for this process, with additional contributions from cysteine proteases cathepsin B and cathepsin L in endosomes. Trypsin and trypsin-like proteases have also been reported to contribute. In SARS-CoV-2, TMPRSS2 is the primary protease for S2' cleavage, and its presence is reported to be essential for viral infection.
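A polybasic cleavage site of the kind described above can be located in a protein sequence with a simple motif scan. The following is a minimal sketch, assuming the commonly cited minimal furin recognition motif R-X-X-R (arginine, any two residues, arginine, with cleavage after the final arginine). Real protease specificity is more complex than this pattern. The function name is hypothetical, and the example fragment is a short illustrative sequence resembling the SARS-CoV-2 S1/S2 junction that should be checked against a reference sequence before being relied on.

```python
import re

# Minimal furin-like motif R-X-X-R. The lookahead allows overlapping
# matches to be reported. This is a simplification, not a full model
# of furin substrate specificity.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_like_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched motif) for each R-X-X-R hit."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in FURIN_MINIMAL.finditer(seq)]


# Illustrative fragment only (assumption: resembles the S1/S2 junction).
fragment = "TNSPRRARSVASQ"
print(find_furin_like_sites(fragment))  # [(4, 'RRAR')]
```

A hit only indicates that the minimal motif is present; whether a site is actually cleaved depends on the virus, cell type, and local environment, as noted above.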
Like other class I fusion proteins, the spike protein in its pre-fusion conformation is in a metastable state. A dramatic conformational change is triggered to induce the heptad repeats in the S2 region to refold into an extended six-helix bundle, causing the fusion peptide to interact with the cell membrane and bringing the viral and cell membranes into close proximity. Receptor binding and proteolytic cleavage (sometimes known as "priming") are required, but additional triggers for this conformational change vary depending on the coronavirus and local environment. In vitro studies of SARS-CoV suggest a dependence on calcium concentration. Unusually for coronaviruses, infectious bronchitis virus, which infects birds, can be triggered by low pH alone; for other coronaviruses, low pH is not itself a trigger but may be required for activity of proteases, which in turn are required for fusion. The location of membrane fusion (at the plasma membrane or in endosomes) may vary based on the availability of these triggers for conformational change. Fusion of the viral and cell membranes permits the entry of the virus' positive-sense RNA genome into the host cell cytosol, after which expression of viral proteins begins.
In addition to fusion of viral and host cell membranes, some coronavirus spike proteins can initiate membrane fusion between infected cells and neighboring cells, forming syncytia. This behavior can be observed in infected cells in cell culture. Syncytia have been observed in patient tissue samples from infections with SARS-CoV, MERS-CoV, and SARS-CoV-2, though some reports highlight a difference in syncytia formation between the SARS-CoV and SARS-CoV-2 spikes attributed to sequence differences near the S1/S2 cleavage site.
Because it is exposed on the surface of the virus, the spike protein is a major antigen to which neutralizing antibodies are developed. Its extensive glycosylation can serve as a glycan shield that hides epitopes from the immune system. Due to the outbreak of SARS and the COVID-19 pandemic, antibodies to SARS-CoV and SARS-CoV-2 spike proteins have been extensively studied. Antibodies to the SARS-CoV and SARS-CoV-2 spike proteins have been identified that target epitopes on the receptor-binding domain or interfere with the process of conformational change. The majority of antibodies from infected individuals target the receptor-binding domain.
In response to the COVID-19 pandemic, a number of COVID-19 vaccines have been developed using a variety of technologies, including mRNA vaccines and viral vector vaccines. Most vaccine development has targeted the spike protein. Building on techniques previously used in vaccine research aimed at respiratory syncytial virus and SARS-CoV, many SARS-CoV-2 vaccine development efforts have used constructs that include mutations to stabilize the spike protein's pre-fusion conformation, facilitating development of antibodies against epitopes exposed in this conformation.
Monoclonal antibodies that target the spike protein have been developed as COVID-19 treatments. As of July 8, 2021, three monoclonal antibody products had received Emergency Use Authorization in the United States: bamlanivimab/etesevimab, casirivimab/imdevimab, and sotrovimab. Bamlanivimab/etesevimab was not recommended in the United States due to the increase in SARS-CoV-2 variants that are less susceptible to these antibodies.
Throughout the COVID-19 pandemic, the genome of SARS-CoV-2 viruses was sequenced many times, resulting in identification of thousands of distinct variants. Many of these possess mutations that change the amino acid sequence of the spike protein. In a World Health Organization analysis from July 2020, the spike (S) gene was the second most frequently mutated in the genome, after ORF1ab (which encodes most of the virus' nonstructural proteins). The evolution rate in the spike gene is higher than that observed in the genome overall. Analyses of SARS-CoV-2 genomes suggest that some sites in the spike protein sequence, particularly in the receptor-binding domain, are of evolutionary importance and are undergoing positive selection.
Spike protein mutations raise concern because they may affect infectivity or transmissibility, or facilitate immune escape. The mutation D614G has arisen independently in multiple viral lineages and become dominant among sequenced genomes; it may confer advantages in infectivity and transmissibility, possibly by increasing the density of spikes on the viral surface, increasing the proportion of binding-competent conformations, or improving stability. Mutations at position E484, particularly E484K, have been associated with immune escape and reduced antibody binding.
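Labels such as D614G and E484K follow the standard substitution notation: the one-letter code of the original amino acid, the residue position, and the one-letter code of the replacement. The following is a minimal sketch of a parser for this notation; the class and function names are hypothetical, and the name table covers only the residues mentioned in this section.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    wild_type: str  # one-letter code of the original residue
    position: int   # 1-based residue position in the protein
    mutant: str     # one-letter code of the replacement residue


_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")

# Only the residues mentioned in this section; not a complete table.
NAMES = {"D": "aspartic acid", "G": "glycine", "E": "glutamic acid", "K": "lysine"}


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D614G' into its three components."""
    match = _PATTERN.match(label.strip().upper())
    if match is None:
        raise ValueError(f"not a simple substitution label: {label!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def describe(label: str) -> str:
    """Spell out a substitution label in words."""
    s = parse_substitution(label)
    old = NAMES.get(s.wild_type, s.wild_type)
    new = NAMES.get(s.mutant, s.mutant)
    return f"{old} at position {s.position} replaced by {new}"


print(describe("D614G"))  # aspartic acid at position 614 replaced by glycine
print(describe("E484K"))  # glutamic acid at position 484 replaced by lysine
```

The parser handles only single-residue substitutions; deletions and insertions use other notations and are rejected by this sketch.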
During the COVID-19 pandemic, anti-vaccination misinformation about COVID-19 circulated on social media platforms related to the spike protein's role in COVID-19 vaccines. Spike proteins were said to be dangerously "cytotoxic" and mRNA vaccines containing them therefore in themselves dangerous. Spike proteins are not cytotoxic or dangerous. Spike proteins were also said to be "shed" by vaccinated people, in an erroneous allusion to the phenomenon of vaccine-induced viral shedding, which is a rare effect of live-virus vaccines unlike those used for COVID-19. "Shedding" of spike proteins is not possible.
The class I fusion proteins, a group whose well-characterized examples include the coronavirus spike protein, influenza virus hemagglutinin, and HIV Gp41, are thought to be evolutionarily related. The S2 region of the spike protein responsible for membrane fusion is more highly conserved than the S1 region responsible for receptor interactions. The S1 region appears to have undergone significant diversifying selection.
Within the S1 region, the N-terminal domain is more conserved than the C-terminal domain. The NTD's galectin-like protein fold suggests a relationship with structurally similar cellular proteins from which it may have evolved through gene capture from the host. It has been suggested that the CTD may have evolved from the NTD by gene duplication. The surface-exposed position of the CTD, vulnerable to the host immune system, may place this region under high selective pressure. Comparisons of the structures of different coronavirus CTDs suggest they may be under diversifying selection and that, in some cases, distantly related coronaviruses that use the same cell-surface receptor may do so through convergent evolution.
The genomic region that encodes the spike protein is also a hotspot for homologous as well as non-homologous recombination. In spike recombination events among different coronavirus subgenera in particular, the crossover sites tend to be double rather than single, meaning that the spike gene moves to another subgenus in a cassette-like mode. So far, there is no evidence that the spike gene has moved by recombination from one subgenus to another within the genus Betacoronavirus (which includes the human viruses SARS-CoV, SARS-CoV-2, MERS-CoV, HKU1, and OC43). However, many such spike recombination events have occurred among different subgenera in the Alphacoronavirus, Gammacoronavirus, and Deltacoronavirus genera, and they are a significant cause for concern and vigilance regarding the evolution of the COVID-19 pandemic.