Betacoronavirus NS7A protein | |
---|---|
Identifiers | |
Symbol | bCoV_NS7A |
Pfam | PF08779 |
InterPro | IPR014888 |
ORF7a (also known by several other names, including SARS coronavirus X4, SARS-X4, or U122) [1] is a gene found in coronaviruses of the Betacoronavirus genus. It expresses the Betacoronavirus NS7A protein, a type I transmembrane protein with an immunoglobulin-like protein domain. It was first discovered in SARS-CoV, the virus that causes severe acute respiratory syndrome (SARS). [2] The homolog in SARS-CoV-2, the virus that causes COVID-19, has about 85% sequence identity to the SARS-CoV protein. [3]
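As a rough illustration of what a figure such as "85% sequence identity" measures, the sketch below computes percent identity over two pre-aligned sequences. The function name `percent_identity`, the choice to ignore gapped columns, and the toy sequences are assumptions made for this example; published identity values depend on the alignment method and on how gaps are counted.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up aligned fragments for illustration only (not real ORF7a sequences).
toy_a = "ACDEFGHIK-LM"
toy_b = "ACDEYGHIKWLM"
print(round(percent_identity(toy_a, toy_b), 1))
```

Other conventions, such as dividing by the full alignment length or by the length of the shorter sequence, give somewhat different numbers for the same alignment.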
A number of possible functions for the ORF7a protein have been described. The primary function is thought to be immunomodulation and interferon antagonism. The protein is not essential for viral replication. [1]
Studies in SARS-CoV suggest that the protein forms protein-protein interactions with spike protein and ORF3a, and is present in mature virions, making it a minor viral structural protein. [1] [4] It is unclear if this occurs in SARS-CoV-2. [5] It may have a role in viral assembly. [1]
A number of interactions with host proteins and effects on host cell processes have been described. The SARS-CoV ORF7a protein has been reported to have binding activity to integrin I domains. [6]
It has also been reported to induce apoptosis via a caspase-dependent pathway. [1] [7] In addition, it contains a motif that has been demonstrated to mediate COPII-dependent transport out of the endoplasmic reticulum, and the protein is targeted to the Golgi apparatus. [8]
In SARS-CoV-2, ORF7a protein has been described as an effective interferon antagonist. [3] The SARS-CoV-2 protein may have immunomodulatory effects through interaction with monocytes. [5]
The ORF7a protein is a transmembrane protein with 121 amino acid residues in SARS-CoV-2 [5] and 122 in SARS-CoV. [2] It is a type I transmembrane protein with an N-terminal signal peptide, an ectodomain that has an immunoglobulin fold, and a C-terminal endoplasmic reticulum retention signal sequence. [5] [6] [1] The structure contains seven beta strands which form two beta sheets, arranged in a beta sandwich. [2] Most of the sequence differences between SARS-CoV and SARS-CoV-2 occur in the Ig-like ectodomain and may produce differences in protein-protein interactions. [5]
The SARS-CoV-2 ORF7a protein has been reported to be post-translationally modified by ubiquitination. Polyubiquitin chains attached to lysine 119 may be related to the protein's reported interferon antagonism. [3] [9]
NCBI genome ID | 86693 |
---|---|
Genome size | 29,903 bases |
Year of completion | 2020 |
Along with the genes for other viral accessory proteins, the ORF7a gene is located near those encoding the viral structural proteins, at the 3' end of the coronavirus RNA genome. [3] ORF7a overlaps the neighboring ORF7b gene. [10] In SARS-CoV, subcellular localization to the endoplasmic reticulum, Golgi apparatus, and ERGIC has been reported, [1] with similar Golgi localization described for SARS-CoV-2. [11]
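Whether two annotated genes overlap can be checked directly from their genome coordinates. The sketch below is a minimal example assuming 1-based, inclusive coordinates; the `Gene` type, the `overlap_length` helper, and the coordinates themselves are placeholders invented for illustration and are not taken from any genome annotation.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlap_length(a: Gene, b: Gene) -> int:
    """Number of positions shared by two inclusive intervals (0 if none)."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


# Placeholder coordinates for illustration only.
gene_a = Gene("geneA", 100, 465)
gene_b = Gene("geneB", 462, 593)
print(overlap_length(gene_a, gene_b))
```

A positive result means the end of one open reading frame runs into the start of the next, which is the situation described for ORF7a and ORF7b.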
It is thought that ORF8 in SARS-CoV-2, which encodes a protein with a similar Ig-like fold, may be a paralog of ORF7a that originated through gene duplication, [13] [14] though some bioinformatics analyses suggest the similarity may be too low to support duplication, which is relatively uncommon in viruses. [15] Immunoglobulin domains are uncommon in coronaviruses; other than the subset of betacoronaviruses with ORF8 and ORF7a, only a small number of bat alphacoronaviruses have been identified as containing likely Ig domains, while they are absent from gammacoronaviruses and deltacoronaviruses. [16] [14] The Ig domains of betacoronaviruses and alphacoronaviruses may be independent acquisitions, with ORF8 and ORF7a possibly acquired from host proteins. [16]
Many SARS-CoV-2 genomes have been sequenced throughout the COVID-19 pandemic and a number of variations have been reported, including deletion mutations, [17] nonsense mutations (introducing a premature stop codon and truncating the protein), [18] and at least one gene fusion. [19]
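To make the effect of a nonsense mutation concrete, the sketch below scans a coding sequence codon by codon and reports where the first in-frame stop codon falls. The helper name `first_stop_codon_index` and both sequences are made up for this example; the sequences are written in the DNA alphabet, as is common for sequence records, even though the viral genome is RNA.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon_index(cds: str):
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    for i in range(0, usable, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy sequences for illustration only (not real ORF7a sequences).
reference = "ATGGCTCAAGGTTAA"  # ATG GCT CAA GGT TAA
mutant = "ATGGCTTAAGGTTAA"     # a C-to-T change turns CAA into the stop codon TAA

print(first_stop_codon_index(reference), first_stop_codon_index(mutant))
```

In the toy mutant the stop codon appears two codons earlier than in the reference, so translation would end sooner and the resulting protein would be truncated.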
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, they cause respiratory tract infections that can range from mild to lethal. Mild illnesses in humans include some cases of the common cold, while more lethal varieties can cause SARS, MERS and COVID-19, which is causing the ongoing pandemic. In cows and pigs they cause diarrhea, while in mice they cause hepatitis and encephalomyelitis.
Severe acute respiratory syndrome–related coronavirus is a species of virus consisting of many known strains phylogenetically related to severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) that have been shown to possess the capability to infect humans, bats, and certain other mammals. These enveloped, positive-sense single-stranded RNA viruses enter host cells by binding to the angiotensin-converting enzyme 2 (ACE2) receptor. The SARSr-CoV species is a member of the genus Betacoronavirus and of the subgenus Sarbecovirus.
Murine coronavirus (M-CoV) is a virus in the genus Betacoronavirus that infects mice. Belonging to the subgenus Embecovirus, murine coronavirus strains are enterotropic or polytropic. Enterotropic strains include mouse hepatitis virus (MHV) strains D, Y, RI, and DVIM, whereas polytropic strains, such as JHM and A59, primarily cause hepatitis, enteritis, and encephalitis. Murine coronavirus is an important pathogen in the laboratory mouse and the laboratory rat. It is the most studied coronavirus in animals other than humans, and has been used as an animal disease model for many virological and clinical studies.
Putative transmembrane domain, more commonly known as non-structural protein 6 (NSP6), is one of the two non-structural proteins encoded by gene 11 in rotavirus, alongside NSP5. NSP6 is composed of six transmembrane domains and a C-terminal tail. In contrast to the other rotavirus non-structural proteins, NSP6 was found to have a high rate of turnover, being completely degraded within 2 hours of synthesis. NSP6 was found to be a sequence-independent nucleic acid binding protein, with similar affinities for ssRNA and dsRNA.
Poliovirus receptor-related 1 (PVRL1), also known as nectin-1 and CD111 (formerly herpesvirus entry mediator C, HVEC), is a human protein of the immunoglobulin superfamily (IgSF), also considered a member of the nectins. It is a membrane protein with three extracellular immunoglobulin domains, a single transmembrane helix and a cytoplasmic tail. The protein can mediate Ca2+-independent cellular adhesion, further characterizing it as an IgSF cell adhesion molecule (IgSF CAM).
Feline coronavirus (FCoV) is a positive-stranded RNA virus that infects cats worldwide. It is a coronavirus of the species Alphacoronavirus 1, which includes canine coronavirus (CCoV) and porcine transmissible gastroenteritis coronavirus (TGEV). FCoV has two different forms: feline enteric coronavirus (FECV), which infects the intestines, and feline infectious peritonitis virus (FIPV), which causes the disease feline infectious peritonitis (FIP).
Betacoronavirus is one of four genera of coronaviruses. Member viruses are enveloped, positive-strand RNA viruses that infect mammals. The natural reservoirs for betacoronaviruses are bats and rodents. Rodents are the reservoir for the subgenus Embecovirus, while bats are the reservoir for the other subgenera.
ORF3b is a gene found in coronaviruses of the subgenus Sarbecovirus, encoding a short non-structural protein. It is present in both SARS-CoV and SARS-CoV-2, though the protein product has very different lengths in the two viruses. The encoded protein is significantly shorter in SARS-CoV-2, at only 22 amino acid residues compared to 153–155 in SARS-CoV. Both the longer SARS-CoV and shorter SARS-CoV-2 proteins have been reported as interferon antagonists. It is unclear whether the SARS-CoV-2 gene expresses a functional protein.
ORF3d is a gene found in SARS-CoV-2 and at least one closely related coronavirus found in pangolins, though it is not found in other closely related viruses within the Sarbecovirus subgenus. It is 57 codons long and encodes a novel 57 amino acid residue protein of unknown function. At least two isoforms have been described, of which the shorter 33-residue form, ORF3d-2, may be more highly expressed, or even the only form expressed. It is reported to be antigenic and antibodies to the ORF3d protein occur in patients recovered from COVID-19. There is no homolog in the genome of the otherwise closely related SARS-CoV.
The envelope (E) protein is the smallest and least well-characterized of the four major structural proteins found in coronavirus virions. It is an integral membrane protein less than 110 amino acid residues long; in SARS-CoV-2, the causative agent of COVID-19, the E protein is 75 residues long. Although it is not necessarily essential for viral replication, absence of the E protein may produce abnormally assembled viral capsids or reduced replication. E is a multifunctional protein and, in addition to its role as a structural protein in the viral capsid, it is thought to be involved in viral assembly, likely functions as a viroporin, and is involved in viral pathogenesis.
The membrane (M) protein is an integral membrane protein that is the most abundant of the four major structural proteins found in coronaviruses. The M protein organizes the assembly of coronavirus virions through protein-protein interactions with other M protein molecules as well as with the other three structural proteins, the envelope (E), spike (S), and nucleocapsid (N) proteins.
The nucleocapsid (N) protein is a protein that packages the positive-sense RNA genome of coronaviruses to form ribonucleoprotein structures enclosed within the viral capsid. The N protein is the most highly expressed of the four major coronavirus structural proteins. In addition to its interactions with RNA, N forms protein-protein interactions with the coronavirus membrane protein (M) during the process of viral assembly. N also has additional functions in manipulating the cell cycle of the host cell. The N protein is highly immunogenic and antibodies to N are found in patients recovered from SARS and COVID-19.
Spike (S) glycoprotein is the largest of the four major structural proteins found in coronaviruses. The spike protein assembles into trimers that form large structures, called spikes or peplomers, that project from the surface of the virion. The distinctive appearance of these spikes when visualized using negative stain transmission electron microscopy, "recalling the solar corona", gives the virus family its main name.
ORF3a is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It encodes an accessory protein about 275 amino acid residues long, which is thought to function as a viroporin. It is the largest accessory protein and was the first of the SARS-CoV accessory proteins to be described.
ORF7b is a gene found in coronaviruses of the genus Betacoronavirus, which expresses the accessory protein Betacoronavirus NS7b protein. It is a short, highly hydrophobic transmembrane protein of unknown function.
ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus Sarbecovirus. It is one of the least well conserved and most variable parts of the genome. In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b. This feature is present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses. For this reason the full-length gene and its protein are sometimes called ORF8ab. The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.
ORF6 is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is not present in MERS-CoV. It is thought to reduce the immune system response to viral infection through interferon antagonism.
ORF9b is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is an overlapping gene whose open reading frame is entirely contained within the N gene, which encodes coronavirus nucleocapsid protein. The encoded protein is 97 amino acid residues long in SARS-CoV and 98 in SARS-CoV-2, in both cases forming a protein dimer.
ORF1ab refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, that are conserved in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The genes express large polyproteins that undergo proteolysis to form several nonstructural proteins with various functions in the viral life cycle, including proteases and the components of the replicase-transcriptase complex (RTC). Together the two ORFs are sometimes referred to as the replicase gene. They are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.
The nidoviral papain-like protease is a papain-like protease protein domain encoded in the genomes of nidoviruses. It is expressed as part of a large polyprotein from the ORF1a gene and has cysteine protease enzymatic activity responsible for proteolytic cleavage of some of the N-terminal viral nonstructural proteins within the polyprotein. A second protease also encoded by ORF1a, called the 3C-like protease or main protease, is responsible for the majority of further cleavages. Coronaviruses have one or two papain-like protease domains; in SARS-CoV and SARS-CoV-2, one PLPro domain is located in coronavirus nonstructural protein 3 (nsp3). Arteriviruses have two to three PLP domains. In addition to their protease activity, PLP domains function as deubiquitinating enzymes (DUBs) that can cleave the isopeptide bond found in ubiquitin chains. They are also "deISGylating" enzymes that remove the ubiquitin-like domain interferon-stimulated gene 15 (ISG15) from cellular proteins. These activities are likely responsible for antagonizing the activity of the host innate immune system. Because they are essential for viral replication, papain-like protease domains are considered drug targets for the development of antiviral drugs against human pathogens such as MERS-CoV, SARS-CoV, and SARS-CoV-2.