ORF3a

Betacoronavirus viroporin
Cryo-electron microscopy structure of the SARS-CoV-2 ORF3a protein dimer. From PDB: 6XDC. [1]
Identifiers
Symbol bCoV_viroporin
Pfam PF11289
InterPro IPR024407

ORF3a (previously known as X1 or U274) [2] is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV [3] [2] and SARS-CoV-2. [1] [4] It encodes an accessory protein about 275 amino acid residues long, which is thought to function as a viroporin. [1] It is the largest accessory protein [2] [4] and was the first of the SARS-CoV accessory proteins to be described. [3]

Comparative genomics

ORF3a is well conserved within the subgenus Sarbecovirus. [3] [2] The protein has 73% sequence identity between SARS-CoV (274 residues) and SARS-CoV-2 (275 residues). [1] Several genes overlap the ORF3a open reading frame in the genome: ORF3b, ORF3c, and (in SARS-CoV-2 only) ORF3d. In SARS-CoV-2, the overlap between ORF3a, ORF3c, and ORF3d potentially represents a rare example of all three possible reading frames of the same sequence region encoding functional proteins. [5] [6]
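The idea of three reading frames sharing one sequence region can be illustrated with a minimal Python sketch. The sequence `TOY_REGION` is a made-up 12-base example, not actual ORF3a sequence, and the helper name `translate_frame` is hypothetical. The sketch assumes the standard genetic code and writes the RNA genome in DNA letters (T instead of U), as sequence databases commonly do.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(seq, frame):
    """Translate seq from offset `frame` (0, 1, or 2); '*' marks a stop codon.

    Trailing bases that do not fill a complete codon are ignored.
    """
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Hypothetical toy region, chosen only to show that one stretch of
# nucleotides reads as three different peptides depending on the frame.
TOY_REGION = "ATGGCATGCTAA"

for frame in range(3):
    print(f"frame {frame}: {translate_frame(TOY_REGION, frame)}")
```

Each offset groups the same bases into different codons, so a single region can in principle encode three unrelated peptides. That is the situation proposed for ORF3a, ORF3c, and ORF3d.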

Although ORF3a is present in Sarbecovirus, it is absent in another Betacoronavirus subgenus, Embecovirus, which includes the human coronaviruses HKU1 and OC43. It may be distantly related to ORF5 in Merbecovirus, which includes MERS-CoV. Distant homologs of ORF3a have been identified in Alphacoronavirus, which includes the human coronaviruses 229E and NL63, but not in Gammacoronavirus or Deltacoronavirus. [1]

Structure

The ORF3a protein is a transmembrane protein that contains three transmembrane domains. It has an N-terminal ectodomain and a C-terminal endodomain; the endodomain is separated from the transmembrane region by a cysteine-rich region. [3] [2] It is thought to function as a dimer or tetramer that is assembled at the plasma membrane. It may also form higher-order oligomers, with unknown functional effects. [3] [2] [1]

Post-translational modifications

In SARS-CoV, post-translational modification of ORF3a by O-glycosylation has been observed. [3] [7] In hCoV-NL63, it is N-glycosylated. [8]

Expression and localization

Genomic information
Genomic organisation of isolate Wuhan-Hu-1, the earliest sequenced sample of SARS-CoV-2, indicating the location of ORF3a
NCBI genome ID 86693
Genome size 29,903 bases
Year of completion 2020

Along with the genes for other accessory proteins, the ORF3a gene is located near those encoding the structural proteins, toward the 3' end of the coronavirus RNA genome. ORF3a is located between the spike (S) and envelope (E) genes. [3] It is expressed from the second-largest subgenomic RNA. [2] In SARS-CoV, the ORF3a protein has a diverse subcellular localization: it can be found in the cytoplasm, at the plasma membrane, and in the Golgi apparatus. [3] [2] Its sequence contains protein trafficking signals that target it to the plasma membrane. [3] In hCoV-NL63, it is targeted to the endoplasmic-reticulum–Golgi intermediate compartment (ERGIC). [8]
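The gene order and the 275-residue protein length can be cross-checked with a little arithmetic. The sketch below is illustrative only: the coordinates are not given in this article and are taken, as an assumption, from the GenBank annotation of the Wuhan-Hu-1 reference record (NC_045512.2). The names `GENES` and `protein_length` are hypothetical.

```python
# 1-based, inclusive coding-sequence coordinates. These values are an
# assumption drawn from the NC_045512.2 annotation, not from this article.
GENES = {
    "S": (21563, 25384),
    "ORF3a": (25393, 26220),
    "E": (26245, 26472),
}
GENOME_SIZE = 29903  # bases, as listed in the genomic information box


def protein_length(start, end):
    """Residues encoded by a coding sequence that includes its stop codon."""
    nucleotides = end - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return nucleotides // 3 - 1  # the stop codon encodes no residue


orf3a_start, orf3a_end = GENES["ORF3a"]
print("ORF3a residues:", protein_length(orf3a_start, orf3a_end))
print("ORF3a lies between S and E:",
      GENES["S"][1] < orf3a_start and orf3a_end < GENES["E"][0])
print("ORF3a start as fraction of genome:",
      round(orf3a_start / GENOME_SIZE, 2))
```

Under these assumed coordinates, the ORF3a coding sequence spans 828 nucleotides, or 276 codons including the stop codon, which is consistent with the 275-residue protein described above.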

Function

The ORF3a protein does not appear to be essential for viral replication. Studies with SARS-CoV give conflicting evidence on whether its deletion reduces replication efficiency. [3] [2]

Viroporin

The ORF3a protein is thought to form a cation-permeable ion channel. [3] [1] [9] It is believed to function as a viroporin. [1] Along with the envelope protein, it is one of two possible viroporins in SARS-CoV-2, and one of three in SARS-CoV, which encodes the additional possible viroporin ORF8a. [1]

Viral protein interactions

The ORF3a protein in SARS-CoV has been shown to form protein-protein interactions with several structural proteins (the spike protein, membrane protein, and nucleocapsid protein) as well as with ORF7a, another accessory protein. [3] Through the cysteine-rich region, it may form disulfide bonds to the spike protein. [3] [2] Incorporation of the ORF3a protein into virions has been observed for SARS-CoV [3] [2] and hCoV-NL63, [8] indicating that it is a viral structural protein.

Host cell effects

A number of effects of ORF3a on the host cell have been described under experimental conditions. ORF3a has been associated with induction of apoptosis in studies of both SARS-CoV and SARS-CoV-2 in cell culture. [3] [2] [4]

Immunogenicity

The ORF3a protein is antigenic and antibodies have been observed in patients recovered from infections with SARS-CoV (which causes the disease SARS) [3] [2] or with SARS-CoV-2 (which causes COVID-19). [1]


References

  1. Kern DM, Sorum B, Mali SS, Hoel CM, Sridharan S, Remis JP, et al. (July 2021). "Cryo-EM structure of SARS-CoV-2 ORF3a in lipid nanodiscs". Nature Structural & Molecular Biology. 28 (7): 573–582. doi:10.1038/s41594-021-00619-0. PMC 8772433. PMID 34158638. S2CID 235609553.
  2. Liu DX, Fung TS, Chong KK, Shukla A, Hilgenfeld R (September 2014). "Accessory proteins of SARS-CoV and other coronaviruses". Antiviral Research. 109: 97–109. doi:10.1016/j.antiviral.2014.06.013. PMC 7113789. PMID 24995382.
  3. McBride R, Fielding BC (November 2012). "The role of severe acute respiratory syndrome (SARS)-coronavirus accessory proteins in virus pathogenesis". Viruses. 4 (11): 2902–2923. doi:10.3390/v4112902. PMC 3509677. PMID 23202509.
  4. Redondo N, Zaldívar-López S, Garrido JJ, Montoya M (7 July 2021). "SARS-CoV-2 Accessory Proteins in Viral Pathogenesis: Knowns and Unknowns". Frontiers in Immunology. 12: 708264. doi:10.3389/fimmu.2021.708264. PMC 8293742. PMID 34305949.
  5. Nelson CW, Ardern Z, Goldberg TL, Meng C, Kuo CH, Ludwig C, et al. (October 2020). "Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic". eLife. 9: e59633. doi:10.7554/eLife.59633. PMC 7655111. PMID 33001029.
  6. Jungreis I, Nelson CW, Ardern Z, Finkel Y, Krogan NJ, Sato K, et al. (June 2021). "Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution". Virology. 558: 145–151. doi:10.1016/j.virol.2021.02.013. PMC 7967279. PMID 33774510.
  7. Oostra M, de Haan CA, de Groot RJ, Rottier PJ (March 2006). "Glycosylation of the severe acute respiratory syndrome coronavirus triple-spanning membrane proteins 3a and M". Journal of Virology. 80 (5): 2326–2336. doi:10.1128/JVI.80.5.2326-2336.2006. PMC 1395384. PMID 16474139.
  8. Müller MA, van der Hoek L, Voss D, Bader O, Lehmann D, Schulz AR, et al. (January 2010). "Human coronavirus NL63 open reading frame 3 encodes a virion-incorporated N-glycosylated membrane protein". Virology Journal. 7 (1): 6. doi:10.1186/1743-422X-7-6. PMC 2819038. PMID 20078868.
  9. Lu W, Zheng BJ, Xu K, Schwarz W, Du L, Wong CK, et al. (August 2006). "Severe acute respiratory syndrome-associated coronavirus 3a protein forms an ion channel and modulates virus release". Proceedings of the National Academy of Sciences of the United States of America. 103 (33): 12540–12545. Bibcode:2006PNAS..10312540L. doi:10.1073/pnas.0605402103. PMC 1567914. PMID 16894145.