ORF10

Orf10 protein, SARS-CoV-2
Identifiers
Symbol: Orf10_SARS-CoV-2
InterPro: IPR044342

ORF10 is an open reading frame (ORF) found in the genome of the SARS-CoV-2 coronavirus. It is 38 codons long. [1] It is not conserved across all Sarbecoviruses (including SARS-CoV). In studies prompted by the COVID-19 pandemic, ORF10 attracted research interest as one of two viral accessory protein genes not conserved between SARS-CoV and SARS-CoV-2, [2] and it was initially described as a protein-coding gene likely under positive selection. [3] However, although it is sometimes included in lists of SARS-CoV-2 accessory genes, experimental and bioinformatics evidence suggests that ORF10 is likely not a functional protein-coding gene. [4]

Properties

ORF10 is located downstream of the N gene, which encodes coronavirus nucleocapsid protein. It is the annotated open reading frame furthest to the 3' end of the genome. It encodes a 38-amino acid hypothetical protein. [1]
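The relationship between an open reading frame and its codon count can be illustrated with a minimal sketch. The function below scans the forward strand of a nucleotide string for ATG-to-stop reading frames and counts the sense codons in each. The function name `find_orfs` and the toy sequences are hypothetical and are not taken from the SARS-CoV-2 genome. The sketch also assumes that the quoted length of 38 codons excludes the stop codon, which the text above does not state explicitly.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=1):
    """Return sorted (start, end, n_codons) tuples for forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon. `end` is exclusive and includes the stop codon;
    `n_codons` counts sense codons only (stop codon excluded).
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3    # sense codons before the stop
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, n_codons))
                start = None                   # look for the next ORF
    return sorted(orfs)

# Toy example (hypothetical sequence): ATG GCT TAC followed by a TGA stop.
toy = "CCATGGCTTACTGAAA"
toy_orfs = find_orfs(toy)

# Synthetic 38-codon ORF: a start codon, 37 further sense codons, one stop.
synthetic = "ATG" + "GCT" * 37 + "TAA"
synthetic_orfs = find_orfs(synthetic)
```

Under these assumptions, an ORF of 38 sense codons spans 38 × 3 + 3 = 117 nucleotides once the stop codon is included, and it would encode a 38-residue product if translated.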

Expression and function

It is unlikely that ORF10 is translated under natural conditions, since subgenomic RNA containing the ORF10 region is not detected, though there is some ribosome footprinting signal. [5] When experimentally overexpressed, the ORF10 protein has been reported to interact with ZYG11B and its cullin-RING ligase protein complex. [6] However, in vitro studies of the viral life cycle have shown this interaction to be dispensable. [7]

Evolution

Some studies of SARS-CoV-2 genomes have described ORF10 as likely to be functional and under positive selection. [3] However, premature stop codons have been identified in SARS-CoV-2 variants [8] and in many Sarbecovirus sequences, suggesting that the putative protein product is not essential for viral replication. [4] Loss of ORF10 has also shown no effect on replication under experimental conditions in vitro. [8] Bioinformatics analysis suggests that the apparent sequence conservation in SARS-CoV-2 ORF10 may be due not to a protein-coding function but to conserved RNA secondary structure in the region. [4] The conserved region, which extends beyond ORF10 itself, overlaps with the coronavirus 3' UTR pseudoknot region, a secondary structure known to be involved in genome replication. [4]
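The effect of a premature stop codon on a putative protein product can be shown with a short sketch. The code below builds the standard genetic code table and translates a toy reading frame before and after a single-nucleotide change that converts a glutamine codon (CAA) into a stop codon (TAA). The sequences and the `translate` helper are hypothetical illustrations and do not reproduce any mutation reported for ORF10.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(seq):
    """Translate from position 0 until the first in-frame stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":      # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)

# Toy reading frame (hypothetical): ATG GCT TAC CAA TGG TAA
intact = "ATGGCTTACCAATGGTAA"

# A single C-to-T substitution at index 9 turns CAA into the stop codon TAA.
mutant = intact[:9] + "T" + intact[10:]

intact_product = translate(intact)   # five residues
mutant_product = translate(mutant)   # truncated after three residues
```

In this toy case the substitution shortens the translated product from five residues to three, which is the kind of truncation a premature stop codon would impose on any reading frame that is actually translated.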

References

  1. Redondo N, Zaldívar-López S, Garrido JJ, Montoya M (7 July 2021). "SARS-CoV-2 Accessory Proteins in Viral Pathogenesis: Knowns and Unknowns". Frontiers in Immunology. 12: 708264. doi:10.3389/fimmu.2021.708264. PMC 8293742. PMID 34305949.
  2. Xu J, Zhao S, Teng T, Abdalla AE, Zhu W, Xie L, et al. (February 2020). "Systematic Comparison of Two Animal-to-Human Transmitted Human Coronaviruses: SARS-CoV-2 and SARS-CoV". Viruses. 12 (2): 244. doi:10.3390/v12020244. PMC 7077191. PMID 32098422.
  3. Cagliani R, Forni D, Clerici M, Sironi M (September 2020). "Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses". Infection, Genetics and Evolution. 83: 104353. Bibcode:2020InfGE..8304353C. doi:10.1016/j.meegid.2020.104353. PMC 7199688. PMID 32387562.
  4. Jungreis I, Sealfon R, Kellis M (May 2021). "SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes". Nature Communications. 12 (1): 2642. Bibcode:2021NatCo..12.2642J. doi:10.1038/s41467-021-22905-7. PMC 8113528. PMID 33976134.
  5. Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, et al. (January 2021). "The coding capacity of SARS-CoV-2". Nature. 589 (7840): 125–130. Bibcode:2021Natur.589..125F. doi:10.1038/s41586-020-2739-1. PMID 32906143. S2CID 221624633.
  6. Gordon DE, Jang GM, Bouhaddou M, Xu J, Obernier K, White KM, et al. (July 2020). "A SARS-CoV-2 protein interaction map reveals targets for drug repurposing". Nature. 583 (7816): 459–468. Bibcode:2020Natur.583..459G. doi:10.1038/s41586-020-2286-9. PMC 7431030. PMID 32353859.
  7. Mena EL, Donahue CJ, Vaites LP, Li J, Rona G, O'Leary C, et al. (April 2021). "ORF10-Cullin-2-ZYG11B complex is not required for SARS-CoV-2 infection". Proceedings of the National Academy of Sciences of the United States of America. 118 (17): e2023157118. Bibcode:2021PNAS..11823157M. doi:10.1073/pnas.2023157118. PMC 8092598. PMID 33827988.
  8. Pancer K, Milewska A, Owczarek K, Dabrowska A, Kowalski M, Łabaj PP, et al. (December 2020). "The SARS-CoV-2 ORF10 is not essential in vitro or in vivo in humans". PLOS Pathogens. 16 (12): e1008959. doi:10.1371/journal.ppat.1008959. PMC 7755277. PMID 33301543.