ORF10

Orf10 protein, SARS-CoV-2
Identifiers
Symbol: Orf10_SARS-CoV-2
InterPro: IPR044342

ORF10 is an open reading frame (ORF) found in the genome of the SARS-CoV-2 coronavirus. It is 38 codons long. [1] It is not conserved in all Sarbecoviruses (including SARS-CoV). In studies prompted by the COVID-19 pandemic, ORF10 attracted research interest as one of two viral accessory protein genes not conserved between SARS-CoV and SARS-CoV-2 [2] and was initially described as a protein-coding gene likely under positive selection. [3] However, although it is sometimes included in lists of SARS-CoV-2 accessory genes, experimental and bioinformatics evidence suggests ORF10 is likely not a functional protein-coding gene. [4]


Properties

ORF10 is located downstream of the N gene, which encodes coronavirus nucleocapsid protein. It is the annotated open reading frame furthest to the 3' end of the genome. It encodes a 38-amino acid hypothetical protein. [1]
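The relationship between an open reading frame and the length of its hypothetical product can be illustrated with a minimal Python sketch. The function name `count_sense_codons` and the toy sequence are assumptions made for illustration; the sequence is synthetic and is not the real ORF10 sequence. The sketch counts codons from a start codon up to, but not including, the first in-frame stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_sense_codons(seq: str, start: int = 0) -> int:
    """Count codons from `start` up to (not including) the first in-frame stop.

    Hypothetical helper for illustration only. Raises ValueError if the frame
    does not begin with ATG or if no in-frame stop codon is found.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    if seq[start:start + 3] != "ATG":
        raise ValueError("reading frame does not begin with ATG")
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return (i - start) // 3
    raise ValueError("no in-frame stop codon found")


# Synthetic toy sequence (NOT the real ORF10): ATG, 37 GCT codons, then TAA.
toy_orf = "ATG" + "GCT" * 37 + "TAA"
n_codons = count_sense_codons(toy_orf)
print(n_codons)  # number of amino acids in the hypothetical product
```

In this toy case the frame holds 38 sense codons, so a translated product would be 38 amino acids long, matching the length described for ORF10.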

Expression and function

It is unlikely that ORF10 is translated under natural conditions, since subgenomic RNA containing the ORF10 region is not detected, though there is some ribosome footprinting signal. [5] When experimentally overexpressed, the ORF10 protein has been reported to interact with ZYG11B and its cullin-RING ligase protein complex. [6] However, this interaction has been shown to be dispensable in in vitro studies of the viral life cycle. [7]

Evolution

Some studies of SARS-CoV-2 genomes have described ORF10 as likely to be functional and under positive selection. [3] However, premature stop codons have been identified in SARS-CoV-2 variants [8] and in many Sarbecovirus sequences, suggesting that the putative protein product is not essential for viral replication. [4] Loss of ORF10 has also shown no effect on replication under experimental conditions in vitro. [8] It has been suggested through bioinformatics analysis that apparent sequence conservation in SARS-CoV-2 ORF10 may not be due to a protein-coding function, but instead due to conserved RNA secondary structure in the region. [4] The conserved region, which extends beyond ORF10 itself, overlaps with the coronavirus 3' UTR pseudoknot region, a secondary structure known to be involved in genome replication. [4]
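As a rough illustration of what a premature stop codon means in practice, the following Python sketch compares a variant reading frame against a reference frame. It is not the method used in the cited studies. The helper names and both sequences are hypothetical, and the sketch assumes the two sequences are already aligned and in frame, with no insertions or deletions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def sense_codons_before_stop(seq):
    """Return the number of codons before the first in-frame stop, or None."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA notation
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def has_premature_stop(variant, reference):
    """True if the variant frame terminates earlier than the reference frame.

    Assumes both sequences start at the same in-frame position.
    """
    ref_len = sense_codons_before_stop(reference)
    var_len = sense_codons_before_stop(variant)
    if ref_len is None:
        raise ValueError("reference has no in-frame stop codon")
    return var_len is not None and var_len < ref_len


# Synthetic toy sequences, not real viral sequence.
reference = "ATG" + "CAA" * 5 + "TAA"
# A single C-to-T substitution turns the fourth codon (CAA) into a stop (TAA).
variant = reference[:9] + "T" + reference[10:]

print(has_premature_stop(variant, reference))
print(has_premature_stop(reference, reference))
```

A frame flagged this way would yield a truncated product, which is why recurring premature stops in circulating sequences are read as evidence that the full-length product is dispensable.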

Related Research Articles

SARS-related coronavirus: Species of coronavirus causing SARS and COVID-19

Severe acute respiratory syndrome–related coronavirus is a species of virus consisting of many known strains phylogenetically related to severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) that have been shown to possess the capability to infect humans, bats, and certain other mammals. These enveloped, positive-sense single-stranded RNA viruses enter host cells by binding to the angiotensin-converting enzyme 2 (ACE2) receptor. The SARSr-CoV species is a member of the genus Betacoronavirus and of the subgenus Sarbecovirus.

Putative transmembrane domain, more commonly known as Non-structural Protein 6 (NSP6), is one of the two non-structural proteins encoded by gene 11 in rotavirus, alongside NSP5. NSP6 is composed of six transmembrane domains and a C-terminal tail. In contrast to the other rotavirus non-structural proteins, NSP6 was found to have a high rate of turnover, being completely degraded within 2 hours of synthesis. NSP6 was found to be a sequence-independent nucleic acid binding protein, with similar affinities for ssRNA and dsRNA.

ORF7a: Gene found in coronaviruses of the Betacoronavirus genus

ORF7a is a gene found in coronaviruses of the Betacoronavirus genus. It expresses the Betacoronavirus NS7A protein, a type I transmembrane protein with an immunoglobulin-like protein domain. It was first discovered in SARS-CoV, the virus that causes severe acute respiratory syndrome (SARS). The homolog in SARS-CoV-2, the virus that causes COVID-19, has about 85% sequence identity to the SARS-CoV protein.

An overlapping gene is a gene whose expressible nucleotide sequence partially overlaps with the expressible nucleotide sequence of another gene. In this way, a nucleotide sequence may make a contribution to the function of one or more gene products. Overlapping genes are present in and a fundamental feature of both cellular and viral genomes. The current definition of an overlapping gene varies significantly between eukaryotes, prokaryotes, and viruses. In prokaryotes and viruses overlap must be between coding sequences but not mRNA transcripts, and is defined when these coding sequences share a nucleotide on either the same or opposite strands. In eukaryotes, gene overlap is almost always defined as mRNA transcript overlap. Specifically, a gene overlap in eukaryotes is defined when at least one nucleotide is shared between the boundaries of the primary mRNA transcripts of two or more genes, such that a DNA base mutation at any point of the overlapping region would affect the transcripts of all genes involved. This definition includes 5′ and 3′ untranslated regions (UTRs) along with introns.

SARS-CoV-2: Virus that causes COVID-19

Severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) is a strain of coronavirus that causes COVID-19, the respiratory illness responsible for the COVID-19 pandemic. The virus previously had the provisional name 2019 novel coronavirus (2019-nCoV), and has also been called human coronavirus 2019. First identified in the city of Wuhan, Hubei, China, the World Health Organization designated the outbreak a public health emergency of international concern from January 30, 2020, to May 5, 2023. SARS‑CoV‑2 is a positive-sense single-stranded RNA virus that is contagious in humans.

ORF3b is a gene found in coronaviruses of the subgenus Sarbecovirus, encoding a short non-structural protein. It is present in both SARS-CoV and SARS-CoV-2, though the protein product has very different lengths in the two viruses. The encoded protein is significantly shorter in SARS-CoV-2, at only 22 amino acid residues compared to 153–155 in SARS-CoV. Both the longer SARS-CoV and shorter SARS-CoV-2 proteins have been reported as interferon antagonists. It is unclear whether the SARS-CoV-2 gene expresses a functional protein.

ORF3d is a gene found in SARS-CoV-2 and at least one closely related coronavirus found in pangolins, though it is not found in other closely related viruses within the Sarbecovirus subgenus. It is 57 codons long and encodes a novel 57 amino acid residue protein of unknown function. At least two isoforms have been described, of which the shorter 33-residue form, ORF3d-2, may be more highly expressed, or even the only form expressed. It is reported to be antigenic and antibodies to the ORF3d protein occur in patients recovered from COVID-19. There is no homolog in the genome of the otherwise closely related SARS-CoV.

RmYN02 is a bat-derived strain of Severe acute respiratory syndrome–related coronavirus. It was discovered in bat droppings collected between May and October 2019 from sites in Mengla County, Yunnan Province, China. It is the second-closest known relative of SARS-CoV-2, the virus strain that causes COVID-19, sharing 93.3% nucleotide identity at the scale of the complete virus genome. RmYN02 contains an insertion at the S1/S2 cleavage site in the spike protein, similar to SARS-CoV-2, suggesting that such insertion events can occur naturally.

RacCS203 is a bat-derived strain of severe acute respiratory syndrome–related coronavirus collected in acuminate horseshoe bats from sites in Thailand and sequenced by Lin-Fa Wang's team. It has 91.5% sequence similarity to SARS-CoV-2 and is most related to the RmYN02 strain. Its spike protein is closely related to RmYN02's spike, both highly divergent from SARS-CoV-2's spike.

Bat coronavirus RpYN06 is a SARS-like betacoronavirus that infects the horseshoe bat Rhinolophus pusillus. It is a close relative of SARS-CoV-2, with 94.48% sequence identity.

ORF3c is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It was first identified in the SARS-CoV-2 genome and encodes a 41 amino acid non-structural protein of unknown function. It is also present in the SARS-CoV genome, but was not recognized until the identification of the SARS-CoV-2 homolog.

ORF3a: Gene found in coronaviruses of the subgenus Sarbecovirus

ORF3a is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It encodes an accessory protein about 275 amino acid residues long, which is thought to function as a viroporin. It is the largest accessory protein and was the first of the SARS-CoV accessory proteins to be described.

ORF7b is a gene found in coronaviruses of the genus Betacoronavirus, which expresses the accessory protein Betacoronavirus NS7b protein. It is a short, highly hydrophobic transmembrane protein of unknown function.

ORF8: Gene that encodes a viral accessory protein

ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus Sarbecovirus. It is one of the least well conserved and most variable parts of the genome. In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b - a feature present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses. For this reason the full-length gene and its protein are sometimes called ORF8ab. The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.

ORF6 is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is not present in MERS-CoV. It is thought to reduce the immune system response to viral infection through interferon antagonism.

ORF9b: Gene

ORF9b is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is an overlapping gene whose open reading frame is entirely contained within the N gene, which encodes coronavirus nucleocapsid protein. The encoded protein is 97 amino acid residues long in SARS-CoV and 98 in SARS-CoV-2, in both cases forming a protein dimer.

ORF9c is an open reading frame (ORF) in coronavirus genomes of the subgenus Sarbecovirus. It is 73 codons long in the SARS-CoV-2 genome. Although it is often included in lists of Sarbecovirus viral accessory protein genes, experimental and bioinformatics evidence suggests ORF9c may not be a functional protein-coding gene.

ORF1ab refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, that are conserved in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The genes express large polyproteins that undergo proteolysis to form several nonstructural proteins with various functions in the viral life cycle, including proteases and the components of the replicase-transcriptase complex (RTC). Together the two ORFs are sometimes referred to as the replicase gene. They are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.

Nidoviral papain-like protease: Papain-like protease protein domain

The nidoviral papain-like protease is a papain-like protease protein domain encoded in the genomes of nidoviruses. It is expressed as part of a large polyprotein from the ORF1a gene and has cysteine protease enzymatic activity responsible for proteolytic cleavage of some of the N-terminal viral nonstructural proteins within the polyprotein. A second protease also encoded by ORF1a, called the 3C-like protease or main protease, is responsible for the majority of further cleavages. Coronaviruses have one or two papain-like protease domains; in SARS-CoV and SARS-CoV-2, one PLPro domain is located in coronavirus nonstructural protein 3 (nsp3). Arteriviruses have two to three PLP domains. In addition to their protease activity, PLP domains function as deubiquitinating enzymes (DUBs) that can cleave the isopeptide bond found in ubiquitin chains. They are also "deISGylating" enzymes that remove the ubiquitin-like domain interferon-stimulated gene 15 (ISG15) from cellular proteins. These activities are likely responsible for antagonizing the activity of the host innate immune system. Because they are essential for viral replication, papain-like protease domains are considered drug targets for the development of antiviral drugs against human pathogens such as MERS-CoV, SARS-CoV, and SARS-CoV-2.

Nsp12: Protein in the Coronavirus genome

Nsp12 is a non-structural protein in the Coronavirus genome. Its gene is part of the ORF1ab reading frame and it is part of the pp1ab polyprotein; it is cleaved by 3CLpro.

References

  1. Redondo, Natalia; Zaldívar-López, Sara; Garrido, Juan J.; Montoya, Maria (7 July 2021). "SARS-CoV-2 Accessory Proteins in Viral Pathogenesis: Knowns and Unknowns". Frontiers in Immunology. 12: 708264. doi:10.3389/fimmu.2021.708264. PMC 8293742. PMID 34305949.
  2. Xu, Jiabao; Zhao, Shizhe; Teng, Tieshan; Abdalla, Abualgasim Elgaili; Zhu, Wan; Xie, Longxiang; Wang, Yunlong; Guo, Xiangqian (22 February 2020). "Systematic Comparison of Two Animal-to-Human Transmitted Human Coronaviruses: SARS-CoV-2 and SARS-CoV". Viruses. 12 (2): 244. doi:10.3390/v12020244. PMC 7077191. PMID 32098422.
  3. Cagliani, Rachele; Forni, Diego; Clerici, Mario; Sironi, Manuela (September 2020). "Coding potential and sequence conservation of SARS-CoV-2 and related animal viruses". Infection, Genetics and Evolution. 83: 104353. doi:10.1016/j.meegid.2020.104353. PMC 7199688. PMID 32387562.
  4. Jungreis, Irwin; Sealfon, Rachel; Kellis, Manolis (December 2021). "SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes". Nature Communications. 12 (1): 2642. Bibcode:2021NatCo..12.2642J. doi:10.1038/s41467-021-22905-7. PMC 8113528. PMID 33976134.
  5. Finkel, Yaara; Mizrahi, Orel; Nachshon, Aharon; Weingarten-Gabbay, Shira; Morgenstern, David; Yahalom-Ronen, Yfat; Tamir, Hadas; Achdout, Hagit; Stein, Dana; Israeli, Ofir; Beth-Din, Adi; Melamed, Sharon; Weiss, Shay; Israely, Tomer; Paran, Nir; Schwartz, Michal; Stern-Ginossar, Noam (7 January 2021). "The coding capacity of SARS-CoV-2". Nature. 589 (7840): 125–130. Bibcode:2021Natur.589..125F. doi:10.1038/s41586-020-2739-1. PMID 32906143. S2CID 221624633.
  6. Gordon, David E.; et al. (16 July 2020). "A SARS-CoV-2 protein interaction map reveals targets for drug repurposing". Nature. 583 (7816): 459–468. Bibcode:2020Natur.583..459G. doi:10.1038/s41586-020-2286-9. PMC 7431030. PMID 32353859.
  7. Mena, Elijah L.; Donahue, Callie J.; Vaites, Laura Pontano; Li, Jie; Rona, Gergely; O’Leary, Colin; Lignitto, Luca; Miwatani-Minter, Bearach; Paulo, Joao A.; Dhabaria, Avantika; Ueberheide, Beatrix; Gygi, Steven P.; Pagano, Michele; Harper, J. Wade; Davey, Robert A.; Elledge, Stephen J. (27 April 2021). "ORF10–Cullin-2–ZYG11B complex is not required for SARS-CoV-2 infection". Proceedings of the National Academy of Sciences. 118 (17): e2023157118. Bibcode:2021PNAS..11823157M. doi:10.1073/pnas.2023157118. PMC 8092598. PMID 33827988.
  8. Pancer, Katarzyna; Milewska, Aleksandra; Owczarek, Katarzyna; Dabrowska, Agnieszka; Kowalski, Michał; Łabaj, Paweł P.; Branicki, Wojciech; Sanak, Marek; Pyrc, Krzysztof (10 December 2020). "The SARS-CoV-2 ORF10 is not essential in vitro or in vivo in humans". PLOS Pathogens. 16 (12): e1008959. doi:10.1371/journal.ppat.1008959. PMC 7755277. PMID 33301543.