ORF9c

Betacoronavirus uncharacterised protein 14
Identifiers
Symbol bCoV_Orf14
Pfam PF17635
InterPro IPR035113

ORF9c (formerly also called ORF14) is an open reading frame (ORF) in coronavirus genomes of the subgenus Sarbecovirus. [1] It is 73 codons long in the SARS-CoV-2 genome. [2] Although it is often included in lists of Sarbecovirus viral accessory protein genes, experimental and bioinformatic evidence suggests ORF9c may not be a functional protein-coding gene. [3]

Nomenclature

The nomenclature used for this gene in the scientific literature has been inconsistent. In some work on SARS-CoV, it has been referred to as ORF14. [4] It has also sometimes been referred to as ORF9b, in which case its longer upstream neighbor (now called ORF9b) was given the name ORF9a. The current recommended nomenclature refers to this gene as ORF9c, and the upstream gene as ORF9b. [2]

Expression and interactions

ORF9c is one of two overlapping genes fully contained within the open reading frame of the N gene encoding coronavirus nucleocapsid protein, the other being ORF9b. It is unclear if ORF9c is functionally expressed during SARS-CoV-2 infections; it is reportedly not translated under experimental conditions. [5] When experimentally overexpressed, the protein interacts with sigma receptors and with the NF-κB pathway. [1] [6] The SARS-CoV protein self-interacts, suggesting formation of a protein dimer or higher-order oligomer. [7]
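The idea of one ORF nested inside another can be illustrated with a minimal Python sketch. It assumes the nested ORF is read in a different frame from the enclosing one, as is typical for overlapping genes of this kind. The sequence `toy` is an invented 18-nucleotide example, not taken from any viral genome, and `translate_from` is a hypothetical helper written for this illustration.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from(seq, start):
    """Translate seq from index start until the first in-frame stop codon.

    Returns (protein, end), where end is the index just past the stop codon.
    """
    protein = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            return "".join(protein), i + 3
        protein.append(aa)
    raise ValueError("no in-frame stop codon found")


# Invented toy sequence (an assumption for illustration only).
toy = "ATGAATGGCTTAAGCTGA"

outer_start, inner_start = 0, 4
outer_protein, outer_end = translate_from(toy, outer_start)
inner_protein, inner_end = translate_from(toy, inner_start)

# The inner ORF is shifted by one nucleotide relative to the outer frame
# and lies entirely within the outer ORF's coordinates.
frame_offset = (inner_start - outer_start) % 3
is_nested = outer_start <= inner_start and inner_end <= outer_end

print(outer_protein, inner_protein, frame_offset, is_nested)
```

The same nucleotides yield two unrelated peptides depending on the frame, which is why a mutation in the overlapping region can affect both products.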

Evolution

ORF9c has about 74% sequence identity between SARS-CoV and SARS-CoV-2. [1]
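A figure such as this is typically derived from a pairwise alignment. The sketch below shows one common way to compute percent identity, assuming the two sequences are already aligned and that gapped columns are excluded from the denominator (other conventions exist and give slightly different numbers). The sequences are invented placeholders, not the actual ORF9c proteins, and `percent_identity` is a hypothetical helper.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # skip gapped columns (one convention among several)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Invented toy alignment (assumption for illustration only).
seq_a = "MLQSCYNFLK"
seq_b = "MLQSSYN-LK"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f}% identity")
```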

SARS-CoV-2 variants have been identified in which ORF9c has acquired a premature stop codon or lost its start codon, and the amino acid sequence is poorly conserved; both observations support the hypothesis that it does not encode a functional protein. [3] [6]
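The two kinds of disruption described here can be checked for mechanically. The following is a minimal sketch, assuming the ORF is supplied as a DNA string that begins at the expected start codon and that the expected number of sense codons before the stop is known. The function name `orf_status` and the short test sequences are hypothetical and are not drawn from real variant data.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_status(seq, expected_sense_codons):
    """Classify an ORF as intact, start_lost, premature_stop, extended or no_stop."""
    seq = seq.upper().replace("U", "T")
    if not seq.startswith("ATG"):
        return "start_lost"
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    for index, i in enumerate(range(0, usable, 3)):
        if seq[i:i + 3] in STOP_CODONS:
            if index < expected_sense_codons:
                return "premature_stop"
            if index == expected_sense_codons:
                return "intact"
            return "extended"
    return "no_stop"


# Invented toy sequences (assumptions for illustration only).
reference = "ATGGCTTGGAAATAA"       # ATG GCT TGG AAA TAA
nonsense_mutant = "ATGGCTTGAAAATAA"  # TGG -> TGA introduces an early stop
start_mutant = "ATAGCTTGGAAATAA"     # ATG -> ATA removes the start codon

for name, seq in [("reference", reference),
                  ("nonsense", nonsense_mutant),
                  ("start", start_mutant)]:
    print(name, orf_status(seq, expected_sense_codons=4))
```

A check like this only flags sequence-level disruption; it says nothing about whether the intact ORF is actually translated.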

Related Research Articles

SARS-related coronavirus: Species of coronavirus causing SARS and COVID-19

Severe acute respiratory syndrome–related coronavirus is a species of virus consisting of many known strains phylogenetically related to severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) that have been shown to possess the capability to infect humans, bats, and certain other mammals. These enveloped, positive-sense single-stranded RNA viruses enter host cells by binding to the angiotensin-converting enzyme 2 (ACE2) receptor. The SARSr-CoV species is a member of the genus Betacoronavirus and of the subgenus Sarbecovirus.

ORF7a: Gene found in coronaviruses of the Betacoronavirus genus

ORF7a is a gene found in coronaviruses of the Betacoronavirus genus. It expresses the Betacoronavirus NS7A protein, a type I transmembrane protein with an immunoglobulin-like protein domain. It was first discovered in SARS-CoV, the virus that causes severe acute respiratory syndrome (SARS). The homolog in SARS-CoV-2, the virus that causes COVID-19, has about 85% sequence identity to the SARS-CoV protein.

An overlapping gene is a gene whose expressible nucleotide sequence partially overlaps with the expressible nucleotide sequence of another gene. In this way, a nucleotide sequence may contribute to the function of more than one gene product. Overlapping genes are present in, and a fundamental feature of, both cellular and viral genomes. The current definition of an overlapping gene varies significantly between eukaryotes, prokaryotes, and viruses. In prokaryotes and viruses, the overlap must be between coding sequences rather than mRNA transcripts, and is defined when these coding sequences share a nucleotide on either the same or opposite strands. In eukaryotes, gene overlap is almost always defined as mRNA transcript overlap. Specifically, a gene overlap in eukaryotes is defined when at least one nucleotide is shared between the boundaries of the primary mRNA transcripts of two or more genes, such that a DNA base mutation at any point of the overlapping region would affect the transcripts of all genes involved. This definition includes 5′ and 3′ untranslated regions (UTRs) along with introns.

Betacoronavirus: Genus of viruses

Betacoronavirus is one of four genera of coronaviruses. Member viruses are enveloped, positive-strand RNA viruses that infect mammals. The natural reservoirs for betacoronaviruses are bats and rodents. Rodents are the reservoir for the subgenus Embecovirus, while bats are the reservoir for the other subgenera.

SARS-CoV-2: Virus that causes COVID-19

Severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) is a strain of coronavirus that causes COVID-19, the respiratory illness responsible for the ongoing COVID-19 pandemic. The virus previously had a provisional name, 2019 novel coronavirus (2019-nCoV), and has also been called the human coronavirus 2019. The virus was first identified in the city of Wuhan, Hubei, China; the World Health Organization declared the outbreak a public health emergency of international concern on January 30, 2020, and a pandemic on March 11, 2020. SARS‑CoV‑2 is a positive-sense single-stranded RNA virus that is contagious in humans.

Transmembrane protein 39B (TMEM39B) is a protein that in humans is encoded by the gene TMEM39B. TMEM39B is a multi-pass membrane protein with eight transmembrane domains. The protein localizes to the plasma membrane and vesicles. The precise function of TMEM39B is not yet well understood, but differential expression is associated with survival of B cell lymphoma, and knockdown of TMEM39B is associated with decreased autophagy in cells infected with the Sindbis virus. Furthermore, the TMEM39B protein has been found to interact with the SARS-CoV-2 ORF9c protein. TMEM39B is expressed at moderate levels in most tissues, with higher expression in the testis, placenta, white blood cells, adrenal gland, thymus, and fetal brain.

Bat coronavirus RaTG13 is a SARS-like betacoronavirus that infects the horseshoe bat Rhinolophus affinis. It was discovered in 2013 in bat droppings from a mining cave near the town of Tongguan in Mojiang county in Yunnan, China. In February 2020, it was identified as the closest known relative of SARS-CoV-2, the virus that causes COVID-19, sharing 96.1% nucleotide similarity. However, in 2022, scientists found three closer matches in bats found 530 km south, in Feuang, Laos, designated as BANAL-52, BANAL-103 and BANAL-236.

ORF3b is a gene found in coronaviruses of the subgenus Sarbecovirus, encoding a short non-structural protein. It is present in both SARS-CoV and SARS-CoV-2, though the protein product has very different lengths in the two viruses. The encoded protein is significantly shorter in SARS-CoV-2, at only 22 amino acid residues compared to 153-155 in SARS-CoV. Both the longer SARS-CoV and shorter SARS-CoV-2 proteins have been reported as interferon antagonists. It is unclear whether the SARS-CoV-2 gene expresses a functional protein.

ORF3d is a gene found in SARS-CoV-2 and at least one closely related coronavirus found in pangolins, though it is not found in other closely related viruses within the Sarbecovirus subgenus. It is 57 codons long and encodes a novel 57 amino acid residue protein of unknown function. At least two isoforms have been described, of which the shorter 33-residue form, ORF3d-2, may be more highly expressed, or even the only form expressed. It is reported to be antigenic and antibodies to the ORF3d protein occur in patients recovered from COVID-19. There is no homolog in the genome of the otherwise closely related SARS-CoV.

RmYN02 is a bat-derived strain of Severe acute respiratory syndrome–related coronavirus. It was discovered in bat droppings collected between May and October 2019 from sites in Mengla County, Yunnan Province, China. It is the second-closest known relative of SARS-CoV-2, the virus strain that causes COVID-19, sharing 93.3% nucleotide identity at the scale of the complete virus genome. RmYN02 contains an insertion at the S1/S2 cleavage site in the spike protein, similar to SARS-CoV-2, suggesting that such insertion events can occur naturally.

Translation regulation by 5′ transcript leader cis-elements

Translation regulation by 5′ transcript leader cis-elements is a process in cellular translation.

ORF3c is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It was first identified in the SARS-CoV-2 genome and encodes a 41 amino acid non-structural protein of unknown function. It is also present in the SARS-CoV genome, but was not recognized until the identification of the SARS-CoV-2 homolog.

ORF3a: Gene found in coronaviruses of the subgenus Sarbecovirus

ORF3a is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It encodes an accessory protein about 275 amino acid residues long, which is thought to function as a viroporin. It is the largest accessory protein and was the first of the SARS-CoV accessory proteins to be described.

ORF7b is a gene found in coronaviruses of the genus Betacoronavirus, which expresses the accessory protein Betacoronavirus NS7b protein. It is a short, highly hydrophobic transmembrane protein of unknown function.

ORF8: Gene that encodes a viral accessory protein

ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus Sarbecovirus. It is one of the least well conserved and most variable parts of the genome. In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b - a feature present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses. For this reason the full-length gene and its protein are sometimes called ORF8ab. The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.

ORF6 is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is not present in MERS-CoV. It is thought to reduce the immune system response to viral infection through interferon antagonism.

ORF9b: Gene

ORF9b is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is an overlapping gene whose open reading frame is entirely contained within the N gene, which encodes coronavirus nucleocapsid protein. The encoded protein is 97 amino acid residues long in SARS-CoV and 98 in SARS-CoV-2, in both cases forming a protein dimer.

ORF10 is an open reading frame (ORF) found in the genome of the SARS-CoV-2 coronavirus. It is 38 codons long. It is not conserved in all Sarbecoviruses. In studies prompted by the COVID-19 pandemic, ORF10 attracted research interest as one of two viral accessory protein genes not conserved between SARS-CoV and SARS-CoV-2 and was initially described as a protein-coding gene likely under positive selection. However, although it is sometimes included in lists of SARS-CoV-2 accessory genes, experimental and bioinformatics evidence suggests ORF10 is likely not a functional protein-coding gene.

ORF1ab refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, that are conserved in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The genes express large polyproteins that undergo proteolysis to form several nonstructural proteins with various functions in the viral life cycle, including proteases and the components of the replicase-transcriptase complex (RTC). Together the two ORFs are sometimes referred to as the replicase gene. They are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.

LYRa11 is a SARS-like coronavirus (SL-CoV) that was identified in 2011 in samples from intermediate horseshoe bats in Baoshan, Yunnan, China. The genome of this virus strain is 29,805 nt long and is 91% similar to the whole genome sequence of the SARS-CoV that caused the SARS outbreak. It was published in 2014. Like SARS-CoV and SARS-CoV-2, LYRa11 virus uses ACE2 as a receptor for infecting cells.

References

  1. Redondo, Natalia; Zaldívar-López, Sara; Garrido, Juan J.; Montoya, Maria (7 July 2021). "SARS-CoV-2 Accessory Proteins in Viral Pathogenesis: Knowns and Unknowns". Frontiers in Immunology. 12: 708264. doi:10.3389/fimmu.2021.708264. PMC 8293742. PMID 34305949.
  2. Jungreis, Irwin; Nelson, Chase W.; Ardern, Zachary; Finkel, Yaara; Krogan, Nevan J.; Sato, Kei; Ziebuhr, John; Stern-Ginossar, Noam; Pavesi, Angelo; Firth, Andrew E.; Gorbalenya, Alexander E.; Kellis, Manolis (June 2021). "Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution". Virology. 558: 145–151. doi:10.1016/j.virol.2021.02.013. PMC 7967279. PMID 33774510.
  3. Jungreis, Irwin; Sealfon, Rachel; Kellis, Manolis (December 2021). "SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes". Nature Communications. 12 (1): 2642. Bibcode:2021NatCo..12.2642J. doi:10.1038/s41467-021-22905-7. PMC 8113528. PMID 33976134.
  4. Marra, Marco A.; Jones, Steven J. M.; Astell, Caroline R.; Holt, Robert A.; Brooks-Wilson, Angela; Butterfield, Yaron S. N.; Khattra, Jaswinder; Asano, Jennifer K.; Barber, Sarah A.; Chan, Susanna Y.; Cloutier, Alison; Coughlin, Shaun M.; Freeman, Doug; Girn, Noreen; Griffith, Obi L.; Leach, Stephen R.; Mayo, Michael; McDonald, Helen; Montgomery, Stephen B.; Pandoh, Pawan K.; Petrescu, Anca S.; Robertson, A. Gordon; Schein, Jacqueline E.; Siddiqui, Asim; Smailus, Duane E.; Stott, Jeff M.; Yang, George S.; Plummer, Francis; Andonov, Anton; Artsob, Harvey; Bastien, Nathalie; Bernard, Kathy; Booth, Timothy F.; Bowness, Donnie; Czub, Martin; Drebot, Michael; Fernando, Lisa; Flick, Ramon; Garbutt, Michael; Gray, Michael; Grolla, Allen; Jones, Steven; Feldmann, Heinz; Meyers, Adrienne; Kabani, Amin; Li, Yan; Normand, Susan; Stroher, Ute; Tipples, Graham A.; Tyler, Shaun; Vogrig, Robert; Ward, Diane; Watson, Brynn; Brunham, Robert C.; Krajden, Mel; Petric, Martin; Skowronski, Danuta M.; Upton, Chris; Roper, Rachel L. (30 May 2003). "The Genome Sequence of the SARS-Associated Coronavirus". Science. 300 (5624): 1399–1404. Bibcode:2003Sci...300.1399M. doi:10.1126/science.1085953. PMID 12730501. S2CID 5491256.
  5. Finkel, Yaara; Mizrahi, Orel; Nachshon, Aharon; Weingarten-Gabbay, Shira; Morgenstern, David; Yahalom-Ronen, Yfat; Tamir, Hadas; Achdout, Hagit; Stein, Dana; Israeli, Ofir; Beth-Din, Adi; Melamed, Sharon; Weiss, Shay; Israely, Tomer; Paran, Nir; Schwartz, Michal; Stern-Ginossar, Noam (7 January 2021). "The coding capacity of SARS-CoV-2". Nature. 589 (7840): 125–130. Bibcode:2021Natur.589..125F. doi:10.1038/s41586-020-2739-1. PMID 32906143. S2CID 221624633.
  6. Gordon, David E.; et al. (16 July 2020). "A SARS-CoV-2 protein interaction map reveals targets for drug repurposing". Nature. 583 (7816): 459–468. Bibcode:2020Natur.583..459G. doi:10.1038/s41586-020-2286-9. PMC 7431030. PMID 32353859.
  7. von Brunn, Albrecht; Teepe, Carola; Simpson, Jeremy C.; Pepperkok, Rainer; Friedel, Caroline C.; Zimmer, Ralf; Roberts, Rhonda; Baric, Ralph; Haas, Jürgen (23 May 2007). "Analysis of Intraviral Protein-Protein Interactions of the SARS Coronavirus ORFeome". PLOS ONE. 2 (5): e459. Bibcode:2007PLoSO...2..459V. doi:10.1371/journal.pone.0000459. PMC 1868897. PMID 17520018.