ORF3d

ORF3d
Identifiers
Organism SARS-CoV-2
Symbol ORF3d
UniProt P0DTG0

ORF3d is a gene found in SARS-CoV-2 (the virus that causes COVID-19) and in at least one closely related coronavirus found in pangolins, though it is absent from other closely related viruses within the Sarbecovirus subgenus. It is 57 codons long and encodes a novel 57-residue protein of unknown function. [1] At least two isoforms have been described, of which the shorter 33-residue form, ORF3d-2, may be more highly expressed, or may even be the only form expressed. [1] [2] The ORF3d protein is reported to be antigenic, and antibodies to it occur in patients recovered from COVID-19. [3] There is no homolog in the genome of the otherwise closely related SARS-CoV (which causes the disease SARS). [1] [4]

Nomenclature

There has been significant confusion in the scientific literature around the nomenclature used for the accessory proteins of SARS-CoV-2, especially for several genes that overlap ORF3a. [4] Many scientific papers have referred to ORF3d and its protein product as ORF3b, owing to differences in the length of ORF3b between SARS-CoV (about 155 codons) and SARS-CoV-2 (only 22 codons). [4] Exacerbating the confusion, both the 57-codon protein product [5] and the 22-codon protein product [6] have been described as interferon antagonists with similar effects. [4]

The recommended nomenclature for SARS-CoV-2 uses the term ORF3b for the 22-codon gene homologous to the 5' end of ORF3b in SARS-CoV, and uses the term ORF3d for the 57-codon gene. [4]
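As a minimal sketch of how this recommendation might be applied when reconciling labels from different papers, the lookup below maps an ORF's length in codons to the recommended SARS-CoV-2 name. The function and dictionary names are hypothetical, and keying on codon length alone is a simplifying assumption that only covers the two lengths given in the text.

```python
# Recommended SARS-CoV-2 names keyed by ORF length in codons, as stated in the text.
# Keying on length alone is an illustrative simplification, not a published procedure.
RECOMMENDED_BY_CODONS = {22: "ORF3b", 57: "ORF3d"}


def recommended_name(reported_name, codons):
    """Return the recommended name for a SARS-CoV-2 ORF of the given length.

    Falls back to the name as reported when the length is not one of the
    two cases covered by the recommendation described above.
    """
    return RECOMMENDED_BY_CODONS.get(codons, reported_name)


# A paper that calls the 57-codon gene "ORF3b" would be relabelled as ORF3d.
print(recommended_name("ORF3b", 57))
```

Lengths that the table does not cover, such as the roughly 155-codon ORF3b of SARS-CoV, are returned unchanged.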

Comparative genomics

ORF3d is an overlapping gene whose open reading frame overlaps both ORF3a and ORF3c in the SARS-CoV-2 genome. This potentially represents a rare example of all three possible reading frames of the same sequence region encoding functional proteins. [1] [4] ORF3d is not present in SARS-CoV or other related coronaviruses, except for a coronavirus found in pangolins. [1] SARS-CoV-2 genome sequences have been extensively sampled throughout the COVID-19 pandemic, and examples of SARS-CoV-2 variants with truncations in ORF3d due to the introduction of a stop codon have been identified with relatively high prevalence. [1] [7]
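The two ideas in this paragraph, overlapping reading frames and truncation by a newly introduced stop codon, can be illustrated with a short sketch. The sequences below are invented toy strings, not SARS-CoV-2 sequence, and the function names are hypothetical. The code uses the standard genetic code and the DNA alphabet (T rather than U), as is conventional for reported coronavirus genome sequences.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(seq, start=0):
    """Translate from `start` until the first stop codon or the end of `seq`."""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_orfs(seq):
    """Return (frame, start, end, protein) for each ATG...stop ORF
    in the three forward reading frames; `end` is exclusive."""
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and CODON_TABLE[codon] == "*":
                orfs.append((frame, start, pos + 3, translate(seq, start)))
                start = None
    return orfs


# Toy sequence (hypothetical): two short ORFs in different frames share positions 5-8.
TOY_OVERLAP = "ATGGCATGACCTGA"
for orf in find_orfs(TOY_OVERLAP):
    print(orf)

# Toy ORF (hypothetical): a single C-to-T change turns CAA (Gln) into the stop codon TAA.
WILD_TYPE = "ATGCAAGGATAA"
MUTANT = "ATGTAAGGATAA"
print(translate(WILD_TYPE), translate(MUTANT))
```

In the toy overlap, the same stretch of nucleotides is read as part of two different proteins depending on the frame, which is the situation described for ORF3a, ORF3c and ORF3d. The second example shows how a single substitution that creates a stop codon shortens the translated product.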

Bioinformatics analysis of the ORF3d region suggests that the sequence of the predicted protein product is not well conserved and raises the possibility that the gene does not encode a functional protein, despite experimental evidence of protein expression. [8]

Expression

The ORF3d protein has two isoforms, one 57 amino acid residues long and one 33 residues long, the latter of which is known as ORF3d-2. [1] There is experimental evidence from studies such as ribosome profiling for expression of at least ORF3d-2, without clear evidence for the full-length ORF3d. [1] [2]

Function

The function of the ORF3d protein is not known, and it is possible that the gene does not code for a protein with any functional role in the viral life cycle. [8] When expressed under experimental conditions in cell culture, the ORF3d protein appears to be an interferon antagonist. [5]

Robust antibody responses to peptides from ORF3d have been reported in patients recovered from COVID-19. [3]

References

  1. Nelson CW, Ardern Z, Goldberg TL, Meng C, Kuo CH, Ludwig C, et al. (October 2020). "Dynamically evolving novel overlapping gene as a factor in the SARS-CoV-2 pandemic". eLife. 9: e59633. doi:10.7554/eLife.59633. PMC 7655111. PMID 33001029.
  2. Finkel Y, Mizrahi O, Nachshon A, Weingarten-Gabbay S, Morgenstern D, Yahalom-Ronen Y, et al. (January 2021). "The coding capacity of SARS-CoV-2". Nature. 589 (7840): 125–130. Bibcode:2021Natur.589..125F. doi:10.1038/s41586-020-2739-1. PMID 32906143. S2CID 218582461.
  3. Hachim A, Kavian N, Cohen CA, Chin AW, Chu DK, Mok CK, et al. (October 2020). "ORF8 and ORF3b antibodies are accurate serological markers of early and late SARS-CoV-2 infection". Nature Immunology. 21 (10): 1293–1301. doi:10.1038/s41590-020-0773-7. PMID 32807944. S2CID 221136730.
  4. Jungreis I, Nelson CW, Ardern Z, Finkel Y, Krogan NJ, Sato K, et al. (June 2021). "Conflicting and ambiguous names of overlapping ORFs in the SARS-CoV-2 genome: A homology-based resolution". Virology. 558: 145–151. doi:10.1016/j.virol.2021.02.013. hdl:1721.1/130363. PMC 7967279. PMID 33774510.
  5. Lu R, Zhao X, Li J, Niu P, Yang B, Wu H, et al. (February 2020). "Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding". Lancet. 395 (10224): 565–574. doi:10.1016/S0140-6736(20)30251-8. PMC 7159086. PMID 32007145.
  6. Konno Y, Kimura I, Uriu K, Fukushi M, Irie T, Koyanagi Y, et al. (September 2020). "SARS-CoV-2 ORF3b Is a Potent Interferon Antagonist Whose Activity Is Increased by a Naturally Occurring Elongation Variant". Cell Reports. 32 (12): 108185. doi:10.1016/j.celrep.2020.108185. PMC 7473339. PMID 32941788.
  7. Lam JY, Yuen CK, Ip JD, Wong WM, To KK, Yuen KY, Kok KH (December 2020). "Loss of orf3b in the circulating SARS-CoV-2 strains". Emerging Microbes & Infections. 9 (1): 2685–2696. doi:10.1080/22221751.2020.1852892. PMC 7782295. PMID 33205709.
  8. Jungreis I, Sealfon R, Kellis M (May 2021). "SARS-CoV-2 gene content and COVID-19 mutation impact by comparing 44 Sarbecovirus genomes". Nature Communications. 12 (1): 2642. Bibcode:2021NatCo..12.2642J. doi:10.1038/s41467-021-22905-7. hdl:1721.1/130581. PMC 8113528. PMID 33976134.