Betacoronavirus lipid binding protein | |
---|---|
Identifiers | |
Symbol | bCoV_lipid_BD |
Pfam | PF09399 |
InterPro | IPR018542 |

ORF9b (formerly sometimes called ORF13) is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is an overlapping gene whose open reading frame is entirely contained within the N gene, which encodes the coronavirus nucleocapsid protein. [2] [3] [4] The encoded protein is 97 amino acid residues long in SARS-CoV [2] [3] and 98 in SARS-CoV-2, [4] in both cases forming a protein dimer.
The nomenclature used for this gene in the scientific literature has been inconsistent. In some work on SARS-CoV, it has been referred to as ORF13. It has also sometimes been referred to as ORF9a, in which case a downstream ORF of 76 codons in SARS-CoV, which also overlaps the N gene, was designated ORF9b. The recommended nomenclature refers to the longer ORF as ORF9b and to the downstream, shorter ORF as ORF9c. [5]
The ORF9b protein is 97 amino acid residues long in SARS-CoV [2] [3] and 98 in SARS-CoV-2. [4] It forms a beta sheet-rich homodimer with a hydrophobic cavity in the center that binds lipids. [2] [3] [4] The lipid-binding cavity may serve as an unusual mechanism for anchoring the protein to membranes. [1]
A fragment of the SARS-CoV-2 ORF9b protein has been structurally characterized in a protein complex with Tom70, in which ORF9b forms an alpha helix rather than the beta-sheet structure observed in isolation. [6] This fold-switching behavior is consistent with bioinformatics predictions and may also occur in the SARS-CoV homolog. [7]
ORF9b is one of two overlapping genes fully contained within the open reading frame of the N gene encoding coronavirus nucleocapsid protein, the other being ORF9c. ORF9b is expressed by ribosome leaky scanning from its bicistronic subgenomic RNA. [2] [3] [8] Unlike its neighbor ORF9c, its length is well conserved in sarbecoviruses and there is strong evidence it is a functional protein-coding gene. [9]
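The idea of one open reading frame lying entirely inside another, in a different reading frame, can be illustrated with a minimal Python sketch. The sequence `TOY` and the helper names `find_orfs` and `nested_in` are hypothetical and invented for illustration; they are not taken from any coronavirus genome or published tool. The sketch only scans the forward strand for ATG-to-stop ORFs and does not model leaky scanning itself.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Invented toy sequence (not a real viral sequence): a frame-0 ORF spanning the
# whole string, with a shorter frame-1 ORF nested inside it.
TOY = "ATGGATGCACGCCTAAGCTAA"


def find_orfs(seq):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    `end` is the index just past the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        # Walk through complete codons in this reading frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((start, i + 3, frame))
                start = None
    return orfs


def nested_in(inner, outer):
    """True if `inner` lies within `outer` but is read in a different frame."""
    return (
        inner != outer
        and outer[0] <= inner[0]
        and inner[1] <= outer[1]
        and inner[2] != outer[2]
    )


orfs = find_orfs(TOY)
overlapping = [(a, b) for a in orfs for b in orfs if nested_in(a, b)]
print(orfs)
print(overlapping)
```

In this toy case the second start codon sits a few nucleotides downstream of the first and in a different frame, which is the kind of arrangement in which a scanning ribosome that skips the first start codon could initiate at the second.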
In SARS-CoV, the protein is localized to the endoplasmic reticulum (ER) [3] and to intracellular vesicles. [2] [1] It does not have a nuclear localization sequence but can enter the cell nucleus by passive diffusion; it does, however, have a nuclear export sequence for exit from the nucleus. [2] [3] In SARS-CoV-2, it is reportedly associated with the mitochondrial membrane. [4]
The function of the ORF9b protein is not well characterized. It is not essential for viral replication. [2]
The ORF9b protein has been reported to interact with a number of other viral proteins, including ORF6, non-structural protein 5, non-structural protein 14, and coronavirus envelope protein. [2] It has been detected in mature SARS-CoV virions and thus may be a minor viral structural protein. [2] [3] [8]
The ORF9b protein may be involved in modulating the host's immune system response. The SARS-CoV-2 protein has been reported to suppress interferon response via its interactions with Tom70, a component of the mitochondrial translocase of the outer membrane (TOM) complex. [6] [10]
Coronaviruses are a group of related RNA viruses that cause diseases in mammals and birds. In humans and birds, they cause respiratory tract infections that can range from mild to lethal. Mild illnesses in humans include some cases of the common cold, while more lethal varieties can cause SARS, MERS and COVID-19, which is causing an ongoing pandemic. In cows and pigs they cause diarrhea, while in mice they cause hepatitis and encephalomyelitis.
Severe acute respiratory syndrome–related coronavirus is a species of virus consisting of many known strains phylogenetically related to severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1) that have been shown to possess the capability to infect humans, bats, and certain other mammals. These enveloped, positive-sense single-stranded RNA viruses enter host cells by binding to the angiotensin-converting enzyme 2 (ACE2) receptor. The SARSr-CoV species is a member of the genus Betacoronavirus and of the subgenus Sarbecovirus.
ORF7a is a gene found in coronaviruses of the Betacoronavirus genus. It expresses the Betacoronavirus NS7A protein, a type I transmembrane protein with an immunoglobulin-like protein domain. It was first discovered in SARS-CoV, the virus that causes severe acute respiratory syndrome (SARS). The homolog in SARS-CoV-2, the virus that causes COVID-19, has about 85% sequence identity to the SARS-CoV protein.
Alphacoronaviruses (Alpha-CoV) are members of the first of the four genera of coronaviruses. They are positive-sense, single-stranded RNA viruses that infect mammals, including humans. They have spherical virions with club-shaped surface projections formed by trimers of the spike protein, and a viral envelope.
Transmembrane protein 39B (TMEM39B) is a protein that in humans is encoded by the gene TMEM39B. TMEM39B is a multi-pass membrane protein with eight transmembrane domains. The protein localizes to the plasma membrane and vesicles. The precise function of TMEM39B is not yet well understood, but differential expression is associated with survival of B cell lymphoma, and knockdown of TMEM39B is associated with decreased autophagy in cells infected with the Sindbis virus. Furthermore, the TMEM39B protein has been found to interact with the SARS-CoV-2 ORF9c protein. TMEM39B is expressed at moderate levels in most tissues, with higher expression in the testis, placenta, white blood cells, adrenal gland, thymus, and fetal brain.
Bat coronavirus RaTG13 is a SARS-like betacoronavirus that infects the horseshoe bat Rhinolophus affinis. It was discovered in 2013 in bat droppings from a mining cave near the town of Tongguan in Mojiang county in Yunnan, China. It is the closest known relative of SARS-CoV-2, the virus that causes COVID-19, sharing 96.1% nucleotide similarity. A preprint from September 2021 suggested an even closer match with a strain of coronavirus found in bats in Laos.
ORF3b is a gene found in coronaviruses of the subgenus Sarbecovirus, encoding a short non-structural protein. It is present in both SARS-CoV and SARS-CoV-2, though the protein product has very different lengths in the two viruses. The encoded protein is significantly shorter in SARS-CoV-2, at only 22 amino acid residues compared to 153-155 in SARS-CoV. Both the longer SARS-CoV and shorter SARS-CoV-2 proteins have been reported as interferon antagonists. It is unclear whether the SARS-CoV-2 gene expresses a functional protein.
ORF3d is a gene found in SARS-CoV-2 and at least one closely related coronavirus found in pangolins, though it is not found in other closely related viruses within the Sarbecovirus subgenus. It is 57 codons long and encodes a novel 57 amino acid residue protein of unknown function. At least two isoforms have been described, of which the shorter 33-residue form, ORF3d-2, may be more highly expressed, or even the only form expressed. It is reported to be antigenic and antibodies to the ORF3d protein occur in patients recovered from COVID-19. There is no homolog in the genome of the otherwise closely related SARS-CoV.
The envelope (E) protein is the smallest and least well-characterized of the four major structural proteins found in coronavirus virions. It is an integral membrane protein less than 110 amino acid residues long; in SARS-CoV-2, the causative agent of COVID-19, the E protein is 75 residues long. Although it is not necessarily essential for viral replication, absence of the E protein may produce abnormally assembled viral capsids or reduced replication. E is a multifunctional protein and, in addition to its role as a structural protein in the viral capsid, it is thought to be involved in viral assembly, likely functions as a viroporin, and is involved in viral pathogenesis.
The membrane (M) protein is an integral membrane protein that is the most abundant of the four major structural proteins found in coronaviruses. The M protein organizes the assembly of coronavirus virions through protein-protein interactions with other M protein molecules as well as with the other three structural proteins, the envelope (E), spike (S), and nucleocapsid (N) proteins.
The nucleocapsid (N) protein packages the positive-sense RNA genome of coronaviruses to form ribonucleoprotein structures enclosed within the viral capsid. The N protein is the most highly expressed of the four major coronavirus structural proteins. In addition to its interactions with RNA, N forms protein-protein interactions with the coronavirus membrane protein (M) during the process of viral assembly. N also has additional functions in manipulating the cell cycle of the host cell. The N protein is highly immunogenic and antibodies to N are found in patients recovered from SARS and COVID-19.
Spike (S) glycoprotein is the largest of the four major structural proteins found in coronaviruses. The spike protein assembles into trimers that form large structures, called spikes or peplomers, that project from the surface of the virion. The distinctive appearance of these spikes when visualized using negative stain transmission electron microscopy, "recalling the solar corona", gives the virus family its name.
ORF3c is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It was first identified in the SARS-CoV-2 genome and encodes a 41 amino acid non-structural protein of unknown function. It is also present in the SARS-CoV genome, but was not recognized until the identification of the SARS-CoV-2 homolog.
ORF3a is a gene found in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It encodes an accessory protein about 275 amino acid residues long, which is thought to function as a viroporin. It is the largest accessory protein and was the first of the SARS-CoV accessory proteins to be described.
ORF7b is a gene found in coronaviruses of the genus Betacoronavirus, which expresses the accessory protein Betacoronavirus NS7b protein. It is a short, highly hydrophobic transmembrane protein of unknown function.
ORF8 is a gene that encodes a viral accessory protein, Betacoronavirus NS8 protein, in coronaviruses of the subgenus Sarbecovirus. It is one of the least well conserved and most variable parts of the genome. In some viruses, a deletion splits the region into two smaller open reading frames, called ORF8a and ORF8b. This feature is present in many SARS-CoV viral isolates from later in the SARS epidemic, as well as in some bat coronaviruses. For this reason the full-length gene and its protein are sometimes called ORF8ab. The full-length gene, exemplified in SARS-CoV-2, encodes a protein with an immunoglobulin domain of unknown function, possibly involving interactions with the host immune system. It is similar in structure to the ORF7a protein, suggesting it may have originated through gene duplication.
ORF6 is a gene that encodes a viral accessory protein in coronaviruses of the subgenus Sarbecovirus, including SARS-CoV and SARS-CoV-2. It is not present in MERS-CoV. It is thought to reduce the immune system response to viral infection through interferon antagonism.
ORF9c is an open reading frame (ORF) in coronavirus genomes of the subgenus Sarbecovirus. It is 73 codons long in the SARS-CoV-2 genome. Although it is often included in lists of Sarbecovirus viral accessory protein genes, experimental and bioinformatics evidence suggests ORF9c may not be a functional protein-coding gene.
ORF10 is an open reading frame (ORF) found in the genome of the SARS-CoV-2 coronavirus. It is 38 codons long. It is not conserved in all Sarbecoviruses. In studies prompted by the COVID-19 pandemic, ORF10 attracted research interest as one of two viral accessory protein genes not conserved between SARS-CoV and SARS-CoV-2 and was initially described as a protein-coding gene likely under positive selection. However, although it is sometimes included in lists of SARS-CoV-2 accessory genes, experimental and bioinformatics evidence suggests ORF10 is likely not a functional protein-coding gene.
ORF1ab refers collectively to two open reading frames (ORFs), ORF1a and ORF1b, that are conserved in the genomes of nidoviruses, a group of viruses that includes coronaviruses. The genes express large polyproteins that undergo proteolysis to form several nonstructural proteins with various functions in the viral life cycle, including proteases and the components of the replicase-transcriptase complex (RTC). Together the two ORFs are sometimes referred to as the replicase gene. They are related by a programmed ribosomal frameshift that allows the ribosome to continue translating past the stop codon at the end of ORF1a, in a -1 reading frame. The resulting polyproteins are known as pp1a and pp1ab.
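The effect of a -1 frameshift on which stop codon the ribosome encounters can be sketched in a few lines of Python. This is a toy model under stated assumptions: the sequence `TOY`, the function `read_codons`, and the `slip_after` parameter are all hypothetical and invented for illustration, and the positions do not correspond to any real genome coordinates.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Invented toy sequence (not real genome coordinates). Read in the zero frame
# it terminates at TAA; after a -1 slip it reads through to a later TGA.
TOY = "ATGGCTTTAAACGGGTAACCTGA"


def read_codons(seq, slip_after=None):
    """Read codons from index 0 until a stop codon is decoded.

    If `slip_after` is given, the reader steps back one nucleotide when it
    reaches that index, modelling a single -1 frameshift.
    """
    codons = []
    i = 0
    while i + 3 <= len(seq):
        codon = seq[i:i + 3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
        i += 3
        if slip_after is not None and i == slip_after:
            i -= 1  # re-read the last nucleotide: the -1 frameshift
    return codons


no_shift = read_codons(TOY)
shifted = read_codons(TOY, slip_after=12)
print(no_shift)
print(shifted)
```

Without the slip, the reader stops at the first in-frame stop codon; with the slip, that stop codon is no longer in frame and reading continues to a later one, which is the sense in which the longer product extends the shorter one.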