The human identical sequence (HIS) is an RNA element, 24–27 nucleotides in length, that coronavirus genomes share with the human genome. [1] During pathogenic progression, HIS acts as a NamiRNA (nuclear activating miRNA) through the NamiRNA-enhancer network to activate neighboring host genes. [2] [3] The first HIS elements were identified in the SARS-CoV-2 genome, which has five HIS elements; other human coronaviruses have one to five. [1] It has been suggested that these sequences could more generally be termed "host identical sequences", since similar matches have been found between the genome of SARS-CoV-2 and those of multiple potential hosts (bats, pangolins, ferrets, and cats). [1]
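To make the idea of an "identical sequence" concrete, the following is a minimal sketch of how exact shared stretches of a fixed length could be located between a viral RNA and a host DNA sequence. It is an illustration only, not the method used in the cited studies: the function name `shared_kmers`, the default length of 24, and the toy flanking sequences are assumptions, and only the forward strand is searched.

```python
def shared_kmers(viral_rna: str, host_dna: str, k: int = 24):
    """Return (viral_pos, host_pos, kmer) for every exact k-mer present in both
    sequences. Positions are 1-based; only the forward strand is compared."""
    # Write both sequences in the RNA alphabet so that T and U compare equal.
    viral = viral_rna.upper().replace("T", "U")
    host = host_dna.upper().replace("T", "U")

    # Index every k-mer of the host sequence by its 1-based start position.
    index = {}
    for i in range(len(host) - k + 1):
        index.setdefault(host[i:i + k], []).append(i + 1)

    # Look up each viral k-mer in the host index.
    hits = []
    for j in range(len(viral) - k + 1):
        kmer = viral[j:j + k]
        for host_pos in index.get(kmer, []):
            hits.append((j + 1, host_pos, kmer))
    return hits


# Toy example: the 24-nt HIS-SARS2-5 sequence embedded in made-up flanks.
his = "UUGUUGCUGCUAUUUUCUAUUUAA"
toy_viral = "GGCC" + his + "CCGG"
toy_host = "ACGTAG" + his.replace("U", "T") + "GGA"
toy_hits = shared_kmers(toy_viral, toy_host, k=24)
```

A search over real genomes would also need to consider the reverse-complement strand and would normally rely on an indexed alignment tool rather than an in-memory dictionary.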
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-SARS2-1 | 26 | UGUCUAUGCUAAUGGAGGUAAAGGCU | 7570–7595 in ORF1a | Chr3: 124017420-124017395 | KALRN | |
HIS-SARS2-2 | 24 | UAUAACACAUAUAAAAAUACGUGU | 12494–12517 in ORF1a | Chr3: 176597319-176597342 | ||
HIS-SARS2-3 | 24 | UUAUAUGCCUUAUUUCUUUACUUU | 6766–6789 in ORF1a | Chr5: 28949255-28949232 | ||
HIS-SARS2-4 | 27 | AGGAGAAUGACAAAAAAAAAAAAAAAA | 29860–29886 in 3' UTR | Chr18: 73670168-73670142 | FBXO15, TIMM21, CYB5A | same as HIS-SARS-2 |
HIS-SARS2-5 | 24 | UUGUUGCUGCUAUUUUCUAUUUAA | 8610–8633 in ORF1a | ChrX: 99693480-99693457 | ||
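The coordinates in the table above can be cross-checked against the stated lengths with simple arithmetic. The sketch below assumes that both coordinate columns are inclusive intervals and that a descending human range (start greater than end) simply denotes the opposite orientation; the helper names `span_length` and `inconsistent_rows` are illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of an inclusive interval; also handles descending coordinates."""
    return abs(end - start) + 1


# (name, stated length, viral range, human range), copied from the table above.
SARS2_ROWS = [
    ("HIS-SARS2-1", 26, (7570, 7595), (124017420, 124017395)),
    ("HIS-SARS2-2", 24, (12494, 12517), (176597319, 176597342)),
    ("HIS-SARS2-3", 24, (6766, 6789), (28949255, 28949232)),
    ("HIS-SARS2-4", 27, (29860, 29886), (73670168, 73670142)),
    ("HIS-SARS2-5", 24, (8610, 8633), (99693480, 99693457)),
]


def inconsistent_rows(rows):
    """Return names of rows whose coordinate spans disagree with the stated length."""
    bad = []
    for name, length, viral, human in rows:
        if span_length(*viral) != length or span_length(*human) != length:
            bad.append(name)
    return bad


mismatches = inconsistent_rows(SARS2_ROWS)
```

Under these assumptions, every viral and human span in the SARS-CoV-2 table agrees with its stated length, so `mismatches` is empty.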
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-SARS-1 | 25 | UAACAUGCUUAGGAUAAUGGCCUCU | 15251–15275 in ORF1b | Chr4: 172887105-172887129; Chr8: 122356667-122356690 | HAS2, ZHX2 | |
HIS-SARS-2 | 27 | AGGAGAAUGACAAAAAAAAAAAAAAAA | 29717–29743 in 3' UTR | Chr18: 73670168-73670142 | | same as HIS-SARS2-4 |
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-MERS-1 | 24 | UUCCAUUUGCACAGAGUAUCUUUU | 24364–24387 in S | ChrX: 25635779-25635802 | ||
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-HKU1-1 | 24 | UUAGAAUUGUUCAAAUGUUAUCUG | 18656-18679 | chr1:106816197-106816220 | ||
HIS-HKU1-2 | 24 | UUUUCUAAGAAAGAUUGGUAUGAU | 14044-14067 | chr1:226438633-226438656; chr4:151100495-151100518; chr5:79284823-79284846; chr5:111192947-111192970; chr7:94695722-94695745; chr7:98386489-98386512; chr15:59768424-59768447; chr22:30137367-30137390 | ||
HIS-HKU1-3 | 24 | AUUUGACUUUAAAUCUUCAUACUA | 26693-26716 | chr4:11718458-11718481 | ||
HIS-HKU1-4 | 24 | GAUUGGUUGUAUUUUCAUUUUUAU | 23527-23550 | chr4:33759646-33759669 | ||
HIS-HKU1-5 | 24 | UAGAUACUGUUAUUUUUAAAAAUA | 19844-19867 | chrX:81711130-81711153 | ||
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-NL63-1 | 24 | UUAUGAUUUUGGUGAUUUUGUUGU | 13044-13067 | chr1:215311768-215311791 | ||
HIS-NL63-2 | 24 | GGUGUUUUUGUUGAUGAUGUUGUU | 14920-14943 | chr4:28254452-28254475 | ||
HIS-NL63-3 | 24 | AUAGGCUUAAAUGCUUCUGUUACU | 20754-20777 | chr6:30469931-30469954 | ||
HIS-NL63-4 | 24 | AAGUAAUUGUAUUAAGAUGUUAUC | 12124-12147 | chr7:19853545-19853568 | ||
HIS-NL63-5 | 24 | AACUUUUAUGAUUUUGGUGAUUUU | 13039-13062 | chr9:1525276-1525299 | ||
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-OC43-1 | 24 | UACAGCUCUUUGUAAAUCUGGUAG | 22827-22850 | chr8:122471006-122471029 | HAS2, ZHX2 | |
HIS-OC43-2 | 24 | UUGUAUGAGUGAUUUUAUGAGUGA | 24509-24532 | chr13:30510223-30510246 | ||
name | length | sequence | location in virus genome | location in human genome | neighboring genes | note |
---|---|---|---|---|---|---|
HIS-229E-1 | 24 | AAUAUUUUAACAGUACCACGUUAU | 19817-19840 | chr8:42865576-42865599 | ||
HIS-229E-2 | 24 | ACUUUGUAUUGUGUCCUCCUGGAA | 13139-13162 | chr11:112451251-112451274 | ||
Severe-acute-respiratory-syndrome–related coronavirus (SARSr-CoV) is a species of virus consisting of many known strains. Two strains of the virus have caused outbreaks of severe respiratory diseases in humans: severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1), which caused the 2002–2004 outbreak of severe acute respiratory syndrome (SARS), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which caused the COVID-19 pandemic. There are hundreds of other strains of SARSr-CoV, which are only known to infect non-human mammal species: bats are a major reservoir of many strains of SARSr-CoV; several strains have been identified in Himalayan palm civets, which were likely ancestors of SARS-CoV-1.
Severe acute respiratory syndrome coronavirus 1 (SARS-CoV-1), previously known as severe acute respiratory syndrome coronavirus (SARS-CoV), is a strain of coronavirus that causes severe acute respiratory syndrome (SARS), the respiratory illness responsible for the 2002–2004 SARS outbreak. It is an enveloped, positive-sense, single-stranded RNA virus that infects the epithelial cells within the lungs. The virus enters the host cell by binding to angiotensin-converting enzyme 2. It infects humans, bats, and palm civets. The SARS-CoV-1 outbreak was largely brought under control by simple public health measures. Testing people with symptoms, isolating and quarantining suspected cases, and restricting travel all had an effect. SARS-CoV-1 was most transmissible when patients were sick, so its spread could be effectively suppressed by isolating patients with symptoms.
RNA activation (RNAa) is a small RNA-guided and Argonaute (Ago)-dependent gene regulation phenomenon in which promoter-targeted short double-stranded RNAs (dsRNAs) induce target gene expression at the transcriptional/epigenetic level. RNAa was first reported in a 2006 PNAS paper by Li et al., who also coined the term "RNAa" as a contrast to RNA interference (RNAi) to describe this gene activation phenomenon. dsRNAs that trigger RNAa have been termed small activating RNAs (saRNAs). Since the initial discovery of RNAa in human cells, many other groups have made similar observations in different species, including mammals (humans, non-human primates, rats and mice), plants and C. elegans, suggesting that RNAa is an evolutionarily conserved mechanism of gene regulation.
The Coronavirus packaging signal is a conserved cis-regulatory element found in Betacoronavirus. It has an important role in regulating the packaging of the viral genome into the capsid. As part of the viral life cycle, within the infected cell, the viral genome becomes associated with viral proteins and assembles into new infective progeny viruses. This process is called packaging and is vital for viral replication.
mir-133 is a type of non-coding RNA called a microRNA that was first experimentally characterised in mice. Homologues have since been discovered in several other species, including invertebrates such as the fruit fly Drosophila melanogaster. Each species often encodes multiple microRNAs with identical or similar mature sequences. For example, in the human genome there are three known miR-133 genes: miR-133a-1, miR-133a-2 and miR-133b, found on chromosomes 18, 20 and 6 respectively. The mature sequence is excised from the 3' arm of the hairpin. miR-133 is expressed in muscle tissue and appears to repress the expression of non-muscle genes.
Inhibitor of growth protein 4 is a protein that in humans is encoded by the ING4 gene.
Serine/threonine-protein kinase PAK 6 is an enzyme that in humans is encoded by the PAK6 gene.
Therapeutic Target Database (TTD) is a pharmaceutical and medical repository constructed by the Innovative Drug Research and Bioinformatics Group (IDRB) at Zhejiang University, China and the Bioinformatics and Drug Design Group at the National University of Singapore. It provides information about known and explored therapeutic protein and nucleic acid targets, the targeted disease, pathway information and the corresponding drugs directed at each of these targets. Detailed knowledge about target function, sequence, 3D structure, ligand binding properties, enzyme nomenclature and drug structure, therapeutic class, and clinical development status is also provided. TTD is freely accessible without any login requirement at https://idrblab.org/ttd/.
SOAP is a suite of bioinformatics software tools from the BGI Bioinformatics department enabling the assembly, alignment, and analysis of next generation DNA sequencing data. It is particularly suited to short read sequencing data.
Bat SARS-like coronavirus WIV1, also sometimes called SARS-like coronavirus WIV1, is a strain of severe acute respiratory syndrome–related coronavirus (SARSr-CoV) isolated from Chinese rufous horseshoe bats in 2013. Like all coronaviruses, virions consist of single-stranded positive-sense RNA enclosed within an envelope.
NamiRNAs are a type of miRNA present in the nucleus that can activate gene expression by binding to enhancers, and were therefore named nuclear activating miRNAs (NamiRNAs); examples include miR-24-1 and miR-26. These miRNA loci are enriched with epigenetic markers that display enhancer activity, such as histone H3K27ac, P300/CBP, and DNaseI high-sensitivity loci. NamiRNAs are able to activate the related enhancers and work together with them to up-regulate the expression of neighboring genes. They can also promote global gene transcription by binding their targeted enhancers at the whole-genome level.
Severe acute respiratory syndrome coronavirus 2 (SARS‑CoV‑2) is a strain of coronavirus that causes COVID-19, the respiratory illness responsible for the COVID-19 pandemic. The virus previously had the provisional name 2019 novel coronavirus (2019-nCoV), and has also been called human coronavirus 2019. It was first identified in the city of Wuhan, Hubei, China. The World Health Organization designated the outbreak a public health emergency of international concern from January 30, 2020, to May 5, 2023. SARS‑CoV‑2 is a positive-sense single-stranded RNA virus that is contagious in humans.
SHC014-CoV is a SARS-like coronavirus (SL-CoV) which infects horseshoe bats. It was discovered in Kunming in Yunnan Province, China, along with SL-CoV Rs3367, which was the first bat SARS-like coronavirus shown to directly infect a human cell line. The line of Rs3367 that infected human cells was named Bat SARS-like coronavirus WIV1.
Bat coronavirus RaTG13 is a SARS-like betacoronavirus identified in the droppings of the horseshoe bat Rhinolophus affinis. It was discovered in 2013 in bat droppings from a mining cave near the town of Tongguan in Mojiang county in Yunnan, China. In February 2020, it was identified as the closest known relative of SARS-CoV-2, the virus that causes COVID-19, sharing 96.1% nucleotide identity. However, in 2022, scientists found three closer matches in bats found 530 km south, in Feuang, Laos, designated as BANAL-52, BANAL-103 and BANAL-236.
Civet SARS-CoV is a coronavirus associated with severe acute respiratory syndrome coronavirus (SARS-CoV), which infected humans and caused the SARS outbreak of 2002 to 2003. It infects the masked palm civet. Its genome is highly similar to that of SARS-CoV, with a sequence similarity of about 99.8%. Because several patients infected at the early stage of the epidemic had contact with fruit-eating Japanese raccoon dogs (tanuki) in the market, tanuki may have been a direct source of human SARS coronavirus. At the end of 2003, four more people in Guangzhou, China, were infected with the disease. Sequence analysis found that the similarity with the tanuki virus reached 99.9%, suggesting that these cases were also caused by transmission from tanuki.
16BO133 is a SARS-like coronavirus (SL-CoV) which was found in the greater horseshoe bat in South Korea. It was published in 2019 and its genome was completely sequenced. The sequenced Korean SARSr-CoV strain is related to severe acute respiratory syndrome coronavirus 1, with a genome sequence similarity of 82.8%.
LYRa11 is a SARS-like coronavirus (SL-CoV) which was identified in 2011 in samples of intermediate horseshoe bats in Baoshan, Yunnan, China. The genome of this virus strain is 29,805 nt long, and its whole-genome sequence similarity to the SARS-CoV that caused the SARS outbreak is 91%. It was published in 2014. Like SARS-CoV and SARS-CoV-2, LYRa11 virus uses ACE2 as a receptor for infecting cells.