Established | April 2020 |
---|---|
Focus | COVID-19 genomic sequencing |
Key people | Sharon Peacock |
Budget | £32.2 million [1] |
Location |
The COVID-19 Genomics UK (COG-UK) consortium was a group of academic institutions and public health agencies in the United Kingdom created in April 2020 [1] [2] [3] to collect, sequence and analyse genomes of SARS-CoV-2 at scale, as part of the COVID-19 pandemic response.
The genome data generated by COG-UK was integrated with epidemiological data and patient health records to monitor introductions into the UK, community transmission and outbreaks of SARS-CoV-2; to assess changes in transmissibility and virulence; and to evaluate the impact of treatments and non-pharmaceutical interventions. COG-UK members also undertook research that integrated human genomic and health data to understand the biology of SARS-CoV-2 and its impact on those infected. [4]
In November 2020, the consortium identified the SARS-CoV-2 Alpha variant (at the time referred to as Variant of Concern 202012/01), which became the subject of subsequent investigations by the UK public health agencies, coordinated by Public Health England and supported by COG-UK. [5] [6]
Between April and September 2021, SARS-CoV-2 sequencing transitioned to become a public health-led national service, [7] after which COG-UK focused on data linkage, research and international training. [8]
The consortium formally closed at the end of March 2023. [4]
COG-UK was supported by £20 million of funding from the Department of Health and Social Care, UK Research and Innovation (UKRI), and the Wellcome Sanger Institute. [1]
The consortium received a further £12.2 million from the Department of Health and Social Care's Testing Innovation Fund in November 2020 to facilitate the genome sequencing capacity needed to meet the increasing number of COVID-19 cases in the UK over the 2020–2021 winter period. [9]
Together with Wellcome Connecting Science, COG-UK was also awarded a Foreign, Commonwealth & Development Office/Wellcome Epidemic Preparedness Coronavirus grant of nearly £1 million to develop COG-Train, [10] a learning programme to support the global scientific and public health community in SARS-CoV-2 genome sequencing. [11]
The consortium comprised the four UK public health agencies, the Wellcome Sanger Institute, sixteen academic partners, and National Health Service organisations. It adopted a hub-and-spoke model with a centralised administrative hub (University of Cambridge), a national sequencing hub (Sanger Institute), and sequencing and analysis capacity distributed regionally across all partners. [12]
Samples from COVID-19 patients were collected by NHS laboratories, public health laboratories and national coronavirus testing centres and sent to partner sites for sequencing. Viral genome sequencing was undertaken by the four UK public health agencies, the Wellcome Sanger Institute, the Quadram Institute, and fifteen universities: Queen's University Belfast, the University of Birmingham, Cardiff University, the University of Cambridge, the University of Edinburgh, the University of Exeter, the University of Glasgow, the University of Liverpool, Northumbria University, the University of Nottingham, the University of Oxford, the University of Portsmouth, University College London, Imperial College London and the University of Sheffield. [13]
The rapid launch and delivery of SARS-CoV-2 genome sequencing data was enabled by use of the pre-existing infrastructure and expertise of COG-UK partners and members. For example, COG-UK relied on the Cloud Infrastructure for Microbial Bioinformatics (CLIMB), [14] which was launched in 2014 with support from the Medical Research Council. CLIMB is an open, cloud-based computing infrastructure for developing and sharing datasets and bioinformatics software, tools and methods to interpret ‘big data’. [15] CLIMB is a partnership between the Universities of Bath, Birmingham, Cardiff, Leicester, Swansea and Warwick, the London School of Hygiene and Tropical Medicine and the Quadram Institute. [16]
Genome data for each positive sample sequenced were linked to the person who provided the sample using an anonymous code. The data were then uploaded to CLIMB-COVID, where the genomes were characterised; this included assigning each genome to a lineage and detecting mutations. Public health agencies were then able to access this information and link it to detailed public health information. [17]
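The linkage workflow described above can be illustrated with a short Python sketch. It is not COG-UK's or CLIMB-COVID's actual pipeline: the record fields, the sample codes, the toy sequences and the helper names `find_substitutions` and `link_records` are all hypothetical. It assumes the sequences have already been aligned to a reference of equal length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeRecord:
    anon_id: str      # anonymous sample code (hypothetical format)
    lineage: str      # lineage assigned during characterisation
    mutations: tuple  # substitutions detected relative to a reference


def find_substitutions(reference: str, sample: str) -> list:
    """List substitutions between two pre-aligned, equal-length sequences."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{alt}"
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt and alt != "N"  # skip ambiguous base calls
    ]


def link_records(genomes, health_records):
    """Join genome records to health records on the anonymous code only."""
    by_id = {g.anon_id: g for g in genomes}
    linked = []
    for rec in health_records:
        genome = by_id.get(rec["anon_id"])
        if genome is not None:  # keep only records with a sequenced sample
            linked.append(
                {**rec, "lineage": genome.lineage, "mutations": genome.mutations}
            )
    return linked


# Toy data for illustration only; none of these values are real.
subs = find_substitutions("ACGTAC", "ACGAAN")
genomes = [GenomeRecord("S-001", "B.1.1.7", tuple(subs))]
health = [
    {"anon_id": "S-001", "region": "Kent"},
    {"anon_id": "S-002", "region": "Leeds"},
]
linked = link_records(genomes, health)
```

In this sketch the join uses only the anonymous code, so the genome side never needs to hold a patient identifier; the second health record is dropped because no genome carries its code.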
The executive director and chair of the consortium was Sharon Peacock, a professor and microbiologist at the University of Cambridge. [18] [19]
More than 600 consortium members contributed to the work of COG-UK between April 2020 and the end of March 2023, [20] with key roles fulfilled by researchers from across the consortium partners, [21] and a management team based at the University of Cambridge. [22]
COG-UK pioneered early and large-scale coordinated national sequencing of SARS-CoV-2 viral genomes, along with the open and rapid sharing of genomic data.
By December 2020, COG-UK had sequenced the genomes of more than 150,000 samples of SARS-CoV-2 virus and uploaded them to GISAID, representing around 5% of all UK COVID-19 cases. Approximately 60% of these were sequenced at the Wellcome Sanger Institute. By December 2021, the UK had sequenced over 1.8 million SARS-CoV-2 genomes.
The insights arising from analysis of these genome data had a range of impacts during the COVID-19 pandemic response: [23]
By September 2021, COG-UK had pivoted to focus on three strategic areas: [8]
By linking and analysing different types of data, it is possible to undertake more in-depth research and generate new understanding about the differences in the transmission of SARS-CoV-2 variants and the disease that they cause.
COG-UK collaborated with partners through the UK Health Data Research Alliance and the Outbreak Data Analysis Platform (ODAP) to link viral genome data with other complex datasets. [37] [38]
Through these collaborations, SARS-CoV-2 sequencing data has been linked to a number of different datasets including:
All such linked data are stored securely with controlled access through approved pathways to maintain patient privacy and ensure appropriate usage and attribution. [39]
COG-UK provided seed funding for consortium members to undertake research projects that would generate novel insights relevant to the COVID-19 pandemic, and to prepare for future pandemics. [40]
These projects focussed on:
COG-UK also provided seed funding to help early-career scientists among its members reinvigorate research projects that had been delayed by time spent on the pandemic response. [4]
Together with Wellcome Connecting Science, COG-UK established a free international online learning programme in using genomics to investigate SARS-CoV-2 and other infectious diseases. Known as COG-Train, the programme sought to ‘facilitate an increase global genome sequencing and analysis capacity, reduce sequencing inequality and enhance pathogen surveillance’. Its training courses were built through partnerships with international researchers, public health experts and surveillance networks. Outputs of COG-Train included a series of massive online, open-access courses on all aspects of SARS-CoV-2 sequencing, as well as week-long intensive virtual training courses, short expert workshops and concurrent distributed classrooms. [41]
Funded by the Wellcome Trust and the Foreign, Commonwealth & Development Office, COG-Train delivered free online training to thousands of learners in more than 100 countries worldwide. Its three-week course 'The Power of Genomics to Understand the COVID-19 Pandemic' remains available through the FutureLearn platform. [42] This and all other COG-UK training resources have been preserved via the GitHub repository and remain openly accessible for the long term.
COG-Train was recognised at the Cambridge Independent Science and Technology Awards 2024, receiving Highly Commended in the ‘Tech For Good’ award category. [43]
Protocols and tools developed by COG-UK consortium members were used globally during the first three years of the COVID-19 pandemic and continue to be of utility today. These include:
By December 2020, the number of sequences uploaded to GISAID by COG-UK was just under 5% of all UK COVID-19 cases, compared to 3.2% for the United States and 60% for Australia. [19] Approximately 60% of these were sequenced at the Wellcome Sanger Institute [18] and the COG-UK consortium was reported to have understood 'the genetic history of more than 150,000 samples of SARS-CoV-2 virus'. [52]
According to the COG-UK website, by December 2021, the UK had sequenced over 1.8 million SARS-CoV-2 genomes. [53]
COG-UK ran a series of events to share the advances and discoveries arising from the work of consortium members and to provide forums for discussion. These included seminars and science showcases, [54] [55] where consortium members shared their analyses and insights with each other and the public.
In October 2021, members from across the consortium joined the COG-UK Together event, held both online and in person, to share their experiences of the first 18 months of the pandemic. [56]
Consortium members also organised and hosted the ‘Women in COG’ interview series, which showcased the lives and work of inspirational women (and male supporters/allies) from the COG-UK network and outside of the consortium. Interviewees included:
This series was also distilled into a book titled 'Snapshots of Women in COG: Scientific Excellence during the COVID-19 pandemic'. [70]
COG-UK members produced more than 100 publications [71] using the data, analysis and tools developed between April 2020 and March 2023. [72]
An independent evaluation undertaken by Rand Europe reported that COG-UK made ‘diverse contributions to understanding and responding to the COVID-19 pandemic’ and ‘a significant and valuable contribution to the UK's public health genomics landscape’. [73]
The report identified the key COG-UK achievements as:
The story of the consortium has been captured in 'Cracking Covid: The history of COG-UK', an exhibition curated by historian of medicine Lara Marks which documents the work and achievements of COG-UK based on interviews with more than eighty consortium members. [74] The exhibition is free to access through the 'What is biotechnology' platform.