Exscalate4Cov (E4C) | |
---|---|
Country | European Union |
Launched | 1 April 2020 [1] |
Closed | 30 September 2021 [1] |
Funding | 2 970 875 € [1] |
Status | Project Closed |
Website | https://www.exscalate4cov.eu |
Exscalate4Cov was a public-private consortium supported by the Horizon 2020 programme of the European Union, aimed at leveraging high-performance computing (HPC) in response to the coronavirus pandemic. The project used high-throughput, extreme-scale, computer-aided drug design software to conduct its experiments. [2]
The Exscalate4Cov project, whose name stands for EXaSCale smArt pLatform Against paThogEns for Corona Virus, [1] was coordinated by Dompé Farmaceutici and involved 17 other participants. [1] It was funded under the Horizon 2020 SOCIETAL CHALLENGES - Health, demographic change and well-being programme. [3]
The project conducted one of the largest virtual screening [4] and drug repositioning experiments, [5] identifying a potentially effective molecule against SARS-CoV-2. [6]
Drug discovery can be a long and costly process, often taking years and requiring substantial financial investment. [7] Pharmaceutical companies have large datasets of chemical compounds, which they test against a drug target, often a protein receptor. The goal is to find compounds that interact with the targets, leading to potential therapeutic effects. [8]
The process of finding new drugs therefore usually involves high-throughput screening (HTS), which enables the rapid identification of active compounds. [9] For example, virtual screening can be used at an early stage of the drug discovery pipeline to evaluate the interactions between large datasets of small molecules and a drug target. By predicting how different compounds bind to the target protein, it identifies potential hit candidates, which then proceed to experimental validation. [9]
In an urgent computing scenario, such as a pandemic, where time to solution is critical, virtual screening is used to identify hit molecules for the later stages of the drug discovery pipeline, such as lead optimization and clinical trials. [10] The Exscalate4Cov project was initiated after the outbreak of the COVID-19 pandemic. It aimed to leverage the computational power of EU supercomputers to accelerate the discovery of effective treatments for the coronavirus. [11] By using high-throughput virtual screening, Exscalate4Cov sought faster solutions to the crisis.
Exscalate4Cov's approach involved screening billions of compounds against various protein targets of the SARS-CoV-2 virus, identifying those with a higher binding affinity with the target. The project's objectives were:
The Exscalate4Cov project followed the ANTAREX4ZIKA [14] project, both of which aimed to leverage HPC for drug discovery, albeit targeting different viruses. While Exscalate4Cov focused on the SARS-CoV-2 virus responsible for COVID-19, ANTAREX4ZIKA was dedicated to addressing the Zika virus. The ANTAREX4ZIKA project concluded at the end of 2018 and involved a virtual screening campaign on the CINECA Marconi machine, with a total of 10 PetaFLOPS. [14] The ANTAREX project, [15] which stands for AutoTuning and Adaptivity appRoach for Energy efficient eXascale HPC systems, emphasized auto-tuning and energy efficiency of HPC applications, making them more effective in various research scenarios, including drug discovery.
The Exscalate4Cov consortium of public and private entities was coordinated by Dompé and involved 17 other institutions, ranging from research centers to universities. [1]
Organization | Type | Industry | Country |
---|---|---|---|
Dompé Farmaceutici | Private | Pharmaceutical industry | Italy |
CINECA | Public research center | Supercomputing | Italy |
Politecnico di Milano | Public university | Scientific and technological research, education | Italy |
University of Milan | Public university | Scientific and technological research, education | Italy |
Katholieke Universiteit Leuven | Public university | Scientific and technological research, education | Belgium |
International Institute of Molecular and Cell Biology | Public research center | Research center | Poland |
Elettra Sincrotrone Trieste | Research organisation | Research center | Italy |
Fraunhofer-Gesellschaft | Research organisation | Research center | Germany |
Barcelona Supercomputing Center | Public research center | Supercomputing | Spain |
Forschungszentrum Jülich | Public research center | Supercomputing | Germany |
University of Naples Federico II | Public university | Scientific and technological research, education | Italy |
University of Cagliari | Public university | Scientific and technological research, education | Italy |
SIB Swiss Institute of Bioinformatics | Public research center | Research center | Switzerland |
KTH Royal Institute of Technology | Public university | Scientific and technological research, education | Sweden |
Lazzaro Spallanzani National Institute for Infectious Diseases | Research organisation | Hospital | Italy |
Associazione Big Data | Company | Other | Italy |
Istituto Nazionale di Fisica Nucleare | Public research center | Research center | Italy |
Chelonia SA | Company | Other | Switzerland |
Inputs at the application level consist of ligands from the chemical space and the protein target of the virtual screening campaign, specifically the spike protein in the case of Exscalate4Cov. [11] Following a molecular docking stage that generates potential ligand conformations, a scoring stage assesses the interaction strength between each ligand's pose and the protein. [4] The pipeline ultimately produces a ranking of hit compounds as its output, indicating the most promising candidates for further investigation. [4]
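The dock, score, and rank flow described above can be illustrated with a minimal Python sketch. It is not LiGen or the EXSCALATE platform: the `Pose` fields, the `dock` function, and the `score` formula are hypothetical stand-ins chosen only to make the three stages visible, and ligands are represented by their SMILES strings for convenience.

```python
from dataclasses import dataclass
import heapq


@dataclass(frozen=True)
class Pose:
    """One candidate placement of a ligand in the binding site (toy model)."""
    ligand_id: str
    contacts: int  # hypothetical count of favourable ligand-protein contacts
    clashes: int   # hypothetical count of steric clashes


def dock(ligand_id, n_poses=3):
    """Toy docking stage: deterministically enumerate a few candidate poses."""
    seed = sum(ord(c) for c in ligand_id)
    return [
        Pose(ligand_id, contacts=(seed + 7 * k) % 11, clashes=(seed + 3 * k) % 4)
        for k in range(n_poses)
    ]


def score(pose):
    """Toy scoring stage: reward contacts, penalise clashes (higher is better)."""
    return pose.contacts - 2.0 * pose.clashes


def screen(ligand_ids, top_k):
    """Keep the best pose score per ligand, then rank ligands by that score."""
    best = {lig: max(score(p) for p in dock(lig)) for lig in ligand_ids}
    return heapq.nlargest(top_k, best.items(), key=lambda kv: kv[1])


library = ["CCO", "c1ccccc1", "CC(=O)O", "CN"]
for ligand, value in screen(library, top_k=2):
    print(ligand, value)
```

A real scoring function estimates binding affinity from physical and empirical terms. The point here is only the shape of the pipeline: many poses per ligand, one score per pose, one ranked list as output.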
At the software level, the project used the EXSCALATE docking platform. [4] [14] LiGen (Ligand Generator) is one of the platform's main components and performs the molecular docking and scoring simulations; it is responsible for generating and evaluating the conformations of ligands. Another relevant component at the same level is the libdpipe library, which facilitates scaling across multiple nodes and cores. [4]
To harness the computational power offered by HPC centers, the docking platform uses MPI [16] to scale across multiple nodes and CUDA acceleration to take advantage of the supercomputers' GPUs. The GPU code has been the subject of several optimization efforts, involving OpenACC, OpenMP, and other techniques, [17] [18] [19] to enhance performance and efficiency.
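Because each ligand can be docked independently of the others, one simple way to spread a screening campaign over many nodes is to give each process a contiguous block of the ligand library. The sketch below shows such a static block partition in plain Python. It is an illustrative assumption, not the scheme used by libdpipe or the EXSCALATE platform, and `block_partition` is a hypothetical helper. In an MPI program, `rank` and `n_workers` would come from the communicator; here they are plain integers so the example runs without an MPI installation.

```python
def block_partition(n_items, n_workers, rank):
    """Return the half-open index range [start, stop) owned by `rank`.

    Items are split into contiguous blocks whose sizes differ by at most one;
    the first `n_items % n_workers` ranks receive the extra item.
    """
    base, extra = divmod(n_items, n_workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Ten ligands shared among four workers.
for rank in range(4):
    print(rank, block_partition(10, 4, rank))
```

A static split like this is easy to reason about but assumes every ligand costs about the same to dock. Production systems often prefer dynamic work distribution when per-ligand cost varies.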
The project's main experiment evaluated the interactions between 12 viral proteins of SARS-CoV-2 and 70 billion molecules from the EXSCALATE [12] chemical library. In November 2020, consortium members coordinated one of the largest virtual screening campaigns, harnessing the combined computational power of two supercomputers totaling 81 PFLOPS. [20]
The supercomputers used were Marconi100 and HPC5.
The large-scale campaign used a reservation of 800 Marconi100 nodes and 1500 HPC5 nodes for 60 hours. [4] The average throughput was 2400 ligands per second (lig/s) on Marconi100 and 2000 lig/s on HPC5. [4]
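The figures above can be cross-checked with a short back-of-the-envelope calculation. The sketch below assumes that the quoted throughputs are per-node averages, which the text does not state explicitly, and that all reserved nodes ran for the full 60 hours. Under those assumptions the reservation could process about 4.9 million ligands per second, or roughly 1.06 trillion ligand evaluations in total. That is the same order of magnitude as the 12 × 70 billion = 840 billion protein-ligand pairs implied by the experiment description.

```python
SECONDS_PER_HOUR = 3600

# Figures quoted in the text; per-node interpretation is an assumption.
nodes = {"Marconi100": 800, "HPC5": 1500}
lig_per_s_per_node = {"Marconi100": 2400, "HPC5": 2000}
reservation_hours = 60

# Aggregate throughput if every node sustains its machine's average rate.
aggregate_lig_per_s = sum(nodes[m] * lig_per_s_per_node[m] for m in nodes)

# Upper bound on evaluations that fit in the reservation window.
capacity = aggregate_lig_per_s * reservation_hours * SECONDS_PER_HOUR

# Protein-ligand pairs implied by 12 proteins and 70 billion molecules.
pairs = 12 * 70_000_000_000

print(f"aggregate throughput: {aggregate_lig_per_s:,} lig/s")
print(f"capacity in {reservation_hours} h: {capacity:,} evaluations")
print(f"protein-ligand pairs: {pairs:,}")
```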
Another critical aspect of the experiment was data storage management. The platform leveraged efficient MPI I/O [16] operations to handle multi-node computations. The input data required 3.3 TB of space in SMILES format. [4] However, SMILES data needed to be expanded in a pre-processing step involving 100 nodes over five days. [4] Similarly, the post-processing step involved 19 nodes over five days.
The final output consisted of CSV files containing scores for each input ligand, occupying 69 TB. [4] The resulting dataset, containing 570 million hit compounds, is freely available. [4]
The Exscalate4Cov project also conducted drug repositioning experiments. [5] Drug repurposing is an attractive way to address unmet clinical needs in urgent computing scenarios such as pandemics: existing drugs already have established safety and toxicology profiles, which saves time in identifying potential new treatments. [8] During the project, raloxifene was selected through a combined approach of drug repurposing and in-silico screening on SARS-CoV-2 target proteins, followed by in-vitro screening. [4] [5]
The results of the project's large-scale campaign are available through the MEDIATE (MolEcular DockIng AT homE) platform. [23] The objective of MEDIATE [24] is to collect a chemical library of SARS-CoV-2 inhibitors. The MEDIATE portal provides access to a reduced set of small molecules that researchers can use as a starting point for de-novo drug design.
Raloxifene is a known drug used to treat osteoporosis. Through its drug repositioning experiments, the E4C project identified raloxifene as a possible candidate for treating early-stage COVID-19 patients, [6] [5] with the aim of preventing clinical progression. [25] In October 2020, the Italian Medicines Agency (AIFA) authorized clinical trials of raloxifene in COVID-19 patients, [26] and the drug is currently undergoing testing for approval. [27]
The experiments, including the discovery of raloxifene as a possible drug candidate against COVID-19, gained significant interest from the scientific community, as documented in several scientific articles. [4] [6] [5]
The project's results also captured national interest in Italy, highlighted by various newspaper articles, [28] [29] [30] due to the use of Italian supercomputers during the pandemic. Additionally, the large-scale campaign results gained attention from international journals. [31] [32]