A chemical library or compound library is a collection of stored chemicals, usually used ultimately in high-throughput screening or industrial manufacture. In its simplest form, a chemical library is a series of stored chemicals. Each chemical has associated information held in a database, such as its chemical structure, purity, quantity, and physicochemical characteristics.
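As a minimal sketch of the kind of record described above, the following Python dataclass holds the structure, purity, quantity, and physicochemical data for one entry. The class name, field names, and the identifier `CMP-0001` are hypothetical and chosen for illustration only; they do not follow any particular registration system.

```python
from dataclasses import dataclass, field


@dataclass
class CompoundRecord:
    """One library entry; field names are illustrative, not a standard schema."""
    compound_id: str          # hypothetical internal identifier
    smiles: str               # chemical structure written as a SMILES string
    purity_percent: float     # measured purity, 0-100
    quantity_mg: float        # amount currently in stock
    properties: dict = field(default_factory=dict)  # physicochemical data

    def withdraw(self, amount_mg: float) -> None:
        """Reduce the stored quantity, refusing to go below zero."""
        if amount_mg <= 0:
            raise ValueError("amount must be positive")
        if amount_mg > self.quantity_mg:
            raise ValueError("not enough material in stock")
        self.quantity_mg -= amount_mg


# Example entry: aspirin, with its molecular weight as one stored property.
aspirin = CompoundRecord(
    compound_id="CMP-0001",
    smiles="CC(=O)OC1=CC=CC=C1C(=O)O",
    purity_percent=99.0,
    quantity_mg=250.0,
    properties={"molecular_weight": 180.16},
)
aspirin.withdraw(50.0)  # dispense 50 mg for a screening plate
```

Keeping the quantity on the record and changing it only through `withdraw` is one simple way to stop the database from drifting away from what is physically on the shelf.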
In drug discovery high-throughput screening, it is desirable to screen a drug target against a selection of chemicals that covers as much of the appropriate chemical space as possible. The chemical space of all possible chemical structures is extraordinarily large. Most stored chemical libraries sample only a small part of this space, mostly because of storage and cost concerns. However, since many molecular interactions cannot be predicted, the wider the chemical space that is sampled by the chemical library, the better the chance that high-throughput screening will find a "hit": a chemical with an appropriate interaction in a biological model that might be developed into a drug.
An example of a chemical library in drug discovery would be a series of chemicals known to inhibit kinases, or in industrial processes, a series of catalysts known to polymerize resins.
Chemical libraries are usually generated for a specific goal, and larger chemical libraries can be made up of several groups of smaller libraries stored in the same location. In the drug discovery process, for instance, a wide range of organic chemicals is needed to test against models of disease in high-throughput screening. Therefore, most of the chemical synthesis needed to generate chemical libraries in drug discovery is based on organic chemistry. A company that is interested in screening for kinase inhibitors in cancer may limit its chemical libraries and synthesis to just those types of chemicals known to have affinity for ATP binding sites or allosteric sites.
Generally, however, most chemical libraries focus on large groups of varied organic chemical series in which an organic chemist can make many variations on the same molecular scaffold or molecular backbone. Chemicals can sometimes also be purchased from outside vendors and included in an internal chemical library.
Depending on their scope and design, chemical libraries can also be classified as diversity-oriented, drug-like, lead-like, peptide-mimetic, natural product-like, [1] or targeted against a specific family of biological targets such as kinases, GPCRs, proteases, or protein-protein interactions (PPIs). These chemical libraries are often used in target-based drug discovery (reverse pharmacology). Compound libraries also include fragment compound libraries, which are mainly used for fragment-based lead discovery.
Chemical libraries are usually designed by chemists and chemoinformatics scientists and synthesized by organic and medicinal chemists. The method of chemical library generation usually depends on the project, and there are many factors to consider when using rational methods to select screening compounds. [2] Typically, a range of chemicals is screened against a particular drug target or disease model, and the preliminary "hits", or chemicals that show the desired activity, are re-screened to verify their activity. Once they are qualified as "hits" by their repeatability and activity, these particular chemicals are registered and analysed. Commonalities among the different chemical groups are studied, as they are often reflective of a particular chemical subspace. Additional chemistry work may be needed to further optimize the chemical library in the active portion of the subspace. When it is needed, more synthesis is completed to extend the chemical library in that particular subspace by generating more compounds that are very similar to the original hits. This new selection of compounds within this narrow range is further screened and then taken on to more sophisticated models for further validation in the drug discovery hit to lead process. Chemoproteomics is a related field of study that uses chemical libraries to identify protein targets.
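The screen and re-screen step described above can be sketched as a simple filter: a compound counts as a confirmed hit only if it shows the desired activity in both runs. The function name, the use of percent inhibition as the activity measure, and the 50% cut-off are assumptions made for illustration; real campaigns choose their own readouts and thresholds.

```python
def confirmed_hits(primary, rescreen, threshold=50.0):
    """Return compound IDs whose activity meets the threshold in both runs.

    primary and rescreen map compound ID -> percent inhibition.
    The 50% cut-off is an arbitrary illustrative value.
    """
    # Preliminary hits: active in the first screen.
    preliminary = {cid for cid, act in primary.items() if act >= threshold}
    # Confirmed hits: preliminary hits that are active again on re-screening.
    # A compound missing from the re-screen is treated as not confirmed.
    return sorted(
        cid for cid in preliminary
        if rescreen.get(cid, 0.0) >= threshold
    )


# Made-up example data for four compounds.
primary = {"CMP-1": 82.0, "CMP-2": 12.0, "CMP-3": 67.0, "CMP-4": 55.0}
rescreen = {"CMP-1": 79.0, "CMP-3": 21.0, "CMP-4": 58.0}
hits = confirmed_hits(primary, rescreen)
```

Here `CMP-3` is a preliminary hit that fails to repeat, so only `CMP-1` and `CMP-4` would go forward for registration and analysis.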
The "chemical space" of all possible organic chemicals is large and increases exponentially with the size of the molecule. Most chemical libraries therefore represent only a small part of this space, mostly because of storage and cost concerns.
Because of the expense and effort involved in chemical synthesis, the chemicals must be correctly stored and banked for later use to prevent early degradation. Each chemical has a particular shelf life and storage requirement, and in a good-sized chemical library there is a timetable by which library chemicals are disposed of and replaced on a regular basis. Some chemicals are fairly unstable, radioactive, volatile or flammable and must be stored under careful conditions in accordance with safety standards such as those set by OSHA.
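A disposal and replacement timetable of the kind described above can be sketched as a date check over the stock list. The data layout (date received plus a shelf life in days) and the function name are assumptions for illustration; an actual library would also track storage conditions and retest dates.

```python
from datetime import date, timedelta


def due_for_replacement(stock, today):
    """Return IDs of entries whose shelf life has run out by `today`.

    stock maps compound ID -> (date received, shelf life in days).
    """
    return sorted(
        cid for cid, (received, shelf_life_days) in stock.items()
        if received + timedelta(days=shelf_life_days) <= today
    )


# Made-up stock list with different shelf lives.
stock = {
    "CMP-1": (date(2023, 1, 10), 365),
    "CMP-2": (date(2024, 3, 1), 730),
    "CMP-3": (date(2022, 6, 15), 180),
}
expired = due_for_replacement(stock, today=date(2024, 6, 1))
```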
Most chemical libraries are managed with information technologies such as barcoding and relational databases. Additionally, robotics are necessary to fetch compounds in larger chemical libraries.
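As a rough sketch of how barcoding and a relational database fit together, the example below uses Python's built-in `sqlite3` module to map a scanned barcode to a storage location. The table name, column names, barcodes, and location strings are all hypothetical; a production compound store would use a far richer schema.

```python
import sqlite3

# In-memory database for demonstration; a real system would use a server.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE compounds (
        barcode   TEXT PRIMARY KEY,
        smiles    TEXT NOT NULL,
        location  TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO compounds VALUES (?, ?, ?)",
    [
        ("BC000123", "CCO", "freezer-2/rack-A/pos-07"),
        ("BC000124", "c1ccccc1", "freezer-2/rack-A/pos-08"),
    ],
)


def locate(barcode):
    """Return the storage location for a scanned barcode, or None."""
    row = conn.execute(
        "SELECT location FROM compounds WHERE barcode = ?", (barcode,)
    ).fetchone()
    return row[0] if row else None
```

Calling `locate("BC000123")` returns the stored position string, which is the kind of information a retrieval robot would need; an unknown barcode returns `None`.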
Because a chemical library can easily hold millions of compounds, the management of even modest-sized chemical libraries can be a full-time endeavor. Compound management is the field concerned with managing and maintaining these chemical libraries while maximizing safety and effectiveness.
Combinatorial chemistry comprises chemical synthetic methods that make it possible to prepare a large number of compounds in a single process. These compound libraries can be made as mixtures, sets of individual compounds or chemical structures generated by computer software. Combinatorial chemistry can be used for the synthesis of small molecules and for peptides.
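The computer-generated form of such a library can be illustrated by enumerating every combination of building blocks around a common scaffold. The sketch below uses placeholder labels (`A1`, `B1`, and so on) rather than real reagents, and the position names `R1` to `R3` are assumptions; it shows only why library size grows as the product of the number of choices at each position.

```python
from itertools import product
from math import prod

# Placeholder building blocks for three variable positions on a scaffold.
building_blocks = {
    "R1": ["A1", "A2", "A3"],
    "R2": ["B1", "B2"],
    "R3": ["C1", "C2", "C3", "C4"],
}


def enumerate_library(blocks):
    """Yield every combination, taking one building block per position."""
    positions = list(blocks)
    for combo in product(*(blocks[p] for p in positions)):
        yield dict(zip(positions, combo))


library = list(enumerate_library(building_blocks))
# The library size is the product of the choices at each position.
expected_size = prod(len(v) for v in building_blocks.values())
```

With three, two, and four choices, the enumeration gives 3 x 2 x 4 = 24 virtual compounds; adding a single extra building block at any position multiplies the total rather than adding to it.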
A chemical database is a database specifically designed to store chemical information. This information is about chemical and crystal structures, spectra, reactions and syntheses, and thermophysical data.
In the fields of medicine, biotechnology and pharmacology, drug discovery is the process by which new candidate medications are discovered.
Cheminformatics refers to the use of physical chemistry theory with computer and information science techniques—so called "in silico" techniques—in application to a range of descriptive and prescriptive problems in the field of chemistry, including in its applications to biology and related molecular fields. Such in silico techniques are used, for example, by pharmaceutical companies and in academic settings to aid and inform the process of drug discovery, for instance in the design of well-defined combinatorial libraries of synthetic compounds, or to assist in structure-based drug design. The methods can also be used in chemical and allied industries, and such fields as environmental science and pharmacology, where chemical processes are involved or studied.
Drug design, often referred to as rational drug design or simply rational design, is the inventive process of finding new medications based on the knowledge of a biological target. The drug is most commonly an organic small molecule that activates or inhibits the function of a biomolecule such as a protein, which in turn results in a therapeutic benefit to the patient. In the most basic sense, drug design involves the design of molecules that are complementary in shape and charge to the biomolecular target with which they interact and therefore will bind to it. Drug design frequently but not necessarily relies on computer modeling techniques. This type of modeling is sometimes referred to as computer-aided drug design. Finally, drug design that relies on the knowledge of the three-dimensional structure of the biomolecular target is known as structure-based drug design. In addition to small molecules, biopharmaceuticals including peptides and especially therapeutic antibodies are an increasingly important class of drugs and computational methods for improving the affinity, selectivity, and stability of these protein-based therapeutics have also been developed.
High-throughput screening (HTS) is a method for scientific discovery especially used in drug discovery and relevant to the fields of biology, materials science and chemistry. Using robotics, data processing/control software, liquid handling devices, and sensitive detectors, high-throughput screening allows a researcher to quickly conduct millions of chemical, genetic, or pharmacological tests. Through this process one can quickly recognize active compounds, antibodies, or genes that modulate a particular biomolecular pathway. The results of these experiments provide starting points for drug design and for understanding the noninteraction or role of a particular location.
Medicinal or pharmaceutical chemistry is a scientific discipline at the intersection of chemistry and pharmacy involved with designing and developing pharmaceutical drugs. Medicinal chemistry involves the identification, synthesis and development of new chemical entities suitable for therapeutic use. It also includes the study of existing drugs, their biological properties, and their quantitative structure-activity relationships (QSAR).
Chemical biology is a scientific discipline between the fields of chemistry and biology. The discipline involves the application of chemical techniques, analysis, and often small molecules produced through synthetic chemistry, to the study and manipulation of biological systems. Although often confused with biochemistry, which studies the chemistry of biomolecules and regulation of biochemical pathways within and between cells, chemical biology remains distinct by focusing on the application of chemical tools to address biological questions.
Chemogenomics, or chemical genomics, is the systematic screening of targeted chemical libraries of small molecules against individual drug target families with the ultimate goal of identification of novel drugs and drug targets. Typically some members of a target library have been well characterized where both the function has been determined and compounds that modulate the function of those targets have been identified. Other members of the target family may have unknown function with no known ligands and hence are classified as orphan receptors. By identifying screening hits that modulate the activity of the less well characterized members of the target family, the function of these novel targets can be elucidated. Furthermore, the hits for these targets can be used as a starting point for drug discovery. The completion of the human genome project has provided an abundance of potential targets for therapeutic intervention. Chemogenomics strives to study the intersection of all possible drugs on all of these potential targets.
A chemical compound microarray is a collection of organic chemical compounds spotted on a solid surface, such as glass and plastic. This microarray format is very similar to DNA microarray, protein microarray and antibody microarray. In chemical genetics research, they are routinely used for searching proteins that bind with specific chemical compounds, and in general drug discovery research, they provide a multiplex way to search potential drugs for therapeutic targets.
Hit to lead (H2L), also known as lead generation, is a stage in early drug discovery where small molecule hits from a high-throughput screen (HTS) are evaluated and undergo limited optimization to identify promising lead compounds. These lead compounds undergo more extensive optimization in a subsequent step of drug discovery called lead optimization (LO). In the overall drug discovery process, the hit to lead stage therefore follows high-throughput screening and precedes lead optimization.
Compound management in the field of drug discovery refers to the systematic collection, storage, retrieval, and quality control of small molecule chemical compounds used in high-throughput screening and other research activities to identify hits that can be developed into candidate drugs.
A lead compound in drug discovery is a chemical compound that has pharmacological or biological activity likely to be therapeutically useful, but may nevertheless have suboptimal structure that requires modification to fit better to the target; lead drugs offer the prospect of being followed by back-up compounds. Its chemical structure serves as a starting point for chemical modifications in order to improve potency, selectivity, or pharmacokinetic parameters. Furthermore, newly invented pharmacologically active moieties may have poor druglikeness and may require chemical modification to become drug-like enough to be tested biologically or clinically.
High throughput biology is the use of automation equipment with classical cell biology techniques to address biological questions that are otherwise unattainable using conventional methods. It may incorporate techniques from optics, chemistry, biology or image analysis to permit rapid, highly parallel research into how cells function, interact with each other and how pathogens exploit them in disease.
Fragment-based lead discovery (FBLD), also known as fragment-based drug discovery (FBDD), is a method used for finding lead compounds as part of the drug discovery process. Fragments are small organic molecules of low molecular weight. The method is based on identifying small chemical fragments, which may bind only weakly to the biological target, and then growing or combining them to produce a lead with a higher affinity. FBLD can be compared with high-throughput screening (HTS). In HTS, libraries with up to millions of compounds, with molecular weights of around 500 Da, are screened, and nanomolar binding affinities are sought. In contrast, in the early phase of FBLD, libraries with a few thousand compounds with molecular weights of around 200 Da may be screened, and millimolar affinities can be considered useful. FBLD is used in research to discover novel potent inhibitors. The methodology could also help to design multitarget drugs for multiple diseases. The multitarget inhibitor approach is based on designing a single inhibitor for several targets, and this type of drug design opens up new polypharmacological avenues for discovering innovative and effective therapies. Neurodegenerative diseases such as Alzheimer's disease (AD) and Parkinson's disease, among others, show rather complex etiopathologies. Multitarget inhibitors are considered more appropriate for addressing the complexity of AD and may provide new drugs for controlling its multifactorial nature and stopping its progression.
DNA-encoded chemical libraries (DECL) is a technology for the synthesis and screening on an unprecedented scale of collections of small molecule compounds. DECL is used in medicinal chemistry to bridge the fields of combinatorial chemistry and molecular biology. The aim of DECL technology is to accelerate the drug discovery process and in particular early phase discovery activities such as target validation and hit identification.
Chemical genetics is the investigation of the function of proteins and signal transduction pathways in cells by the screening of chemical libraries of small molecules. Chemical genetics is analogous to a classical genetic screen, where random mutations are introduced in organisms, the phenotype of these mutants is observed, and finally the specific gene mutation (genotype) that produced that phenotype is identified. In chemical genetics, the phenotype is disturbed not by introduction of mutations, but by exposure to small molecule tool compounds. Phenotypic screening of chemical libraries is used to identify drug targets or to validate those targets in experimental models of disease. Recent applications of chemical genetics have focused on signal transduction, which may play a role in discovering new cancer treatments. Chemical genetics can serve as a unifying study between chemistry and biology. The approach was first proposed by Tim Mitchison in 1994 in an opinion piece in the journal Chemistry & Biology entitled "Towards a pharmacological genetics".
In the field of drug discovery, classical pharmacology, also known as forward pharmacology, or phenotypic drug discovery (PDD), relies on phenotypic screening of chemical libraries of synthetic small molecules, natural products or extracts to identify substances that have a desirable therapeutic effect. Using the techniques of medicinal chemistry, the potency, selectivity, and other properties of these screening hits are optimized to produce candidate drugs.
Chemoproteomics entails a broad array of techniques used to identify and interrogate protein-small molecule interactions. Chemoproteomics complements phenotypic drug discovery, a paradigm that aims to discover lead compounds on the basis of alleviating a disease phenotype, as opposed to target-based drug discovery, in which lead compounds are designed to interact with predetermined disease-driving biological targets. As phenotypic drug discovery assays do not provide confirmation of a compound's mechanism of action, chemoproteomics provides valuable follow-up strategies to narrow down potential targets and eventually validate a molecule's mechanism of action. Chemoproteomics also attempts to address the inherent challenge of drug promiscuity in small molecule drug discovery by analyzing protein-small molecule interactions on a proteome-wide scale. A major goal of chemoproteomics is to characterize the interactome of drug candidates to gain insight into mechanisms of off-target toxicity and polypharmacology.
James Inglese is an American biochemist, the director of the Assay Development and Screening Technology laboratory at the National Center for Advancing Translational Sciences, a Center within the National Institutes of Health. His specialty is small molecule high throughput screening. Inglese's laboratory develops methods and strategies in molecular pharmacology with drug discovery applications. The work of his research group and collaborators focuses on genetic and infectious disease-associated biology.