David Jones | |
---|---|
Born | David Tudor Jones, November 1966 (age 57) [1] |
Nationality | British |
Alma mater | |
Known for | Protein fold recognition, protein structure prediction |
Awards | Royal Society University Research Fellowship (1995–1999) |
Scientific career | |
Fields | |
Institutions | University College London; Birkbeck, University of London |
Thesis | Structural approaches to protein sequence analysis (1993) |
Doctoral advisor | |
Website | http://www.cs.ucl.ac.uk/staff/d.jones/ |
David Tudor Jones FRS (born 1966) [1] is a Professor of Bioinformatics and Head of the Bioinformatics Group at University College London. [3] He is also the director of the Bloomsbury Centre for Bioinformatics, a joint research centre between UCL and Birkbeck, University of London, which also provides bioinformatics training and support services to biomedical researchers. As of 2013, he was a member of the editorial boards of PLoS ONE, BioData Mining, Advanced Bioinformatics, Chemical Biology & Drug Design, and Proteins: Structure, Function, and Bioinformatics.[ citation needed ]
Jones was educated at Imperial College London where he was awarded a Bachelor of Science degree in Physics.[ when? ][ citation needed ] He moved to King's College London to complete a Master of Science degree in Biochemistry [ when? ] followed by University College London where he was awarded a PhD in 1993 [4] for research supervised by William R. Taylor and Janet Thornton.[ citation needed ]
Jones's main research interests [2] are in protein structure prediction and analysis, protein folding, transmembrane protein analysis, machine learning applications in bioinformatics, and genome analysis, including the application of intelligent software agents. [5] He has consulted for several companies, including GlaxoSmithKline, but his main industry experience was as a co-founder of Inpharmatica Limited, [1] which was founded in 1998 as a corporate spin-off from University College London. The company used a combination of bioinformatics and chemoinformatics to study the relationships between the structure and function of proteins and the binding of chemical groups to these proteins, leading to the discovery of novel drugs.[ citation needed ]
THREADER [6] implements what is popularly known as protein fold recognition (threading), a method of protein modeling used for proteins that have the same fold as proteins of known structure. The input is an amino acid sequence of unknown structure, and THREADER outputs the most probable structure for that sequence. The degree of compatibility between the sequence and the proposed structure is evaluated by means of a set of empirical potentials derived from proteins of known structure.
This work was followed up by David Baker and his colleagues, who took the THREADER idea further in the form of the Rosetta method, which has had a large impact on the field.
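The scoring idea behind fold recognition can be illustrated with a small sketch. It is not THREADER's actual algorithm or potential set: the two-class residue alphabet, the `TOY_POTENTIAL` values, the ungapped mounting of the sequence on each template, and the template contact lists are all hypothetical simplifications chosen for illustration.

```python
# Toy fold-recognition scoring: mount a sequence on each template's contact
# map and sum pairwise energies. All numbers here are invented for the example;
# real threading potentials are derived statistically from known structures.

# Hypothetical contact energies (lower is more favourable).
TOY_POTENTIAL = {
    frozenset(("H", "H")): -1.0,  # hydrophobic-hydrophobic contact
    frozenset(("H", "P")): 0.3,   # hydrophobic-polar contact
    frozenset(("P", "P")): 0.0,   # polar-polar contact
}

HYDROPHOBIC = set("AVLIMFWC")


def residue_class(aa):
    """Reduce an amino acid letter to a two-class alphabet (H or P)."""
    return "H" if aa in HYDROPHOBIC else "P"


def threading_score(sequence, contacts):
    """Sum pair energies over the template contacts (ungapped threading).

    `contacts` is a list of (i, j) template positions that are in contact.
    Contacts that fall outside the sequence are skipped.
    """
    total = 0.0
    for i, j in contacts:
        if i < len(sequence) and j < len(sequence):
            pair = frozenset((residue_class(sequence[i]),
                              residue_class(sequence[j])))
            total += TOY_POTENTIAL[pair]
    return total


def best_template(sequence, templates):
    """Return the name of the template with the lowest (best) energy."""
    return min(templates, key=lambda name: threading_score(sequence, templates[name]))


if __name__ == "__main__":
    # Hypothetical template library: name -> list of contacting positions.
    templates = {
        "fold_a": [(0, 3), (1, 4)],
        "fold_b": [(0, 1), (2, 3)],
    }
    print(best_template("LKKVA", templates))  # prints fold_a
```

A real threading method also has to search over gapped sequence-to-structure alignments, which is where most of the computational difficulty lies; the sketch fixes the alignment to keep the scoring step visible.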
MEMSAT [7] is an approach to predicting the positions of transmembrane helix segments based on the recognition of topological models of proteins. The method uses a set of statistical tables derived from well-characterized membrane protein data, together with a dynamic programming algorithm that recognizes membrane topology models by expectation maximization. MEMSAT was originally built in 1994 and subsequently triggered many improvements in the prediction of secondary structure. The newest version is MEMSAT3, [8] released in 2007. It uses a neural network to determine whether residues are on the cytoplasmic side of the membrane or in the transmembrane helices.
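To give a feel for how dynamic programming can pick out a membrane topology, the following is a minimal Viterbi-style sketch. It is not MEMSAT's state model or its statistical tables: the four states, the allowed transitions, and the `emission` scores are assumptions made up for this example, and no minimum helix length is enforced.

```python
# Toy topology prediction by dynamic programming over per-residue scores.
# States: "i" (inside loop), "Mio" (helix crossing inside -> outside),
# "o" (outside loop), "Moi" (helix crossing outside -> inside).

STATES = ["i", "Mio", "o", "Moi"]

# Allowed transitions force helices to alternate the side of the membrane.
ALLOWED = {
    "i": ["i", "Mio"],
    "Mio": ["Mio", "o"],
    "o": ["o", "Moi"],
    "Moi": ["Moi", "i"],
}

HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def emission(state, aa):
    """Hypothetical score for observing residue `aa` in `state`."""
    if state in ("Mio", "Moi"):
        return 1.0 if aa in HYDROPHOBIC else -1.0
    if aa in HYDROPHOBIC:
        return -1.0
    if state == "i" and aa in POSITIVE:
        return 1.0  # toy preference for positive residues on the inside
    return 0.5


def predict_topology(sequence):
    """Return a string of i / M / o labels for the best-scoring state path."""
    if not sequence:
        return ""
    neg_inf = float("-inf")
    # Paths are assumed to start in a loop state, not inside a helix.
    score = {s: (emission(s, sequence[0]) if s in ("i", "o") else neg_inf)
             for s in STATES}
    back = []
    for aa in sequence[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor among the states allowed to move into s.
            prev = max((p for p in STATES if s in ALLOWED[p]),
                       key=lambda p: score[p])
            new_score[s] = score[prev] + emission(s, aa)
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return "".join("M" if s.startswith("M") else s for s in path)


if __name__ == "__main__":
    print(predict_topology("KRKLLLLLLLLSSS"))  # prints iiiMMMMMMMMooo
```

Splitting the helix into two direction-specific states is one simple way to make the path alternate between the two sides of the membrane, which is what distinguishes a topology prediction from plain helix detection.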
Jones was involved in the early development of the CATH database with Christine Orengo and Janet Thornton. [9] CATH is a hierarchical domain classification of protein structures in the Protein Data Bank, whose four major levels are Class, Architecture, Topology, and Homologous superfamily. The CATH database employs a combination of automatic and manual techniques. [10] [11]
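The four-level hierarchy maps naturally onto a small data structure. The sketch below assumes the conventional notation in which a CATH classification is written as four dot-separated numbers, one per level; the helper names `parse_cath_code` and `shared_level` are hypothetical, and the example codes are used only to exercise the functions.

```python
from typing import NamedTuple, Optional


class CathCode(NamedTuple):
    """One number per level of the hierarchy, from broadest to most specific."""
    klass: int  # "class" is a reserved word in Python
    architecture: int
    topology: int
    homologous_superfamily: int


LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


def parse_cath_code(code: str) -> CathCode:
    """Parse a dotted code such as '3.40.50.300' into its four levels."""
    parts = code.split(".")
    if len(parts) != 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected four dot-separated numbers, got {code!r}")
    return CathCode(*(int(p) for p in parts))


def shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Return the deepest level at which two domains are classified together."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return LEVELS[depth - 1] if depth else None


if __name__ == "__main__":
    first = parse_cath_code("3.40.50.300")
    second = parse_cath_code("3.40.50.720")
    print(shared_level(first, second))  # prints Topology
```

Because the levels are nested, comparing codes from left to right and stopping at the first difference is enough to tell how closely two domains are related in the classification.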
GenTHREADER [12] is a faster and more powerful tool for protein fold recognition that can be applied to either whole genomes or individual protein sequences. The method uses a traditional sequence alignment algorithm to generate alignments, which are then evaluated by threading techniques. As the last step, each model is evaluated by a neural network to produce a measure of confidence in the proposed prediction. GenTHREADER has since led to a series of improvements. [13] So far,[ when? ] several improved methods are available: mGenTHREADER, pDomTHREADER, and pGenTHREADER. [14] [15]
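The align-then-score pipeline described above can be sketched in miniature. This is not GenTHREADER's implementation: the match, mismatch, and gap values are arbitrary, and the single logistic unit in `confidence`, with its hypothetical weights, merely stands in for the trained neural network.

```python
import math


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def confidence(alignment_score, threading_energy, weights=(0.5, -0.5), bias=0.0):
    """Combine two features into a 0-1 confidence with one logistic unit.

    The weights are hypothetical; a real method would learn them from
    examples of correct and incorrect fold assignments.
    """
    z = weights[0] * alignment_score + weights[1] * threading_energy + bias
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    score = global_alignment_score("GATTACA", "GATTACA")
    print(score)  # prints 7
    # A high alignment score and a low (favourable) energy raise confidence.
    print(confidence(score, -2.0) > confidence(1, 0.0))  # prints True
```

Separating alignment generation from the final confidence estimate is what lets a fast sequence-based step do most of the work while the scoring step decides whether the resulting model is believable.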
Jones's work also includes a server that aggregates several structure prediction methods. It includes PSIPRED, a technique for protein secondary structure prediction, alongside MEMSAT3 for transmembrane topology prediction and GenTHREADER for fold recognition. Users submit a protein sequence, select the predictions of interest, and receive the results by e-mail. [16]
Since 1996, Jones has served on many research committees, including those of the Biotechnology and Biological Sciences Research Council (BBSRC), the Engineering and Physical Sciences Research Council (EPSRC), the Medical Research Council (MRC), and Research Councils UK.[ citation needed ] His research has been funded by the BBSRC, The Wellcome Trust, Elsevier, the EPSRC, the MRC, The Royal Society, The European Commission, AstraZeneca, GlaxoSmithKline and Sun Microsystems. [3]
Jones held a Royal Society University Research Fellowship from 1995 to 1999. [3] He was elected a Fellow of the International Society for Computational Biology in 2022 [17] and a Fellow of the Royal Society in 2023. [18]
The topology of a transmembrane protein refers to the locations of the N- and C-termini of the membrane-spanning polypeptide chain with respect to the inner or outer sides of the biological membrane occupied by the protein.
Protein structure prediction is the inference of the three-dimensional structure of a protein from its amino acid sequence—that is, the prediction of its secondary and tertiary structure from primary structure. Structure prediction is different from the inverse problem of protein design. Protein structure prediction is one of the most important goals pursued by computational biology; it is important in medicine and biotechnology.
Structural alignment attempts to establish homology between two or more polymer structures based on their shape and three-dimensional conformation. This process is usually applied to protein tertiary structures but can also be used for large RNA molecules. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for the comparison of proteins with low sequence similarity, where evolutionary relationships between proteins cannot be easily detected by standard sequence alignment techniques. Structural alignment can therefore be used to imply evolutionary relationships between proteins that share very little common sequence. However, caution should be used in using the results as evidence for shared evolutionary ancestry because of the possible confounding effects of convergent evolution by which multiple unrelated amino acid sequences converge on a common tertiary structure.
The CATH Protein Structure Classification database is a free, publicly available online resource that provides information on the evolutionary relationships of protein domains. It was created in the mid-1990s by Professor Christine Orengo and colleagues including Janet Thornton and David Jones, and continues to be developed by the Orengo group at University College London. CATH shares many broad features with the SCOP resource, but there are also many areas in which the detailed classification differs greatly.
In molecular biology, protein threading, also known as fold recognition, is a method of protein modeling which is used to model those proteins which have the same fold as proteins of known structures, but do not have homologous proteins with known structure. It differs from the homology modeling method of structure prediction as it is used for proteins which do not have their homologous protein structures deposited in the Protein Data Bank (PDB), whereas homology modeling is used for those proteins which do. Threading works by using statistical knowledge of the relationship between the structures deposited in the PDB and the sequence of the protein which one wishes to model.
Dame Janet Maureen Thornton is a senior scientist and director emeritus at the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL). She is one of the world's leading researchers in structural bioinformatics, using computational methods to understand protein structure and function. She served as director of the EBI from October 2001 to June 2015, and played a key role in ELIXIR.
The sequential structure alignment program (SSAP) in chemistry, physics, and biology is a method that uses double dynamic programming to produce a structural alignment based on atom-to-atom vectors in structure space. Instead of the alpha carbons typically used in structural alignment, SSAP constructs its vectors from the beta carbons for all residues except glycine, a method which thus takes into account the rotameric state of each residue as well as its location along the backbone. SSAP works by first constructing a series of inter-residue distance vectors between each residue and its nearest non-contiguous neighbors on each protein. A series of matrices are then constructed containing the vector differences between neighbors for each pair of residues for which vectors were constructed. Dynamic programming applied to each resulting matrix determines a series of optimal local alignments which are then summed into a "summary" matrix to which dynamic programming is applied again to determine the overall structural alignment.
Phyre and Phyre2 are free web-based services for protein structure prediction. Phyre is among the most popular methods for protein structure prediction having been cited over 1500 times. Like other remote homology recognition techniques, it is able to regularly generate reliable protein models when other widely used methods such as PSI-BLAST cannot. Phyre2 has been designed to ensure a user-friendly interface for users inexpert in protein structure prediction methods. Its development is funded by the Biotechnology and Biological Sciences Research Council.
Molecular recognition features (MoRFs) are small intrinsically disordered regions in proteins that undergo a disorder-to-order transition upon binding to their partners. MoRFs are implicated in protein-protein interactions, which serve as the initial step in molecular recognition. MoRFs are disordered prior to binding to their partners, whereas they form a common 3D structure after interacting with their partners. As MoRF regions tend to resemble disordered proteins with some characteristics of ordered proteins, they can be classified as existing in an extended semi-disordered state.
A protein superfamily is the largest grouping (clade) of proteins for which common ancestry can be inferred. Usually this common ancestry is inferred from structural alignment and mechanistic similarity, even if no sequence similarity is evident. Sequence homology can then be deduced even if not apparent. Superfamilies typically contain several protein families which show sequence similarity within each family. The term protein clan is commonly used for protease and glycosyl hydrolase superfamilies based on the MEROPS and CAZy classification systems.
Michael Joseph Ezra Sternberg is a professor at Imperial College London, where he is director of the Centre for Integrative Systems Biology and Bioinformatics and Head of the Structural bioinformatics Group.
PSI-blast based secondary structure PREDiction (PSIPRED) is a method used to investigate protein structure. It uses artificial neural network machine learning methods in its algorithm. It is a server-side program, featuring a website serving as a front-end interface, which can predict a protein's secondary structure from the primary sequence.
Christine Anne Orengo is a Professor of Bioinformatics at University College London (UCL) known for her work on protein structure, particularly the CATH database. Orengo serves as president of the International Society for Computational Biology (ISCB), the first woman to do so in the history of the society.
Mona Singh is the Wang Family Professor in Computer Science in the Lewis-Sigler Institute for Integrative Genomics and the Department of Computer Science at Princeton University. She is Editor-in-Chief of the Journal of Computational Biology.
Rita Casadio is an Adjunct Professor of Biochemistry/Biophysics in the Department of Pharmacy and Biotechnology at the University of Bologna.
Several computational methods use protein sequence and/or protein structure to predict protein aggregation. The table below shows the main features of software for prediction of protein aggregation.