David T. Jones (scientist)

David Jones
David Jones in 2006
Born
David Tudor Jones

November 1966 (age 56–57) [1]
Nationality British
Alma mater
Known for Protein Fold Recognition
Protein Structure Prediction
Awards Royal Society University Research Fellowship (1995–1999)
Scientific career
Fields
Institutions University College London
Birkbeck, University of London
Thesis Structural approaches to protein sequence analysis  (1993)
Doctoral advisor
Website http://www.cs.ucl.ac.uk/staff/d.jones/

David Tudor Jones FRS (born 1966) [1] is a Professor of Bioinformatics and head of the Bioinformatics Group at University College London (UCL). [3] He is also the director of the Bloomsbury Centre for Bioinformatics, a joint research centre between UCL and Birkbeck, University of London, which also provides bioinformatics training and support services to biomedical researchers. As of 2013, he was a member of the editorial boards of PLoS ONE, BioData Mining, Advanced Bioinformatics, Chemical Biology & Drug Design, and Proteins: Structure, Function, and Bioinformatics.[ citation needed ]

Education

Jones was educated at Imperial College London, where he was awarded a Bachelor of Science degree in Physics.[ when? ][ citation needed ] He moved to King's College London to complete a Master of Science degree in Biochemistry, [ when? ] and then to University College London, where he was awarded a PhD in 1993 [4] for research supervised by William R. Taylor and Janet Thornton.[ citation needed ]

Research and career

Jones's main research interests [2] are in protein structure prediction and analysis, protein folding, transmembrane protein analysis, machine learning applications in bioinformatics, and genome analysis, including the application of intelligent software agents. [5] He has consulted for several companies, including GlaxoSmithKline, but his main industry experience was as a co-founder of Inpharmatica Limited, [1] which was founded in 1998 as a corporate spin-off from University College London. The company used a combination of bioinformatics and chemoinformatics to study the relationships between the structure and function of proteins, and the binding of chemical groups to these proteins, leading to the discovery of novel drugs.[ citation needed ]

THREADER

THREADER [6] implements a method popularly known as protein fold recognition (threading), a protein modeling approach used to model proteins that have the same fold as proteins of known structure. The input is an amino acid sequence of unknown structure, and THREADER outputs the most probable structure for that sequence. The degree of compatibility between the sequence and a proposed structure is evaluated by means of a set of empirical potentials derived from proteins of known structure.
This work was followed by that of David Baker and his colleagues, who took the THREADER idea further in the form of the Rosetta method, which has had a huge impact on the field.
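
The compatibility scoring described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-class `pair_energy` potential, the representation of a template as a list of residue contacts, and the names `threading_energy` and `best_fold` are hypothetical and are not taken from THREADER. The sketch also threads the sequence onto each template without gaps, whereas practical fold recognition methods optimise the sequence-to-structure alignment as well.

```python
from typing import Dict, List, Tuple

Contact = Tuple[int, int]  # pair of residue positions in contact in a template

# Toy two-class potential, invented for illustration (not THREADER's potentials).
HYDROPHOBIC = set("AVLIMFWC")


def pair_energy(a: str, b: str) -> float:
    """Lower is better: reward hydrophobic pairs, penalise mixed pairs."""
    a_h, b_h = a in HYDROPHOBIC, b in HYDROPHOBIC
    if a_h and b_h:
        return -1.0
    if a_h != b_h:
        return 0.5
    return 0.0


def threading_energy(sequence: str, contacts: List[Contact]) -> float:
    """Sum pair energies over the template's contacts (gapless threading)."""
    return sum(pair_energy(sequence[i], sequence[j]) for i, j in contacts)


def best_fold(sequence: str, library: Dict[str, List[Contact]]) -> str:
    """Return the name of the template with the lowest threading energy."""
    return min(library, key=lambda name: threading_energy(sequence, library[name]))


# Two hypothetical six-residue templates described only by their contacts.
library = {
    "fold_a": [(0, 2), (2, 4), (0, 4)],
    "fold_b": [(0, 1), (2, 3), (4, 5)],
}
print(best_fold("LKLKLK", library))  # prints "fold_a"
```

In this toy case `fold_a` wins because it brings the three leucines into mutual contact, while `fold_b` pairs each leucine with a lysine.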

MEMSAT

MEMSAT [7] is an approach to predicting the positions of transmembrane helix segments based on the recognition of topological models of proteins. The method uses a set of statistical tables derived from well-characterized membrane protein data, together with a dynamic programming algorithm that recognizes membrane topology models by expectation maximization. MEMSAT was originally built in 1994 and has since triggered many improvements in the prediction of secondary structure. The newest version is MEMSAT3, [8] released in 2007. It uses a neural network to determine whether residues are located on the cytoplasmic side of the membrane or in the transmembrane helices.
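
The idea of combining per-residue score tables with dynamic programming can be sketched as follows. This is a simplified illustration under stated assumptions, not the MEMSAT algorithm: the scores in `residue_score` are invented, the four-state model (`i`, `Mio`, `o`, `Moi`) is a hypothetical simplification, and no limits on helix length are enforced. The dynamic programme finds the labelling of each residue as inside loop (`i`), membrane helix (`M`), or outside loop (`o`) that maximises the summed score while respecting the allowed order of states.

```python
from typing import Dict, List, Tuple

HYDROPHOBIC = set("AVLIMFW")
POSITIVE = set("KR")

# Allowed transitions between topology states. "Mio" is a helix running from
# the inside to the outside of the membrane, "Moi" the reverse.
ALLOWED: Dict[str, List[str]] = {
    "i": ["i", "Mio"],
    "Mio": ["Mio", "o"],
    "o": ["o", "Moi"],
    "Moi": ["Moi", "i"],
}


def residue_score(state: str, aa: str) -> float:
    """Toy score table (invented values, not MEMSAT's statistical tables)."""
    if state.startswith("M"):
        return 1.0 if aa in HYDROPHOBIC else -1.0
    if aa in HYDROPHOBIC:
        return -1.0
    if aa in POSITIVE:
        return 1.0 if state == "i" else -0.5
    return 0.0


def predict_topology(seq: str) -> str:
    """Return the best-scoring label string over the alphabet i, M, o."""
    if not seq:
        return ""
    # best[state] = (score, labels) of the best labelling ending in that state
    best: Dict[str, Tuple[float, str]] = {
        s: (residue_score(s, seq[0]), s) for s in ("i", "o")
    }
    for aa in seq[1:]:
        new: Dict[str, Tuple[float, str]] = {}
        for prev, (score, path) in best.items():
            for nxt in ALLOWED[prev]:
                cand = score + residue_score(nxt, aa)
                if nxt not in new or cand > new[nxt][0]:
                    new[nxt] = (cand, path + nxt[0])
        best = new
    return max(best.values())[1]


print(predict_topology("KRKLLLLLLLLSTS"))  # prints "iiiMMMMMMMMooo"
```

Here the positively charged N-terminal residues are placed on the inside, the hydrophobic stretch becomes a single helix, and the remaining residues are placed on the outside.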

CATH database

Jones was involved in the early stages of development of the CATH database, with Christine Orengo and Janet Thornton. [9] CATH is a hierarchical domain classification of protein structures in the Protein Data Bank, in which the four major levels of the hierarchy are Class, Architecture, Topology, and Homologous superfamily. The CATH database employs a combination of automatic and manual techniques. [10] [11]
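
The four-level hierarchy can be pictured as a four-part identifier, one number per level. The sketch below assumes a dotted code such as `3.40.50.300`; the `CathCode` class and the `deepest_shared_level` helper are hypothetical names introduced for illustration and are not part of any CATH software.

```python
from dataclasses import astuple, dataclass
from typing import Optional

LEVELS = ("class", "architecture", "topology", "homologous_superfamily")


@dataclass(frozen=True)
class CathCode:
    """One number per level of the hierarchy (illustrative container)."""
    class_: int
    architecture: int
    topology: int
    homologous_superfamily: int

    @classmethod
    def parse(cls, code: str) -> "CathCode":
        parts = code.split(".")
        if len(parts) != 4:
            raise ValueError(f"expected four dot-separated numbers, got {code!r}")
        return cls(*(int(p) for p in parts))


def deepest_shared_level(a: CathCode, b: CathCode) -> Optional[str]:
    """Name of the deepest level at which two domains are classified together."""
    shared = None
    for name, x, y in zip(LEVELS, astuple(a), astuple(b)):
        if x != y:
            break
        shared = name
    return shared


a = CathCode.parse("3.40.50.300")
b = CathCode.parse("3.40.50.720")
print(deepest_shared_level(a, b))  # prints "topology"
```

Two domains that agree on the first three numbers but differ on the fourth share a topology but belong to different homologous superfamilies.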

GenTHREADER

GenTHREADER [12] is a faster and more powerful tool for protein fold recognition that can be applied either to whole translated genomes or to individual protein sequences. The method uses a traditional sequence alignment algorithm to generate alignments, which are then evaluated by threading techniques. As the last step, each model is evaluated by a neural network to produce a measure of confidence in the proposed prediction. GenTHREADER has since been the basis for a series of improvements, [13] and several improved methods are now available:[ when? ] mGenTHREADER, pDomTHREADER, and pGenTHREADER. [14] [15]
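
The three stages described above (alignment, threading-based evaluation, and a final confidence estimate) can be sketched in miniature. This is an illustration under stated assumptions rather than GenTHREADER's implementation: the alignment step is a plain Needleman–Wunsch global alignment score with made-up match, mismatch, and gap values; the threading energy is simply passed in as a number; and a single logistic unit with hypothetical weights stands in for the neural network.

```python
import math


def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def confidence(alignment_score: float, threading_energy: float,
               w_align: float = 0.5, w_thread: float = -0.5,
               bias: float = 0.0) -> float:
    """Logistic combination of the two scores (weights are hypothetical).

    A higher alignment score and a lower (more favourable) threading energy
    both push the confidence towards 1.
    """
    z = w_align * alignment_score + w_thread * threading_energy + bias
    return 1.0 / (1.0 + math.exp(-z))


score = global_alignment_score("ACD", "ACD")
print(score)  # prints 3
print(confidence(score, threading_energy=-3.0) > confidence(0, 0.0))  # prints True
```

In a real system the weights would be learned from examples of correct and incorrect fold assignments rather than fixed by hand.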

PSIPRED

The PSIPRED server aggregates several structure prediction methods. It includes PSIPRED itself, a technique for protein secondary structure prediction, together with transmembrane topology prediction (MEMSAT3) and fold recognition (GenTHREADER). Users submit a protein sequence, select the predictions of interest, and receive the results by e-mail. [16]

Academic service

Since 1996, Jones has served on many research committees, including those of the Biotechnology and Biological Sciences Research Council (BBSRC), the Engineering and Physical Sciences Research Council (EPSRC), the Medical Research Council (MRC), and Research Councils UK.[ citation needed ] His research has been funded by the BBSRC, the Wellcome Trust, Elsevier, the EPSRC, the MRC, the Royal Society, the European Commission, AstraZeneca, GlaxoSmithKline and Sun Microsystems. [3]

Awards and honours

Jones held a Royal Society University Research Fellowship from 1995 to 1999. [3] He was elected a Fellow of the International Society for Computational Biology in 2022 [17] and a Fellow of the Royal Society in 2023. [18]

References

  1. "David JONES Inpharmatica" (2012). companieshouse.gov.uk. Companies House. Archived from the original on 7 March 2017.
  2. David T. Jones publications indexed by Google Scholar.
  3. Jones, David (2015). "Professor David Jones UCL Computer Science". ucl.ac.uk. University College London. Archived from the original on 7 May 2016.
  4. Jones, David Tudor (1993). Structural approaches to protein sequence analysis. london.ac.uk (PhD thesis). University of London. OCLC 941025790.
  5. Jones, David T.; Taylor, William R.; Thornton, Janet M. (1992). "The rapid generation of mutation data matrices from protein sequences". Bioinformatics. 8 (3): 275–282. doi:10.1093/bioinformatics/8.3.275. ISSN 1367-4803. PMID 1633570.
  6. Jones, D. T.; Taylor, W. R.; Thornton, J. M. (1992). "A new approach to protein fold recognition". Nature. 358 (6381): 86–89. Bibcode:1992Natur.358...86J. doi:10.1038/358086a0. ISSN 0028-0836. PMID 1614539. S2CID 4266346.
  7. Jones, D. T.; Taylor, W. R.; Thornton, J. M. (1994). "A Model Recognition Approach to the Prediction of All-Helical Membrane Protein Structure and Topology". Biochemistry. 33 (10): 3038–3049. doi:10.1021/bi00176a037. ISSN 0006-2960. PMID 8130217.
  8. Jones, D. T. (2007). "Improving the accuracy of transmembrane protein topology prediction using evolutionary information". Bioinformatics. 23 (5): 538–544. doi:10.1093/bioinformatics/btl677. ISSN 1367-4803. PMID 17237066.
  9. Orengo, C. A.; Michie, A. D.; Jones, S.; Jones, D. T.; Swindells, M. B.; Thornton, J. M. (1997). "CATH – a hierarchic classification of protein domain structures". Structure. 5 (8): 1093–1109. doi:10.1016/S0969-2126(97)00260-8. ISSN 0969-2126. PMID 9309224.
  10. Orengo, C. A.; Martin, A. M.; Hutchinson, G.; Jones, S.; Jones, D. T.; Michie, A. D.; Swindells, M. B.; Thornton, J. M. (1998). "Classifying a protein in the CATH database of domain structures". Acta Crystallogr. D. 54 (6): 1155–1167. doi:10.1107/s0907444998007501. PMID 10089492.
  11. Cuff, A. L.; Sillitoe, I.; Lewis, T.; Clegg, A. B.; Rentzsch, R.; Furnham, N.; Pellegrini-Calace, M.; Jones, D.; Thornton, J.; Orengo, C. A. (2010). "Extending CATH: increasing coverage of the protein structure universe and linking structure with function". Nucleic Acids Research. 39 (Database): D420–D426. doi:10.1093/nar/gkq1001. ISSN 0305-1048. PMC 3013636. PMID 21097779.
  12. Jones, David T. (1999). "GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequences". Journal of Molecular Biology. 287 (4): 797–815. doi:10.1006/jmbi.1999.2583. ISSN 0022-2836. PMID 10191147. S2CID 6057225.
  13. "UCL-CS Bioinformatics: PSIPRED overview". Bioinf.cs.ucl.ac.uk. Retrieved 7 March 2017.
  14. McGuffin, L. J.; Jones, D. T. (2003). "Improvement of the GenTHREADER method for genomic fold recognition". Bioinformatics. 19 (7): 874–881. doi:10.1093/bioinformatics/btg097. ISSN 1367-4803. PMID 12724298.
  15. Lobley, A.; Sadowski, M. I.; Jones, D. T. (2009). "pGenTHREADER and pDomTHREADER: new methods for improved protein fold recognition and superfamily discrimination". Bioinformatics. 25 (14): 1761–1767. doi:10.1093/bioinformatics/btp302. ISSN 1367-4803. PMID 19429599.
  16. McGuffin, L. J.; Bryson, K.; Jones, D. T. (2000). "The PSIPRED protein structure prediction server". Bioinformatics. 16 (4): 404–405. doi:10.1093/bioinformatics/16.4.404. ISSN 1367-4803. PMID 10869041.
  17. "April 28, 2022: ISCB Congratulates and Introduces the 2022 Class of Fellows!". www.iscb.org. Retrieved 17 June 2022.
  18. "David Jones". royalsociety.org. Retrieved 24 May 2023.