Richard M. Durbin

Richard Durbin

FRS
Durbin in 2008
Born
Richard Michael Durbin

30 December 1960 (age 63) [1]
Nationality British
Education Highgate School
Alma mater University of Cambridge (BA, PhD)
Known for
  • Biological Sequence Analysis [2]
Spouse Julie Ahringer (m. 1996) [1]
Scientific career
Thesis Studies on the development and organisation of the nervous system of Caenorhabditis elegans  (1987)
Doctoral advisor John G. White [5]
Doctoral students Ewan Birney [6]
Website www.gen.cam.ac.uk/directory/richard-durbin

Richard Michael Durbin FRS [17] (born 1960) [1] is a British computational biologist [18] [19] [4] and Al-Kindi Professor of Genetics at the University of Cambridge. [20] [21] [22] [23] He also serves as an associate faculty member at the Wellcome Sanger Institute, where he was previously a senior group leader. [24] [25] [26] [27]

Education

Durbin was educated at The Hall School, Hampstead [citation needed] and Highgate School in London. [1] After competing in the 1978/9 International Mathematical Olympiad, [28] he went on to study at the University of Cambridge, graduating in 1982 [29] with a second class honours degree in the Cambridge Mathematical Tripos. He then studied for a PhD [5] at St John's College, Cambridge, [1] researching the development and organisation of the nervous system of Caenorhabditis elegans while working at the Laboratory of Molecular Biology (LMB) in Cambridge, supervised by John Graham White. [5]

Career and research

Durbin's early work included developing the primary instrument software for one of the first X-ray crystallography area detectors [30] and the MRC Biorad confocal microscope, alongside contributions to neural modelling. [31] [32]

He then led the informatics for the Caenorhabditis elegans genome project [33] and, alongside Jean Thierry-Mieg, developed the genome database AceDB, which evolved into the WormBase web resource. Following this, he played an important role in data collection for, and interpretation of, the human genome sequence. [34]

He has developed numerous methods for computational sequence analysis. [35] [36] These include gene finding (e.g. GeneWise) with Ewan Birney [37] and hidden Markov models for protein and nucleic acid alignment and matching (e.g. HMMER) with Sean Eddy and Graeme Mitchison. A standard textbook, Biological Sequence Analysis, coauthored with Sean Eddy, Anders Krogh and Graeme Mitchison, [2] describes some of this work. Using these methods, Durbin worked with colleagues to build a series of important genomic data resources, including the protein family database Pfam, [38] the genome database Ensembl, [39] and the gene family database TreeFam. [11]

More recently, Durbin has returned to sequencing and has developed low-coverage approaches to population genome sequencing, applied first to yeast. [40] [41] He has also been one of the leaders in the application of new sequencing technology to study human genome variation. [42] [43] Durbin currently co-leads the international 1000 Genomes Project to characterise variation down to 1% allele frequency as a foundation for human genetics.

Awards and honours

Durbin was a joint winner of the Mullard Award of the Royal Society in 1994 (for work on the confocal microscope), won the Lord Lloyd of Kilgerran Award of the Foundation for Science and Technology in 2004, and was elected a Fellow of the Royal Society (FRS) in 2004 [17] and a member of the European Molecular Biology Organization (EMBO) in 2009. The Royal Society awarded its Gabor Medal to Durbin in 2017 for his contributions to computational biology. [44] In 2023 he received the International Prize for Biology for his work on the Biology of Genomes.

Durbin's certificate of election for the Royal Society reads:

Durbin is distinguished for his powerful contribution to computational biology. In particular, he played a leading role in establishing the new field of bioinformatics. This allows the handling of biological data on an unprecedented scale, enabling genomics to prosper. He led the analysis of the C. elegans genome, and with Thierry-Mieg developed the database software AceDB. In the international genome project he led the analysis of protein coding genes. He introduced key computational tools in software and data handling. His Pfam database allowed the identification of domains in new protein sequences; it used hidden Markov models to which approach generally he brought rigour and which led to covariance models for RNA sequence. [45]

Personal life

Durbin is the son of James Durbin and is married to Julie Ahringer, a scientist at the Gurdon Institute. They have two children. [1]

References

  1. Anon (2016). "Durbin, Richard Michael". Who's Who (online edition via Oxford University Press ed.). A & C Black. doi:10.1093/ww/9780199540884.013.U45024. (Subscription or UK public library membership required.)
  2. Durbin, Richard M.; Eddy, Sean R.; Krogh, Anders; Mitchison, Graeme (1998), Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids (1st ed.), Cambridge, New York: Cambridge University Press, ISBN 0-521-62971-3, OCLC 593254083
  3. "EMBO welcomes 66 leading life scientists as members". biochemist.org.
  4. Richard M. Durbin publications indexed by Google Scholar
  5. Durbin, Richard (1987). Studies on the development and organisation of the nervous system of Caenorhabditis elegans. cam.ac.uk (PhD thesis). University of Cambridge. OCLC 499178924. EThOS uk.bl.ethos.233920.
  6. Birney, Ewan (2000). Sequence alignment in bioinformatics. cam.ac.uk (PhD thesis). University of Cambridge. OCLC   894597337. EThOS   uk.bl.ethos.621653.
  7. Eddy, S. R.; Mitchison, G; Durbin, R (1995). "Maximum discrimination hidden Markov models of sequence consensus". Journal of Computational Biology. 2 (1): 9–23. doi:10.1089/cmb.1995.2.9. PMID   7497123.
  8. Eddy, S. R.; Durbin, R (1994). "RNA sequence analysis using covariance models". Nucleic Acids Research. 22 (11): 2079–88. doi:10.1093/nar/22.11.2079. PMC   308124 . PMID   8029015.
  9. Li, H.; Durbin, R. (2009). "Fast and accurate short read alignment with Burrows–Wheeler transform". Bioinformatics. 25 (14): 1754–1760. doi:10.1093/bioinformatics/btp324. PMC   2705234 . PMID   19451168.
  10. Li, H.; Ruan, J.; Durbin, R. (2008). "Mapping short DNA sequencing reads and calling variants using mapping quality scores". Genome Research. 18 (11): 1851–1858. doi:10.1101/gr.078212.108. PMC   2577856 . PMID   18714091.
  11. Li, H.; Coghlan, A.; Ruan, J.; Coin, L. J.; Hériché, J. K.; Osmotherly, L.; Li, R.; Liu, T.; Zhang, Z.; Bolund, L.; Wong, G. K.; Zheng, W.; Dehal, P.; Wang, J.; Durbin, R. (2006). "TreeFam: A curated database of phylogenetic trees of animal gene families". Nucleic Acids Research. 34 (90001): D572–D580. doi:10.1093/nar/gkj118. PMC 1347480. PMID 16381935.
  12. "Heng Li Credits Durbin Pedigree in Accepting Franklin Award". bio-itworld.com. Archived from the original on 27 February 2013.
  13. Bateman, A.; Coin, L.; Durbin, R.; Finn, R. D.; Hollich, V.; Griffiths-Jones, S.; Khanna, A.; Marshall, M.; Moxon, S.; Sonnhammer, E. L.; Studholme, D. J.; Yeats, C.; Eddy, S. R. (2004). "The Pfam protein families database". Nucleic Acids Research. 32 (Database issue): D138–D141. doi:10.1093/nar/gkh121. ISSN 0305-1048. PMC 308855. PMID 14681378.
  14. Bateman, A.; Birney, E.; Cerruti, L.; Durbin, R.; Etwiller, L.; Eddy, S.; Griffiths-Jones, S.; Howe, K.; Marshall, M.; Sonnhammer, E. L. (2002). "The Pfam protein families database". Nucleic Acids Research. 30 (1): 276–280. doi:10.1093/nar/30.1.276. PMC   99071 . PMID   11752314.
  15. Bateman, A; Birney, E; Durbin, R; Eddy, S. R.; Howe, K. L.; Sonnhammer, E. L. (2000). "The Pfam protein families database". Nucleic Acids Research. 28 (1): 263–6. doi:10.1093/nar/28.1.263. PMC   102420 . PMID   10592242.
  16. Bateman, A.; Birney, E.; Durbin, R.; Eddy, S.; Finn, R.; Sonnhammer, E. (1999). "Pfam 3.1: 1313 multiple alignments and profile HMMs match the majority of proteins". Nucleic Acids Research. 27 (1): 260–262. doi:10.1093/nar/27.1.260. PMC   148151 . PMID   9847196.
  17. "Professor Richard Durbin FRS Fellow". London: Royal Society. Archived from the original on 19 November 2015.
  18. anon. "Durbin, Richard". sanger.ac.uk. Retrieved 2 January 2019.
  19. Richard Durbin on Twitter
  20. Richard M. Durbin at DBLP Bibliography Server
  21. "Honorary Professors". Cambridge University Reporter . University of Cambridge. CXLV (5). 12 December 2014. Archived from the original on 15 March 2015.
  22. Anon. "Durbin Group — Department of Genetics". gen.cam.ac.uk. Retrieved 2 January 2019.
  23. Anon (27 September 2017). "Professor Richard Durbin — Department of Genetics". gen.cam.ac.uk. Retrieved 2 January 2019.
  24. "Dr Richard Durbin – Wellcome Trust Sanger Institute". Archived from the original on 28 February 2012.
  25. Durbin, Richard M.'s publications indexed by the Scopus bibliographic database. (subscription required)
  26. Richard Durbin archive collection Richard Durbin entry in the Wellcome Library archive.
  27. Richard M. Durbin publications from Europe PubMed Central
  28. Richard M. Durbin's results at International Mathematical Olympiad
  29. "The BioInformer nr. 1, 1997 – Interview with Dr. Richard Durbin". Archived from the original on 2 October 2011. Retrieved 30 July 2011.
  30. Durbin, R. M.; Burns, R.; Moulai, J.; Metcalf, P.; Freymann, D.; Blum, M.; Anderson, J. E.; Harrison, S. C.; Wiley, D. C. (1986). "Protein, DNA, and virus crystallography with a focused imaging proportional counter". Science. 232 (4754): 1127–1132. Bibcode:1986Sci...232.1127D. doi:10.1126/science.3704639. PMID   3704639.
  31. Durbin, R.; Willshaw, D. (1987). "An analogue approach to the travelling salesman problem using an elastic net method". Nature. 326 (6114): 689–691. Bibcode:1987Natur.326..689D. doi:10.1038/326689a0. PMID   3561510. S2CID   4321691.
  32. Durbin, R.; Mitchison, G. (1990). "A dimension reduction framework for understanding cortical maps". Nature. 343 (6259): 644–647. Bibcode:1990Natur.343..644D. doi:10.1038/343644a0. PMID   2304536. S2CID   4352870.
  33. C. elegans Sequencing Consortium (1998). "Genome sequence of the nematode C. elegans: A platform for investigating biology". Science. 282 (5396): 2012–2018. Bibcode:1998Sci...282.2012.. doi:10.1126/science.282.5396.2012. PMID 9851916.
  34. Lander, E. S.; Linton, M.; Birren, B.; Nusbaum, C.; Zody, C.; Baldwin, J.; Devon, K.; Dewar, K.; Doyle, M.; Fitzhugh, W.; Funke, R.; Gage, D.; Harris, K.; Heaford, A.; Howland, J.; Kann, L.; Lehoczky, J.; Levine, R.; McEwan, P.; McKernan, K.; Meldrim, J.; Mesirov, J. P.; Miranda, C.; Morris, W.; Naylor, J.; Raymond, C.; Rosetti, M.; Santos, R.; Sheridan, A.; et al. (February 2001). "Initial sequencing and analysis of the human genome" (PDF). Nature. 409 (6822): 860–921. Bibcode:2001Natur.409..860L. doi: 10.1038/35057062 . ISSN   0028-0836. PMID   11237011.
  35. Simpson, J. T.; Durbin, R. (2011). "Efficient de novo assembly of large genomes using compressed data structures". Genome Research. 22 (3): 549–556. doi:10.1101/gr.126953.111. PMC   3290790 . PMID   22156294.
  36. Eilbeck, K.; Lewis, S. E.; Mungall, C. J.; Yandell, M.; Stein, L.; Durbin, R.; Ashburner, M. (2005). "The Sequence Ontology: A tool for the unification of genome annotations". Genome Biology. 6 (5): R44. doi: 10.1186/gb-2005-6-5-r44 . PMC   1175956 . PMID   15892872.
  37. Birney, E.; Durbin, R. (2000). "Using GeneWise in the Drosophila annotation experiment". Genome Research. 10 (4): 547–548. doi:10.1101/gr.10.4.547. PMC   310858 . PMID   10779496.
  38. Sonnhammer, E. L. L.; Eddy, S. R.; Durbin, R. (1997). "Pfam: A comprehensive database of protein domain families based on seed alignments". Proteins: Structure, Function, and Genetics. 28 (3): 405–420. doi:10.1002/(SICI)1097-0134(199707)28:3<405::AID-PROT10>3.0.CO;2-L. PMID   9223186. S2CID   9569028.
  39. Hubbard, T.; Barker, D.; Birney, E.; Cameron, G.; Chen, Y.; Clark, L.; Cox, T.; Cuff, J.; Curwen, V.; Down, T.; Durbin, R.; Eyras, E.; Gilbert, J.; Hammond, M.; Huminiecki, L.; Kasprzyk, A.; Lehvaslaiho, H.; Lijnzaad, P.; Melsopp, C.; Mongin, E.; Pettett, R.; Pocock, M.; Potter, S.; Rust, A.; Schmidt, E.; Searle, S.; Slater, G.; Smith, J.; Spooner, W.; Stabenau, A. (2002). "The Ensembl genome database project". Nucleic Acids Research. 30 (1): 38–41. doi:10.1093/nar/30.1.38. PMC   99161 . PMID   11752248.
  40. Liti, G.; Carter, D. M.; Moses, A. M.; Warringer, J.; Parts, L.; James, S. A.; Davey, R. P.; Roberts, I. N.; Burt, A.; Koufopanou, V.; Tsai, I. J.; Bergman, C. M.; Bensasson, D.; O'Kelly, M. J. T.; Van Oudenaarden, A.; Barton, D. B. H.; Bailes, E.; Nguyen, A. N.; Jones, M.; Quail, M. A.; Goodhead, I.; Sims, S.; Smith, F.; Blomberg, A.; Durbin, R.; Louis, E. J. (2009). "Population genomics of domestic and wild yeasts". Nature. 458 (7236): 337–341. Bibcode:2009Natur.458..337L. doi:10.1038/nature07743. PMC   2659681 . PMID   19212322.
  41. Warringer, J.; Zörgö, E.; Cubillos, F. A.; Zia, A.; Gjuvsland, A.; Simpson, J. T.; Forsmark, A.; Durbin, R.; Omholt, S. W.; Louis, E. J.; Liti, G.; Moses, A.; Blomberg, A. (2011). Kruglyak, Leonid (ed.). "Trait variation in yeast is defined by population history". PLOS Genetics. 7 (6): e1002111. doi: 10.1371/journal.pgen.1002111 . PMC   3116910 . PMID   21698134.
  42. Bentley, D. R.; Balasubramanian, S.; Swerdlow, H. P.; Smith, G. P.; Milton, J.; Brown, C. G.; Hall, K. P.; Evers, D. J.; Barnes, C. L.; Bignell, H. R.; Boutell, J. M.; Bryant, J.; Carter, R. J.; Keira Cheetham, R.; Cox, A. J.; Ellis, D. J.; Flatbush, M. R.; Gormley, N. A.; Humphray, S. J.; Irving, L. J.; Karbelashvili, M. S.; Kirk, S. M.; Li, H.; Liu, X.; Maisinger, K. S.; Murray, L. J.; Obradovic, B.; Ost, T.; Parkinson, M. L.; et al. (2008). "Accurate whole human genome sequencing using reversible terminator chemistry". Nature. 456 (7218): 53–59. Bibcode:2008Natur.456...53B. doi:10.1038/nature07517. PMC   2581791 . PMID   18987734.
  43. Li, H.; Durbin, R. (2011). "Inference of human population history from individual whole-genome sequences". Nature. 475 (7357): 493–496. doi:10.1038/nature10231. PMC   3154645 . PMID   21753753.
  44. "Gabor Medal - Gabor Medallist 2017". The Royal Society. 2017. Retrieved 10 October 2017.
  45. "Library and Archive Catalogue". London: The Royal Society. Retrieved 14 November 2013.