David Sankoff

David Sankoff at "Models and Algorithms for Genome Evolution" in 2013, Bromont, Quebec.
Born: December 31, 1942 (age 81), Montreal, Quebec, Canada
Nationality: Canadian
Alma mater: McGill University (BSc, MSc, PhD)
Thesis: Historical Linguistics as a Stochastic Process (1969)
Doctoral advisor: Donald Andrew Dawson [5]
Website: albuquerque.bioinformatics.uottawa.ca

David Sankoff (born December 31, 1942) is a Canadian mathematician, bioinformatician, computer scientist and linguist. He holds the Canada Research Chair in Mathematical Genomics in the Mathematics and Statistics Department at the University of Ottawa, and is cross-appointed to the Biology Department and the School of Information Technology and Engineering. He was the founding editor of the scientific journal Language Variation and Change (Cambridge) [6] and serves on the editorial boards of a number of bioinformatics, computational biology and linguistics journals. [7] [8] [9] [10]

Sankoff is best known for his pioneering contributions to computational linguistics and computational genomics. [3] He is considered to be one of the founders of bioinformatics. In particular, he had a key role in introducing dynamic programming [11] for sequence alignment and other problems in computational biology. In Pavel Pevzner's words, [2] "[Michael Waterman] and David Sankoff are responsible for transforming bioinformatics from a 'stamp collection' of ill-defined problems into a rigorous discipline with important biological applications."

Education

Sankoff published his first paper in 1963 [12] while he was an undergraduate student in Mathematics at McGill University. Starting with his doctoral research, he developed mathematical formulations of several pivotal concepts in sociolinguistics and historical linguistics, including glottochronology, [13] variable rules analysis (with Henrietta Cedergren), [14] the linguistic marketplace [15] and code switching. [16]

Career and research

After completing his Ph.D. in Mathematics, Sankoff began his academic career at the University of Montreal in 1969. In 1971, Sankoff became interested in molecular sequence comparison [11] and devised the first quadratic-time variant of the Needleman–Wunsch algorithm for pairwise sequence alignment. [17] In 1973, Sankoff and Robert Cedergren developed a joint estimation method for phylogeny and multiple sequence alignment of 5S ribosomal RNA, [18] laying the algorithmic foundations of comparative genomics. In 1975, Sankoff and Václav Chvátal studied the behavior of the longest common subsequence problem on random inputs; [19] the constants of proportionality arising in this study have come to be known as the Chvátal–Sankoff constants. In 1980, Robert Cedergren and David Sankoff created the first research group in bioinformatics at the University of Montreal. [20] Sankoff's work in bioinformatics addresses RNA secondary structure, genome rearrangements, sequence alignment, genome evolution and phylogenetics. [21]
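To make the quadratic-time dynamic programming idea mentioned above concrete, here is a minimal Python sketch of the standard textbook recurrence for the length of a longest common subsequence. It is an illustration only and does not reproduce the formulation in Sankoff's 1972 paper; the function names `lcs_length` and `estimate_lcs_ratio`, the binary alphabet, and the trial count are assumptions made for this example.

```python
import random


def lcs_length(a, b):
    """Length of a longest common subsequence of a and b.

    Standard dynamic programming recurrence, O(len(a) * len(b)) time,
    keeping only two rows of the table in memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Matching symbols extend the best subsequence of both prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last symbol of one sequence or the other.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def estimate_lcs_ratio(n, alphabet="01", trials=20, seed=0):
    """Average LCS length divided by n over random string pairs.

    Hypothetical helper: a rough empirical look at the ratio whose limit,
    as n grows, is the Chvatal-Sankoff constant for this alphabet size.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.choice(alphabet) for _ in range(n)]
        b = [rng.choice(alphabet) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (trials * n)


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
    print(estimate_lcs_ratio(200))
```

For finite n the ratio printed by `estimate_lcs_ratio` is only an approximation; the exact values of the limiting constants are not known.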

Awards and honors

References

  1. Anon (2017). "ISCB Fellows". iscb.org. International Society for Computational Biology. Archived from the original on 2017-03-20.
  2. Maisel, M. (2006). "ISCB Honors Michael S. Waterman and Mathieu Blanchette". PLOS Computational Biology. 2 (8): e105. Bibcode:2006PLSCB...2..105M. doi:10.1371/journal.pcbi.0020105. PMC 1526462.
  3. David Sankoff publications indexed by Google Scholar
  4. Sankoff, David (2008). "How to Predict the Evolution of a Bilingual Community". In Meyerhoff, Miriam and Naomi Nagy (eds.), Social Lives in Language – Sociolinguistics and multilingual speech communities: Celebrating the work of Gillian Sankoff (pp. 179–194). Amsterdam: John Benjamins.
  5. David Sankoff at the Mathematics Genealogy Project
  6. Tagliamonte, Sali (2015-11-02). Making waves: the story of variationist sociolinguistics. Chichester, West Sussex, United Kingdom. ISBN 9781118455166. OCLC 921307274.
  7. "BMC Bioinformatics". Retrieved 8 June 2019.
  8. "Journal of Bioinformatics and Computational Biology". Retrieved 8 June 2019.
  9. "Journal of Computational Biology". Retrieved 8 June 2019.
  10. "Molecular Phylogenetics and Evolution, Volume 43". Retrieved 8 June 2019.
  11. Sankoff, D. (2000). "The early introduction of dynamic programming into computational biology". Bioinformatics. 16 (1): 41–47. doi:10.1093/bioinformatics/16.1.41. PMID 10812476.
  12. Friesen, J. D.; Sankoff, D.; Siminovitch, L. (1963). "Radiobiological Studies of Vaccinia Virus". Virology. 21 (3): 411–424. doi:10.1016/0042-6822(63)90203-4. PMID 14081366.
  13. Sankoff, David (1970). "On the rate of replacement of word-meaning relationships". Language. 46 (3): 564–569. CiteSeerX 10.1.1.667.3279. doi:10.2307/412307. JSTOR 412307.
  14. Cedergren, H. J.; D. Sankoff (1974). "Variable rules: performance as a statistical reflection of competence". Language. 50 (2): 333–355. CiteSeerX 10.1.1.665.3156. doi:10.2307/412441. JSTOR 412441.
  15. Sankoff, D.; S. Laberge (1978). "The linguistic market and the statistical explanation of variability". In D. Sankoff (ed.), Linguistic Variation: Models and Methods (pp. 239–250). New York: Academic Press.
  16. Sankoff, David; Shana Poplack (1981). "A formal grammar for code switching". Papers in Linguistics. 14 (1): 3–46. CiteSeerX 10.1.1.667.3175. doi:10.1080/08351818109370523.
  17. Sankoff, D. (1972). "Matching sequences under deletion-insertion constraints". Proceedings of the National Academy of Sciences of the United States of America. 69 (1): 4–6. Bibcode:1972PNAS...69....4S. doi:10.1073/pnas.69.1.4. PMC 427531. PMID 4500555.
  18. Sankoff, D; C. Morel; R. J. Cedergren (1973). "Evolution of 5S RNA and the non-randomness of base replacement". Nature New Biology. 245 (147): 232–234. doi:10.1038/newbio245232a0. PMID 4201431.
  19. Chvátal, Václav; Sankoff, David (1975). "Longest common subsequences of two random sequences". Journal of Applied Probability. 12 (2): 306–315. doi:10.2307/3212444. JSTOR 3212444. MR 0405531. S2CID 250345191.
  20. "History of the Robert Cedergren Centre". Retrieved 25 August 2013.
  21. Anon (2003). "ISCB Senior Scientist Award to Sankoff". iscb.org/iscb-awards. Archived from the original on 2016-03-03.
  22. "David Sankoff". Research. Retrieved 2020-11-03.
  23. "Prix Acfas Thérèse Gouin-Décarie (Prix Marcel-Vincent before 2013)". Acfas PRIX (in French). 2019. Retrieved 19 September 2020.
  24. Professor Angela McLean awarded the Weldon Medal. 2018-05-17.
  25. "David Sankoff - Excellence in Research Award". Archived from the original on 24 September 2014. Retrieved 25 August 2013.
  26. "Co-founder of bioinformatics to receive an honorary doctorate". Mathematics. 2019-06-11. Retrieved 2020-11-03.