David Sankoff | |
---|---|
Born | December 31, 1942 |
Nationality | Canadian |
Alma mater | McGill University (BSc, MSc, PhD) |
Scientific career | |
Thesis | Historical Linguistics as a Stochastic Process (1969) |
Doctoral advisor | Donald Andrew Dawson [5] |
Website | albuquerque |
David Sankoff (born December 31, 1942) is a Canadian mathematician, bioinformatician, computer scientist and linguist. He holds the Canada Research Chair in Mathematical Genomics in the Mathematics and Statistics Department at the University of Ottawa, and is cross-appointed to the Biology Department and the School of Information Technology and Engineering. He was founding editor of the scientific journal Language Variation and Change (Cambridge) [6] and serves on the editorial boards of a number of bioinformatics, computational biology and linguistics journals. [7] [8] [9] [10]

Sankoff is best known for his pioneering contributions in computational linguistics and computational genomics. [3] He is considered to be one of the founders of bioinformatics. In particular, he had a key role in introducing dynamic programming [11] for sequence alignment and other problems in computational biology. In Pavel Pevzner's words, [2] "[Michael Waterman] and David Sankoff are responsible for transforming bioinformatics from a 'stamp collection' of ill-defined problems into a rigorous discipline with important biological applications."
Sankoff published his first paper in 1963 [12] while he was an undergraduate student in mathematics at McGill University. Starting with his doctoral research, he developed mathematical formulations of a number of pivotal concepts in socio- and historical linguistics, including glottochronology, [13] variable rules analysis (with Henrietta Cedergren), [14] the linguistic marketplace [15] and code switching. [16]
After completing his Ph.D. in mathematics, Sankoff began his academic career at the University of Montreal in 1969. In 1971, he became interested in molecular sequence comparison [11] and devised the first quadratic-time variant of the Needleman–Wunsch algorithm for pairwise sequence alignment. [17] In 1973, Sankoff and Robert Cedergren developed a joint estimation method for phylogeny and multiple sequence alignment of 5S ribosomal RNA, [18] laying the algorithmic foundations of comparative genomics. In 1975, Sankoff and Václav Chvátal studied the behavior of the longest common subsequence problem on random inputs; [19] the constants of proportionality arising in this study have come to be known as the Chvátal–Sankoff constants. In 1980, Robert Cedergren and David Sankoff created the first research group in bioinformatics at the University of Montreal. [20] Sankoff's work in bioinformatics addresses RNA secondary structure, genome rearrangements, sequence alignment, genome evolution and phylogenetics. [21]
Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the field also has foundations in applied mathematics, chemistry, and genetics. It differs from biological computing, a subfield of computer science and engineering which uses bioengineering to build computers.
A longest common subsequence (LCS) is the longest subsequence common to all sequences in a set of sequences. It differs from the longest common substring: unlike substrings, subsequences are not required to occupy consecutive positions within the original sequences. The problem of computing longest common subsequences is a classic computer science problem, the basis of data comparison programs such as the diff utility, and has applications in computational linguistics and bioinformatics. It is also widely used by revision control systems such as Git for reconciling multiple changes made to a revision-controlled collection of files.
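As an illustration of the definition above, the following is a minimal sketch of the textbook dynamic-programming solution for the two-sequence case only (the general problem over a set of sequences is harder). The function name `lcs` and the table layout are choices made for this example, not taken from any particular tool.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (two-sequence case)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


if __name__ == "__main__":
    # The matched characters need not be adjacent in either input.
    print(lcs("AGGTAB", "GXTXAYB"))  # prints GTAB
```

The table has (m + 1) × (n + 1) entries, so both time and memory grow with the product of the two input lengths.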
Variable rules analysis is a set of statistical analysis methods in linguistics that are commonly used in sociolinguistics and historical linguistics to describe patterns of variation between alternative forms in language use. It is also sometimes known as Varbrul analysis, after the name of a software package dedicated to carrying out the relevant statistical computations. The method goes back to a theoretical approach developed by the sociolinguist William Labov in the late 1960s and early 1970s, and its mathematical implementation was developed by Henrietta Cedergren and David Sankoff in 1974.
Nucleic acid structure prediction is a computational method to determine secondary and tertiary nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure can be predicted from the sequence, or by comparative modeling.
Paulien Hogeweg is a Dutch theoretical biologist and complex systems researcher studying biological systems as dynamic information processing systems at many interconnected levels. In 1970, together with Ben Hesper, she defined the term bioinformatics as "the study of informatic processes in biotic systems".
Michael Spencer Waterman is a Professor of Biology, Mathematics and Computer Science at the University of Southern California (USC), where he holds an Endowed Associates Chair in Biological Sciences, Mathematics and Computer Science. He previously held positions at Los Alamos National Laboratory and Idaho State University.
David Haussler is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project, and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.
Václav (Vašek) Chvátal is a Professor Emeritus in the Department of Computer Science and Software Engineering at Concordia University in Montreal, Quebec, Canada, and a visiting professor at Charles University in Prague. He has published extensively on topics in graph theory, combinatorics, and combinatorial optimization.
Temple Ferris Smith is an emeritus professor in biomedical engineering who helped to develop the Smith–Waterman algorithm with Michael Waterman in 1981. The Smith–Waterman algorithm serves as the basis for multi-sequence comparisons by identifying the segment with the maximum local sequence similarity (see sequence alignment). It is used for identifying similar DNA, RNA and protein segments. He was director of the BioMolecular Engineering Research Center at Boston University for twenty years.
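The idea of finding the segment pair with maximum local similarity can be sketched as follows. This is a simplified illustration, not the published formulation: it assumes a linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the function name `smith_waterman` are arbitrary choices for the example.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Return (best score, aligned segment of a, aligned segment of b).

    Simplified local alignment with a linear gap penalty (an assumption
    made for brevity in this sketch).
    """
    m, n = len(a), len(b)
    # score[i][j] = best score of a local alignment ending at a[i-1], b[j-1];
    # clamping at 0 lets an alignment start anywhere.
    score = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            s = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            score[i][j] = s
            if s > best:
                best, best_pos = s, (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    seg_a, seg_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            seg_a.append(a[i - 1])
            seg_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            seg_a.append(a[i - 1])
            seg_b.append("-")
            i -= 1
        else:
            seg_a.append("-")
            seg_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(seg_a)), "".join(reversed(seg_b))


if __name__ == "__main__":
    # The shared core "ACGT" is found despite unrelated flanking characters.
    print(smith_waterman("XXACGTXX", "YYACGTYY"))  # prints (8, 'ACGT', 'ACGT')
```

Unlike global alignment, cell scores never drop below zero, which is what allows the best-scoring segment to begin and end anywhere inside the two sequences.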
Eugene Wimberly "Gene" Myers, Jr. is an American computer scientist and bioinformatician, who is best known for contributing to the early development of the NCBI's BLAST tool for sequence analysis.
Anders Krogh is a bioinformatician at the University of Copenhagen, where he leads the university's bioinformatics center. He is known for his pioneering work on the use of hidden Markov models in bioinformatics, and is co-author of a widely used textbook in bioinformatics. In addition, he also co-authored one of the early textbooks on neural networks. His current research interests include promoter analysis, non-coding RNA, gene prediction and protein structure prediction.
Tandy Warnow is an American computer scientist and Grainger Distinguished Chair in Engineering at the University of Illinois at Urbana–Champaign. She is known for her work on the reconstruction of evolutionary trees, both in biology and in historical linguistics, and also for multiple sequence alignment methods.
Ron Shamir is an Israeli professor of computer science known for his work in graph theory and in computational biology. He holds the Raymond and Beverly Sackler Chair in Bioinformatics, and is the founder and former head of the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.
Gary Stormo is an American geneticist and currently Joseph Erlanger Professor in the Department of Genetics and the Center for Genome Sciences and Systems Biology at Washington University School of Medicine in St Louis. He is considered one of the pioneers of bioinformatics and genomics. His research combines experimental and computational approaches in order to identify and predict regulatory sequences in DNA and RNA, and their contributions to the regulatory networks that control gene expression.
In mathematics, the Chvátal–Sankoff constants are mathematical constants that describe the lengths of longest common subsequences of random strings. Although the existence of these constants has been proven, their exact values are unknown. They are named after Václav Chvátal and David Sankoff, who began investigating them in the mid-1970s.
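The definition can be written out as follows. The symbols $\lambda_{n,k}$ and $\gamma_k$ are notation assumed for this sketch: $\lambda_{n,k}$ denotes the length of a longest common subsequence of two independent, uniformly random strings of length $n$ over an alphabet of $k$ letters.

```latex
\gamma_k \;=\; \lim_{n \to \infty} \frac{\mathbb{E}\!\left[\lambda_{n,k}\right]}{n}

% Existence of the limit: concatenating common subsequences of the first m
% and the last n positions gives a common subsequence of the full strings, so
\mathbb{E}\!\left[\lambda_{m+n,k}\right] \;\ge\; \mathbb{E}\!\left[\lambda_{m,k}\right] + \mathbb{E}\!\left[\lambda_{n,k}\right],

% and by Fekete's lemma for superadditive sequences
\gamma_k \;=\; \sup_{n \ge 1} \frac{\mathbb{E}\!\left[\lambda_{n,k}\right]}{n}.
```

Since $\lambda_{n,k} \le n$, the ratio is bounded above by 1, so the supremum is finite; the argument shows that the limit exists without determining its value.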
Alexander George Bateman is a computational biologist and Head of Protein Sequence Resources at the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL) in Cambridge, UK. He has led the development of the Pfam biological database and introduced the Rfam database of RNA families. He has also been involved in the use of Wikipedia for community-based annotation of biological databases.
Daniel Mier Gusfield is an American computer scientist, Distinguished Professor of Computer Science at the University of California, Davis. Gusfield is known for his research in combinatorial optimization and computational biology.
Mona Singh is a Professor of Computer Science in the Lewis-Sigler Institute for Integrative Genomics at Princeton University.
Hanah Margalit is a Professor in the faculty of medicine at the Hebrew University of Jerusalem. Her research combines bioinformatics, computational biology and systems biology, specifically in the fields of gene regulation in bacteria and eukaryotes.
Christophe Dessimoz is a Swiss National Science Foundation (SNSF) Professor at the University of Lausanne, Associate Professor at University College London and a group leader at the Swiss Institute of Bioinformatics. He was awarded the Overton Prize in 2019 for his contributions to computational biology. In April 2022, he became joint executive director of the SIB Swiss Institute of Bioinformatics, along with Ron Appel.