Gary Stormo | |
---|---|
Born | Gary Dean Stormo; 1950 (age 72–73); South Dakota, USA [1] |
Alma mater | California Institute of Technology; University of Colorado Boulder |
Scientific career | |
Fields | Bioinformatics; Genetics; Molecular biology |
Institutions | University of Colorado Boulder; Washington University in St. Louis |
Thesis | Computer-aided characterization of translational initiation sites in E. coli (1981) |
Gary Stormo (born 1950) is an American geneticist and currently Joseph Erlanger Professor in the Department of Genetics and the Center for Genome Sciences and Systems Biology at Washington University School of Medicine in St. Louis. [2] He is considered one of the pioneers of bioinformatics and genomics. [3] [4] [5] His research combines experimental and computational approaches to identify and predict regulatory sequences in DNA and RNA, and their contributions to the regulatory networks that control gene expression. [2]
Stormo initially majored in physics as an undergraduate at the California Institute of Technology, but switched to biology in his junior year. [1] He received his PhD in molecular biology in 1981 from the University of Colorado at Boulder. [2]
Following his PhD, Stormo stayed at the University of Colorado as a faculty member in the Department of Molecular, Cellular and Developmental Biology, becoming a professor before moving to Washington University in St. Louis in 1999. [2] [3]
Stormo's research combines experimental and computational approaches to understand regulation of gene expression. His experimental work focuses on protein–DNA interactions and their role in gene regulation. His computational work involves analysis of these interactions and developing pattern recognition algorithms to discover regulatory sites in DNA and RNA.
In 1982, Stormo and his colleagues introduced the Position Weight Matrix (PWM), a now commonly used representation of motifs (patterns) in biological sequences. [6] Consensus sequences had previously been used to represent patterns in biological sequences, but had difficulties in the prediction of new occurrences of these patterns. [7] The first use of PWMs was in the discovery of RNA sites that function as translation initiation sites. The advantages of PWMs over consensus sequences have made PWMs a popular method for representing patterns in biological sequences and an essential component in modern algorithms for motif discovery. [8] [9]
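The contrast between a consensus sequence and a PWM can be illustrated with a minimal sketch. It assumes a log-odds construction from aligned example sites with a uniform background and a pseudocount, which is one common way to build a PWM and not necessarily the procedure used in the 1982 paper. The function names (`build_pwm`, `score`, `best_hit`) and the example sites are hypothetical.

```python
import math

BASES = "ACGT"

def build_pwm(sites, background=0.25, pseudocount=1.0):
    """Build a log2-odds weight matrix from equal-length aligned sites.

    Returns one {base: weight} dict per motif position. The uniform
    background and the pseudocount are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        weights = {}
        for base in BASES:
            freq = (column.count(base) + pseudocount) / total
            weights[base] = math.log2(freq / background)
        pwm.append(weights)
    return pwm

def score(pwm, window):
    """Sum the per-position weights for a window as long as the PWM."""
    return sum(col[base] for col, base in zip(pwm, window))

def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in sequence."""
    width = len(pwm)
    hits = [(score(pwm, sequence[i:i + width]), i)
            for i in range(len(sequence) - width + 1)]
    return max(hits)

# Hypothetical aligned sites; every candidate window gets a graded score
# instead of a yes/no match against a single consensus string.
sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
pwm = build_pwm(sites)
print(best_hit(pwm, "GGCTATAATGC"))
```

Because each position contributes its own weight, a site that deviates from the consensus at a tolerant position still scores well, which is the property that makes PWMs more useful than consensus sequences for predicting new sites.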
He has published over 150 scientific papers. [3]
Stormo served on the board of directors of the International Society for Computational Biology (ISCB) from 2000 to 2004 and was honoured as an ISCB Fellow in 2010. [10] [11] He was elected as a fellow of the American Medical Informatics Association in 2001. [12]
Stormo was Executive Editor of the journal Bioinformatics from 1994 to 1999. [12] In 2014, he was appointed one of the first Honorary Editors of Bioinformatics. [13] Stormo has also served as Deputy Editor-in-Chief of the journal PLOS Computational Biology. [14] He is currently co-editor of the journal Current Protocols in Bioinformatics. [15]
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.
Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. An intersection of computer science, biology, and big data, the field also has foundations in applied mathematics, chemistry, and genetics. It differs from biological computing, a subfield of computer engineering which uses bioengineering to build computers.
In biology, a sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and usually assumed to be related to biological function of the macromolecule. For example, an N-glycosylation site motif can be defined as Asn, followed by anything but Pro, followed by either Ser or Thr, followed by anything but Pro residue.
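The N-glycosylation example above maps directly onto a regular expression. The sketch below assumes the protein is given in one-letter amino-acid codes; the function name and the test sequences are hypothetical.

```python
import re

# Asn, anything but Pro, Ser or Thr, anything but Pro.
# The lookahead lets overlapping occurrences be reported as well.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")

def find_n_glycosylation_sites(protein):
    """Return (0-based start, matched 4-mer) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in N_GLYC.finditer(protein.upper())]

print(find_n_glycosylation_sites("MNGSAANPSANKTPQ"))  # [(1, 'NGSA')]
```

In the example sequence, the candidates starting at positions 6 (`NPSA`) and 10 (`NKTP`) are rejected because a proline occupies one of the excluded positions.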
Structural alignment attempts to establish homology between two or more polymer structures based on their shape and three-dimensional conformation. This process is usually applied to protein tertiary structures but can also be used for large RNA molecules. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for the comparison of proteins with low sequence similarity, where evolutionary relationships between proteins cannot be easily detected by standard sequence alignment techniques. Structural alignment can therefore be used to imply evolutionary relationships between proteins that share very little common sequence. However, caution is needed when using the results as evidence for shared evolutionary ancestry because of the possible confounding effects of convergent evolution, by which multiple unrelated amino acid sequences converge on a common tertiary structure.
Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, including both DNA and RNA sequence as well as other "post-genomic" data. Because it combines these with computational and statistical approaches to understanding gene function and with statistical association analysis, the field is also often referred to as computational and statistical genetics/genomics. As such, computational genomics may be regarded as a subset of bioinformatics and computational biology, but with a focus on using whole genomes to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means to biological discovery.
Nucleic acid structure prediction is a computational method to determine secondary and tertiary nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure can be predicted from the sequence, or by comparative modeling.
David Haussler is an American bioinformatician known for his work leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.
Webb Colby Miller is an American bioinformatician who is professor in the Department of Biology and the Department of Computer Science and Engineering at The Pennsylvania State University.
Temple Ferris Smith is an emeritus professor of biomedical engineering who helped to develop the Smith-Waterman algorithm with Michael Waterman in 1981. The Smith-Waterman algorithm serves as the basis for many sequence comparisons, identifying the segment with the maximum local sequence similarity (see sequence alignment). This algorithm is used for identifying similar DNA, RNA and protein segments. He was director of the BioMolecular Engineering Research Center at Boston University for twenty years.
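The idea of finding the segment with maximum local similarity can be sketched as a small dynamic program. This is a minimal illustration of Smith-Waterman local alignment for two sequences, assuming a simple match/mismatch score and a linear gap penalty; the scoring values are arbitrary choices for the example, not parameters from the original publication.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best score, aligned segment of a, aligned segment of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] is the best score of a local alignment ending at a[i-1], b[j-1];
    # clamping at 0 lets an alignment start anywhere.
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if H[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

print(smith_waterman("TTTGATTACATTT", "CCGATTACACC"))  # (14, 'GATTACA', 'GATTACA')
```

The flanking regions of the two example sequences share no similarity, so the alignment covers only the common `GATTACA` segment rather than the full length of either input.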
Anders Krogh is a bioinformatician at the University of Copenhagen, where he leads the university's bioinformatics center. He is known for his pioneering work on the use of hidden Markov models in bioinformatics, and is co-author of a widely used textbook in bioinformatics. In addition, he also co-authored one of the early textbooks on neural networks. His current research interests include promoter analysis, non-coding RNA, gene prediction and protein structure prediction.
DNA binding sites are a type of binding site found in DNA where other molecules may bind. DNA binding sites are distinct from other binding sites in that (1) they are part of a DNA sequence and (2) they are bound by DNA-binding proteins. DNA binding sites are often associated with specialized proteins known as transcription factors, and are thus linked to transcriptional regulation. The sum of DNA binding sites of a specific transcription factor is referred to as its cistrome. DNA binding sites also encompass the targets of other proteins, like restriction enzymes, site-specific recombinases and methyltransferases.
Chris Sander is a computational biologist based at the Dana–Farber Cancer Institute and Harvard Medical School. He was previously chair of the Computational Biology Programme at the Memorial Sloan–Kettering Cancer Center in New York City; in 2015, he moved his lab to the Dana–Farber Cancer Institute and the Cell Biology Department at Harvard Medical School.
David Sankoff is a Canadian mathematician, bioinformatician, computer scientist and linguist. He holds the Canada Research Chair in Mathematical Genomics in the Mathematics and Statistics Department at the University of Ottawa, and is cross-appointed to the Biology Department and the School of Information Technology and Engineering. He was founding editor of the scientific journal Language Variation and Change (Cambridge) and serves on the editorial boards of a number of bioinformatics, computational biology and linguistics journals. Sankoff is best known for his pioneering contributions in computational linguistics and computational genomics. He is considered to be one of the founders of bioinformatics. In particular, he had a key role in introducing dynamic programming for sequence alignment and other problems in computational biology. In Pavel Pevzner's words, "[Michael Waterman] and David Sankoff are responsible for transforming bioinformatics from a ‘stamp collection’ of ill-defined problems into a rigorous discipline with important biological applications."
In bioinformatics, alignment-free sequence analysis approaches to molecular sequence and structure data provide alternatives to alignment-based approaches.
Ron Shamir is an Israeli professor of computer science known for his work in graph theory and in computational biology. He holds the Raymond and Beverly Sackler Chair in Bioinformatics, and is the founder and former head of the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.
Machine learning in bioinformatics is the application of machine learning algorithms to bioinformatics, including genomics, proteomics, microarrays, systems biology, evolution, and text mining.
Hanah Margalit is a Professor in the faculty of medicine at the Hebrew University of Jerusalem. Her research combines bioinformatics, computational biology and systems biology, specifically in the fields of gene regulation in bacteria and eukaryotes.
Non-coding RNAs have been discovered using both experimental and bioinformatic approaches. Bioinformatic approaches can be divided into three main categories. The first involves homology search, although these techniques are by definition unable to find new classes of ncRNAs. The second category includes algorithms designed to discover specific types of ncRNAs that have similar properties. Finally, some discovery methods are based on very general properties of RNA, and are thus able to discover entirely new kinds of ncRNAs.