David Botstein (born September 8, 1942) is an American biologist who is the chief scientific officer of Calico. He was the director of the Lewis-Sigler Institute for Integrative Genomics at Princeton University [4] [5] [6] [7] from 2003 to 2013, and he remains the Anthony B. Evnin Professor of Genomics there.
Botstein graduated from the Bronx High School of Science in 1959 and from Harvard University in 1963. He started his Ph.D. work under Maurice Sanford Fox at the Massachusetts Institute of Technology, then moved to the University of Michigan, where he received a Ph.D. in 1967 for work on P22 phage. [8]
Botstein taught at the Massachusetts Institute of Technology, where he became a professor of genetics. Botstein joined Genentech, Inc. in 1987 as vice president – science. In 1990, he became chairman of the Department of Genetics at Stanford University. Botstein was elected to the U.S. National Academy of Sciences in 1981 and to the Institute of Medicine in 1993.
Botstein is the director of the Integrated Science Program at Princeton University. [9]
In 1980, Botstein and his colleagues Ray White, Mark Skolnick, and Ronald W. Davis proposed a method [10] for constructing a genetic linkage map using restriction fragment length polymorphisms. The method was used in subsequent years to identify several human disease genes, including the gene for Huntington's disease and BRCA1. Variations of this method were used in the mapping efforts that predated and enabled the sequencing phase of the Human Genome Project.
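Linkage between a marker and a disease locus is conventionally quantified with a LOD score, which compares the likelihood of the observed meioses at a recombination fraction θ against free recombination (θ = 0.5). The following is a minimal sketch of that standard two-point calculation for phase-known data; it is not taken from the 1980 paper, and the function names `lod_score` and `mle_theta` are illustrative assumptions.

```python
from math import log10


def lod_score(recombinants: int, total: int, theta: float) -> float:
    """Two-point LOD score for phase-known meioses.

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    where r is the number of recombinant meioses out of n informative ones.
    """
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in the interval (0, 0.5]")
    if not 0 <= recombinants <= total:
        raise ValueError("recombinants must be between 0 and total")
    non_recombinants = total - recombinants
    return (
        recombinants * log10(theta)
        + non_recombinants * log10(1 - theta)
        - total * log10(0.5)
    )


def mle_theta(recombinants: int, total: int) -> float:
    """Maximum-likelihood recombination fraction, capped at 0.5."""
    if total <= 0:
        raise ValueError("total must be positive")
    return min(recombinants / total, 0.5)


if __name__ == "__main__":
    # Hypothetical data: 1 recombinant among 10 informative meioses.
    theta_hat = mle_theta(1, 10)
    print(theta_hat, lod_score(1, 10, theta_hat))
```

By convention, a LOD score of 3 or more is usually taken as evidence for linkage, so a small family such as the hypothetical one above would need to be combined with others before a locus could be placed on the map.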
In 1998, Botstein and his postdoctoral fellow Michael Eisen, together with graduate student Paul Spellman and colleague Patrick Brown, developed a statistical method and graphical interface that is widely used to interpret genomic data, including microarray data. [11] This approach was refined and applied in diverse settings, including the molecular classification of heterogeneous tumors using gene expression. These efforts included work on the discovery of tumor subtypes with Lou Staudt, Ash Alizadeh and Ronald Levy, which yielded a refined classification of diffuse large B cell lymphomas, and work on molecular portraits for a refined classification of breast cancers with Anne-Lise Børresen-Dale and Charles Perou. He subsequently worked on the creation of the influential Gene Ontology [12] with Michael Ashburner and Suzanna Lewis. He is one of the founding editors of the journal Molecular Biology of the Cell, along with Erkki Ruoslahti and Keith Yamamoto. [13]
In 2013, Botstein was named chief scientific officer of Google's anti-aging health startup Calico.
Botstein has won the Eli Lilly and Company Award in Microbiology (1978), the Genetics Society of America Medal (1988, with Ira Herskowitz), [1] the Allan Award of the American Society of Human Genetics (1989, with Ray White), the Gruber Prize in Genetics (2003), the Albany Medical Center Prize (2010, with Eric Lander and Francis Collins) and the Dan David Prize (2012). In 2013 he was awarded the $3 million Breakthrough Prize in Life Sciences for his work, and in 2020 the Thomas Hunt Morgan Medal of the Genetics Society of America. [14] In 2016, the Semantic Scholar AI program included Botstein on its list of the top ten most influential biomedical researchers. [15]
Botstein is an alumnus of Camp Rising Sun. He is the brother of the conductor Leon Botstein. Both of Botstein's parents were physicians.
Genomics is an interdisciplinary field of biology focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes as well as its hierarchical, three-dimensional structural configuration. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of all of an organism's genes, their interrelations and their influence on the organism. Genes may direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues, control chemical reactions, and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through the use of high-throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes. Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems, such as the brain.
The Gene Ontology (GO) is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to: 1) maintain and develop its controlled vocabulary of gene and gene product attributes; 2) annotate genes and gene products, and assimilate and disseminate annotation data; and 3) provide tools for easy access to all aspects of the data provided by the project, and to enable functional interpretation of experimental data using the GO, for example via enrichment analysis. GO is part of a larger classification effort, the Open Biomedical Ontologies, being one of the Initial Candidate Members of the OBO Foundry.
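Enrichment analysis asks whether a GO term appears in a gene list more often than chance would predict. One common formulation is a one-sided hypergeometric test, sketched below with the standard library only. The function name `go_enrichment_p` and the gene counts in the usage example are hypothetical, and real GO tools typically add multiple-testing correction and handle the ontology's hierarchy, which this sketch does not.

```python
from math import comb


def go_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric p-value for over-representation of a GO term.

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of genes in the list of interest
    hits:       genes in the list that carry the GO term

    Returns P(X >= hits) when `sample` genes are drawn without replacement.
    """
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    if not 0 <= hits <= min(annotated, sample):
        raise ValueError("hits cannot exceed annotated or sample")
    total_draws = comb(population, sample)
    # Sum the probability mass for every outcome at least as extreme as `hits`.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, min(annotated, sample) + 1)
    )
    return tail / total_draws


if __name__ == "__main__":
    # Hypothetical numbers: 20,000 background genes, 200 carry the term,
    # and 8 of a 100-gene list of interest carry it.
    print(go_enrichment_p(20_000, 200, 100, 8))
```

A small p-value suggests the term is over-represented in the list; because many GO terms are usually tested at once, the raw value should be adjusted before it is interpreted.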
Michael Ashburner was an English biologist and Professor in the Department of Genetics at the University of Cambridge. He was also a co-founder and former joint head of the European Bioinformatics Institute (EBI) of the European Molecular Biology Laboratory (EMBL) and a Fellow of Churchill College, Cambridge.
Michael Bruce Eisen is an American computational biologist and the former editor-in-chief of the journal eLife. He is a professor of genetics, genomics and development at the University of California, Berkeley. He is a leading advocate of open access scientific publishing and is a co-founder of the Public Library of Science (PLOS). In 2018, Eisen announced his candidacy for the U.S. Senate from California as an Independent, though he failed to qualify for the ballot.
FlyBase is an online bioinformatics database and the primary repository of genetic and molecular data for the insect family Drosophilidae. For the most extensively studied species and model organism, Drosophila melanogaster, a wide range of data are presented in different formats.
David Haussler is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project, and subsequently for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome.
Gerald Mayer Rubin is an American biologist, notable for pioneering the use of transposable P elements in genetics, and for leading the public project to sequence the Drosophila melanogaster genome. Related to his genomics work, Rubin's lab is notable for development of genetic and genomics tools and studies of signal transduction and gene regulation. Rubin also serves as a vice president of the Howard Hughes Medical Institute and executive director of the Janelia Research Campus.
SUPERFAMILY is a database and search platform of structural and functional annotation for all proteins and genomes. It classifies amino acid sequences into known structural domains, especially into SCOP superfamilies. Domains are functional, structural, and evolutionary units that form proteins. Domains of common ancestry are grouped into superfamilies. The domains and domain superfamilies are defined and described in SCOP. Superfamilies are groups of proteins that have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.
Richard Michael Durbin is a British computational biologist and Al-Kindi Professor of Genetics at the University of Cambridge. He also serves as an associate faculty member at the Wellcome Sanger Institute where he was previously a senior group leader.
Ira Herskowitz was an American phage and yeast geneticist who studied genetic regulatory circuits and mechanisms. He was particularly noted for his work on mating type switching and cellular differentiation, largely using Saccharomyces cerevisiae as a model organism.
Protein function prediction methods are techniques that bioinformatics researchers use to assign biological or biochemical roles to proteins. These proteins are usually ones that are poorly studied or predicted based on genomic sequence data. These predictions are often driven by data-intensive computational procedures. Information may come from nucleic acid sequence homology, gene expression profiles, protein domain structures, text mining of publications, phylogenetic profiles, phenotypic profiles, and protein-protein interaction. Protein function is a broad term: the roles of proteins range from catalysis of biochemical reactions to transport to signal transduction, and a single protein may play a role in multiple processes or cellular pathways.
Olga G. Troyanskaya is a Professor in the Department of Computer Science and the Lewis-Sigler Institute for Integrative Genomics at Princeton University and the Deputy Director for Genomics at the Flatiron Institute's Center for Computational Biology in NYC. She studies protein function and interactions in biological pathways by analyzing genomic data using computational tools.
Aviv Regev is a computational biologist and systems biologist and Executive Vice President and Head of Genentech Research and Early Development at Genentech/Roche. She is a core member at the Broad Institute of MIT and Harvard and a professor at the Department of Biology of the Massachusetts Institute of Technology. Regev is a pioneer of single cell genomics and of computational and systems biology of gene regulatory circuits. She founded and leads the Human Cell Atlas project, together with Sarah Teichmann.
Ronald Wayne "Ron" Davis is professor of biochemistry and genetics, and director of the Stanford Genome Technology Center at Stanford University. Davis is a researcher in biotechnology and molecular genetics, particularly active in human and yeast genomics and the development of new technologies in genomics, with over 30 biotechnology patents. In 2013, it was said of Davis that "A substantial number of the major genetic advances of the past 20 years can be traced back to Davis in some way." Since his son fell severely ill with myalgic encephalomyelitis/chronic fatigue syndrome, Davis has focused his research efforts on the illness.
Suzanna (Suzi) E. Lewis was a scientist and Principal investigator at the Berkeley Bioinformatics Open-source Project based at Lawrence Berkeley National Laboratory until her retirement in 2019. Lewis led the development of open standards and software for genome annotation and ontologies.
Fred Marshall Winston is the John Emory Andrus Professor of Genetics in the Harvard Medical School Genetics Department, where he has been a member of the faculty since 1983. Research in his laboratory has focused on mechanisms of transcription and the regulation of chromatin structure in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe. Dr. Winston served as the President of the Genetics Society of America in 2009 and has been elected to both the American Academy of Arts and Sciences (2009) and the National Academy of Sciences (2013).
Model organism databases (MODs) are biological databases, or knowledgebases, dedicated to the provision of in-depth biological data for intensively studied model organisms. MODs allow researchers to easily find background information on large sets of genes, plan experiments efficiently, combine their data with existing knowledge, and construct novel hypotheses. They allow users to analyse results and interpret datasets, and the data they generate are increasingly used to describe less well studied species. Where possible, MODs share common approaches to collect and represent biological information. For example, all MODs use the Gene Ontology (GO) to describe functions, processes and cellular locations of specific gene products. Projects also exist to enable software sharing for curation, visualization and querying between different MODs. Organismal diversity and varying user requirements, however, mean that MODs are often required to customize the capture, display, and provision of data.
Christophe Dessimoz is a Swiss National Science Foundation (SNSF) Professor at the University of Lausanne, Associate Professor at University College London and a group leader at the Swiss Institute of Bioinformatics. He was awarded the Overton Prize in 2019 for his contributions to computational biology. Starting in April 2022, he will be joint executive director of the SIB Swiss Institute of Bioinformatics, along with Ron Appel.
Judith Anne Blake is a computational biologist at the Jackson Laboratory and Professor of Mammalian Genetics.
Carolyn Joy Lawrence-Dill is an American plant biologist and academic administrator. She develops computational systems and tools to help plant science researchers use plant genetics and genomics data for basic biology applications that advance plant breeding.