David Haussler

Born: David Haussler, October 1953 (age 70) [1]
Nationality: American
Institutions: University of California, Santa Cruz; University of Colorado
Thesis: Insertion and iterated insertion as operations on formal language (1982)
Doctoral advisor: Andrzej Ehrenfeucht [8]
Other notable students: Anders Krogh [10]
Website: stemcellgenomics.ucsc.edu

David Haussler (born 1953) is an American bioinformatician known for leading the team that assembled the first human genome sequence in the race to complete the Human Genome Project and, subsequently, for comparative genome analysis that deepens understanding of the molecular function and evolution of the genome. [12] [13] [14]

Haussler was elected a member of the National Academy of Engineering in 2018 for developments in computational learning theory and bioinformatics, including first assembly of the human genome, its analysis, and data sharing.

He is a distinguished professor of biomolecular engineering and founding scientific director of the UC Santa Cruz Genomics Institute at the University of California, Santa Cruz, director of the California Institute for Quantitative Biosciences (QB3) on the UC Santa Cruz campus, and a consulting professor at the Stanford University School of Medicine and the UC San Francisco Biopharmaceutical Sciences Department. [10] [15]

Education

Haussler studied art briefly at the Academy of Art in San Francisco in 1971 and then psychotherapy at Immaculate Heart College in Hollywood until 1973, when he transferred to Connecticut College, finishing in 1975 with a major in mathematics and a minor in physics. He earned an MS in applied mathematics from California Polytechnic State University in San Luis Obispo in 1979. Haussler received his PhD in computer science from the University of Colorado at Boulder in 1982.

Career and research

During summers while he was in college, Haussler worked for his brother, Mark Haussler, a biochemist at the University of Arizona studying vitamin D metabolism. They were the first to measure the levels of calcitriol, the hormonal form of vitamin D, in the human bloodstream. [16] Between 1975 and 1979 he traveled and worked a variety of jobs, including a job at a petroleum refinery in Burghausen, Germany, tomato farming on Crete, and farming kiwifruit, almonds, and walnuts in Templeton, California. While in Templeton he worked on his master's degree at nearby California Polytechnic State University. [10]

Haussler was an assistant professor in Mathematics and Computer Science at the University of Denver in Colorado from 1982 to 1986. Since 1986 he has been at UC Santa Cruz, initially in the Computer Science Department and, from 2004, as an inaugural member of the Biomolecular Engineering Department. [10]

While pursuing his doctorate in theoretical computer science at the University of Colorado, Haussler became interested in the mathematical analysis of DNA along with fellow students Gene Myers, Gary Stormo, and Manfred Warmuth. Haussler's current research stems from his early work in machine learning. In 1988 he organized the first Workshop on Computational Learning Theory with Leonard Pitt. With Blumer, Ehrenfeucht, and Warmuth he introduced the Vapnik-Chervonenkis framework to computational learning theory, solving some problems posed by Leslie Valiant. In the 1990s he obtained various results in information theory, empirical processes, artificial intelligence, [17] neural networks, [18] statistical decision theory, and pattern recognition. [19] At present, his lab grows human cerebral organoids for neurodevelopmental disease research [20] and to explore human neural circuit formation and learning.

Haussler's research combines mathematics, computer science, and molecular biology. [7] He develops new statistical and algorithmic methods to explore the molecular function and evolution of the human genome, integrating cross-species comparative and high-throughput genomics data to study gene structure, function, and regulation. [21] [22] [23] [24] He is credited with pioneering the use of hidden Markov models (HMMs), stochastic context-free grammars, and the discriminative kernel method for analyzing DNA, RNA, and protein sequences. He was the first to apply the latter methods to the genome-wide search for gene expression biomarkers in cancer, now a major effort of his laboratory.

As a collaborator on the international Human Genome Project, his team, featuring programming work by graduate student Jim Kent, computationally assembled the first draft of the human genome [25] and posted it on the Internet on July 7, 2000. [26] Following this, his team developed the UCSC Genome Browser, [27] [28] [29] a web-based tool that is used extensively in biomedical research and serves as the platform for several large-scale genomics projects. These include the National Human Genome Research Institute (NHGRI)'s ENCODE project to use omics methods to explore the function of every base in the human genome (for which UCSC served as the Data Coordination Center), NIH's Mammalian Gene Collection, NHGRI's 1000 Genomes Project to explore human genetic variation, the Human Pangenome Reference Consortium to replace the single reference human genome with a collection of genomes from around the world, and the National Cancer Institute (NCI) Cancer Genome Atlas project to explore the genomic changes in cancer.

His group's informatics work on cancer genomics, including the UCSC Cancer Genomics Browser, [30] provides a complete analysis pipeline from raw DNA reads through the detection and interpretation of mutations and altered gene expression in tumor samples. His group collaborates with researchers at medical centers nationally, including members of the Stand Up To Cancer "Dream Teams" and the Cancer Genome Atlas, to discover molecular causes of cancer and develop a new personalized, genomics-based approach to cancer treatment. [31]

Haussler is one of eight organizing committee members of the Global Alliance for Genomic and Clinical Data Sharing, along with David Altshuler from the Broad Institute of Harvard and MIT; Peter Goodhand and Thomas Hudson from the Ontario Institute for Cancer Research; Brad Margus from the A-T Children's Project; Elizabeth Nabel from Brigham and Women's Hospital; Charles Sawyers from Memorial Sloan-Kettering; and Michael Stratton from Wellcome Trust Sanger Institute. [32]

He co-founded the Genome 10K Project (now superseded by the Vertebrate Genomes Project) to assemble a genomic zoo—a collection of DNA sequences representing the genomes of 10,000 vertebrate species—to capture genetic diversity as a resource for the life sciences and for worldwide conservation efforts. [33] [34]

Awards and honors

Haussler is a member of the National Academy of Sciences, [2] the National Academy of Engineering, [35] and the American Academy of Arts and Sciences, [36] and a Fellow of the Association for the Advancement of Artificial Intelligence (AAAI). [37] His awards include the 2011 Weldon Memorial Prize from the University of Oxford, the 2009 American Society of Human Genetics (ASHG) Curt Stern Award in Human Genetics, the 2008 ISCB Senior Scientist Award from the International Society for Computational Biology (which also elected him an ISCB Fellow in 2009), [5] the 2005 Dickson Prize for Science from Carnegie Mellon University, and the 2003 Association for Computing Machinery (ACM)/Association for the Advancement of Artificial Intelligence (AAAI) Allen Newell Award in Artificial Intelligence. [38]

With Cyrus Chothia and Michael Waterman, Haussler was awarded the 2015 Dan David Prize for his contributions to the field of bioinformatics. [39]

References

  1. Jones, Pevzner An introduction to bioinformatics algorithms, 2004, p. 403.
  2. Anon (2006). "Member directory: David Haussler at the University of California, Santa Cruz". nasonline.org. National Academy of Sciences.
  3. Anon (2010). "2009 ASHG Awards and Addresses". The American Journal of Human Genetics. 86 (3): 309–310. doi:10.1016/j.ajhg.2010.02.013. PMC 3591852.
  4. Sansom, C.; Morrison Mckay, B. J. (2008). Bourne, Philip E. (ed.). "ISCB Honors David Haussler and Aviv Regev". PLOS Computational Biology. 4 (7): e1000101. Bibcode:2008PLSCB...4E0101S. doi:10.1371/journal.pcbi.1000101. PMC 2536508. PMID 18795145.
  5. Anon (2017). "ISCB Fellows". iscb.org. International Society for Computational Biology. Archived from the original on 2017-03-20.
  6. "Wikipedia co-founder Jimmy Wales among 2015 Dan David Prize winners". Retrieved 13 February 2015.
  7. David Haussler publications indexed by Google Scholar
  8. David Haussler at the Mathematics Genealogy Project
  9. Freund, Yoav (1993). Data filtering and distribution modeling algorithms for machine learning (PhD thesis). University of California, Santa Cruz. OCLC 679396091.
  10. Gitschier, Jane (2013). "Life, the Universe, and Everything: An Interview with David Haussler". PLOS Genetics. 9 (1): e1003282. doi:10.1371/journal.pgen.1003282. PMC 3561096. PMID 23382705.
  11. Pollard KS, Salama SR, King B, Kern AD, Dreszer T, Katzman S, Siepel A, Pedersen JS, Bejerano G, Baertsch R, Rosenbloom KR, Kent J, Haussler D (2006). "Forces shaping the fastest evolving regions in the human genome". PLOS Genetics. 2 (10): e168. doi:10.1371/journal.pgen.0020168. PMC 1599772. PMID 17040131.
  12. Haussler, D. (2011). "David Haussler". Nature Biotechnology. 29 (3): 243. doi:10.1038/nbt.1808. PMID 21390032. S2CID 8173735.
  13. Downey, P. (2008). "Profile of David Haussler". Proceedings of the National Academy of Sciences. 105 (38): 14251–14253. Bibcode:2008PNAS..10514251D. doi:10.1073/pnas.0808284105. PMC 2567157. PMID 18799747.
  14. David Haussler publications from Europe PubMed Central
  15. Don't throw it out: 'Junk DNA' essential in evolution, radio interview by Joe Palca, NPR, Aug 19, 2011.
  16. Brumbaugh, P. F.; Haussler, D. H.; Bressler, R.; Haussler, M. R. (1974). "Radioreceptor assay for 1 alpha,25-dihydroxyvitamin D3". Science. 183 (4129): 1089–1091. Bibcode:1974Sci...183.1089B. doi:10.1126/science.183.4129.1089. PMID 4812038. S2CID 83858470.
  17. "MIT notes" (PDF).
  18. Baum, Eric B.; Haussler, David (1988-01-01). "What size net gives valid generalization?". Proceedings of the 1st International Conference on Neural Information Processing Systems. NIPS'88. Cambridge, MA, USA: MIT Press: 81–90.
  19. "Probably..." (PDF). Archived (PDF) from the original on 2015-10-23.
  20. Fiddes, I. T.; Lodewijk, G. A.; Mooring, M.; Bosworth, C. M.; Ewing, A. D.; Mantalas, G. M.; Novak, A. M.; van den Bout, A.; Bishara, A.; Rosenkrantz, J. L.; Lorig-Roach, R.; Field, A. R.; Haeussler, M.; Russo, L.; Bhaduri, A.; Nowakowski, T. J.; Pollen, A. A.; Dougherty, M. L.; Nuttle, X.; Addor, M. C.; Zwolinski, S.; Katzman, S.; Kriegstein, E. E.; Salama, S. R.; Jacobs, F. M. J.; Haussler, D. (2018). "Human-Specific NOTCH2NL Genes Affect Notch Signaling and Cortical Neurogenesis". Cell. 173 (6): 1356–1369.e22. doi:10.1016/j.cell.2018.03.051. PMC 5986104. PMID 29856954.
  21. Pearson, Helen (2004). "'Junk' DNA reveals vital role". Nature: news040503–9. doi:10.1038/news040503-9.
  22. Biello, David. "Scientists Identify Gene Difference Between Humans and Chimps". Scientific American. 17 August 2006. Retrieved 2012-02-29.
  23. "Vertebrate Evolution Occurred in Genetically Distinct Epochs". HHMI News. 19 August 2011. Retrieved 2012-02-29.
  24. Zimmer, Carl. "When Bats and Humans Were One and the Same". The New York Times. 7 December 2004. Retrieved 2012-02-29.
  25. Wade, Nicholas (2001-02-13). "READING THE BOOK OF LIFE; Grad Student Becomes Gene Effort's Unlikely Hero (Published 2001)". The New York Times. ISSN 0362-4331. Retrieved 2021-02-15.
  26. "UC Santa Cruz Puts Human Genome Online / Programming wizard does job in 4 weeks". 7 July 2000.
  27. UCSC Genome Browser
  28. Kent, W. James; Sugnet, Charles W.; Furey, Terrence S.; Roskin, Krishna M.; Pringle, Tom H.; Zahler, Alan M.; Haussler, David (June 2002). "The Human Genome Browser at UCSC". Genome Research. 12 (6): 996–1006. doi:10.1101/gr.229102. ISSN 1088-9051. PMC 186604. PMID 12045153.
  29. Kent, W. James; Sugnet, Charles W.; Furey, Terrence S.; Roskin, Krishna M.; Pringle, Tom H.; Zahler, Alan M.; Haussler, David (2002-06-01). "The Human Genome Browser at UCSC". Genome Research. 12 (6): 996–1006. doi:10.1101/gr.229102. ISSN 1088-9051. PMC 186604. PMID 12045153.
  30. Zhu, J.; Sanborn, J. Z.; Benz, S.; Szeto, C.; Hsu, F.; Kuhn, R. M.; Karolchik, D.; Archie, J.; Lenburg, M. E.; Esserman, L. J.; Kent, W. J.; Haussler, D.; Wang, T. (2009). "The UCSC Cancer Genomics Browser". Nature Methods. 6 (4): 239–240. doi:10.1038/nmeth0409-239. PMC 5027375. PMID 19333237.
  31. Patterson, David. "Computer Scientists May Have What It Takes to Help Cure Cancer". The New York Times. 5 December 2011. Retrieved 2012-02-29.
  32. "Home - Wellcome Sanger Institute". www.sanger.ac.uk. Retrieved 2021-02-15.
  33. Pennisi, Elizabeth. No Genome Left Behind. Science News. November 2009. Retrieved 2012-02-29.
  34. "Building the Genome Zoo: The Genome 10K Project". The 7th Avenue Project. 22 November 2009. Retrieved 2012-02-29.
  35. "National Academy of Engineering".
  36. "American Academy of Arts & Sciences". 13 September 2023.
  37. "Elected AAAI Fellows". www.aaai.org. Retrieved 2021-02-15.
  38. "All Award Winners". awards.acm.org. Retrieved 2021-02-15.
  39. "Dan David Prize".