Barbara Engelhardt | |
---|---|
Born | Barbara Elizabeth Engelhardt |
Alma mater | Stanford University (BS, MS); University of California, Berkeley (PhD) |
Awards | Overton Prize (2021) |
Scientific career | |
Fields | Statistical genetics, Bayesian statistics, machine learning, statistical inference, genomics [1] |
Institutions | Princeton University, University of Chicago, Jet Propulsion Laboratory |
Thesis | Predicting protein molecular function (2007) |
Doctoral advisor | Michael I. Jordan [2] |
Website | www |
Barbara Elizabeth Engelhardt is an American computer scientist and specialist in bioinformatics. A professor at Stanford University, she has focused her work on latent variable models, exploratory data analysis for genomic data, and quantitative trait loci (QTLs). [1] In 2021, she was awarded the Overton Prize by the International Society for Computational Biology.
Engelhardt received a Bachelor of Science in Symbolic Systems and a Master of Science in Computer Science from Stanford University. She received a PhD in 2007 from the University of California, Berkeley, where she was supervised by Michael I. Jordan. [3]
Engelhardt worked as a postdoctoral researcher with Matthew Stephens in the Department of Human Genetics at the University of Chicago from 2008 to 2011. [4] She joined Duke University in 2011 as an assistant professor in the Biostatistics and Bioinformatics Department. She moved to Princeton University as an assistant professor in 2014 and was promoted to associate professor with tenure in 2017. [5] In August 2022, she moved to California, where she holds the position of Professor at Stanford University and at the Gladstone Institute of Data Science and Biotechnology. [6] [7]
After graduating from Stanford, Engelhardt spent two years in the Artificial Intelligence group at the Jet Propulsion Laboratory, working on planning and scheduling for autonomous spacecraft. [8] As a graduate student at Berkeley, she developed statistical models for protein function annotation and statistical frameworks for reasoning about ontologies. [9] [10] During her postdoctoral research, she developed sparse factor analysis models for population structure [11] and Bayesian models for association testing. [12]
As a faculty member, Engelhardt has focused most of her research on developing latent variable models and exploratory data analysis for genomic data, [13] as well as statistical models for association testing in expression QTLs. [14] As a member of the Genotype-Tissue Expression (GTEx) Consortium, her group was responsible for trans-eQTL discovery and analysis in the GTEx v6 [15] and v8 data. [16]
Since receiving tenure, Engelhardt has extended her research on latent variable models to single-cell sequencing, with a particular focus on spatial transcriptomics. [17] She has also worked on Bayesian experimental design using contextual multi-armed bandits, and has adapted this work to the novel species problem in order to inform single-cell data collection for atlas building. [18] Her work has also expanded into machine learning for electronic healthcare records. [19] [20]
Engelhardt's work has been featured in Quanta Magazine. In 2017, she gave a TEDx talk entitled "Not What but Why: Machine Learning for Understanding Genomics." [21]
Engelhardt's research has been funded by the National Institutes of Health through two R01 grants and a number of other mechanisms. She has been recognized with several awards, including an Alfred P. Sloan Fellowship in Computational Biology, [22] a National Science Foundation CAREER Award, [23] two Chan Zuckerberg Initiative grants for the Human Cell Atlas, [24] and a Fast Grant for her work on COVID-19. [25] In 2021, she was awarded the Overton Prize by the International Society for Computational Biology. [26]
Engelhardt's postdoctoral work was partly funded through an NIH NHGRI K99 grant, [27] and her PhD was partly funded through an NSF Graduate Research Fellowship and the Google Anita Borg Scholarship in 2005. [28] She received SMBE's Walter M. Fitch Prize in 2004. [29]
Engelhardt served on the Board of Directors (2014–2017) and the Senior Advisory Council (2017–present) for Women in Machine Learning. [30] She served as the Diversity & Inclusion Co-chair at the International Conference on Machine Learning (ICML) from 2018 to 2022. [31] In 2019, she was a member of the Working Group on Artificial Intelligence of the NIH Advisory Committee to the Director. [32]