Barbara Engelhardt

Born: Barbara Elizabeth Engelhardt
Alma mater: Stanford University (BS, MS); University of California, Berkeley (PhD)
Awards: Overton Prize (2021)
Scientific career
Fields: Statistical genetics, Bayesian statistics, Machine learning, Statistical inference, Genomics [1]
Institutions: Princeton University, University of Chicago, Jet Propulsion Laboratory
Thesis: Predicting protein molecular function (2007)
Doctoral advisor: Michael I. Jordan [2]
Website: www.cs.princeton.edu/~bee/

Barbara Elizabeth Engelhardt is an American computer scientist and specialist in bioinformatics. She is a professor at Stanford University, where her work has focused on latent variable models, exploratory data analysis for genomic data, and quantitative trait loci (QTLs). [1] In 2021, she was awarded the Overton Prize by the International Society for Computational Biology.

Education

Engelhardt received a Bachelor of Science in Symbolic Systems and a Master of Science in Computer Science from Stanford University. She received a PhD in 2007 from the University of California, Berkeley, supervised by Michael I. Jordan. [3]

Career and research

Engelhardt worked as a postdoctoral researcher at the University of Chicago in the Department of Human Genetics with Matthew Stephens from 2008 to 2011. [4] She joined Duke University in 2011 as an assistant professor in the Biostatistics and Bioinformatics Department. She joined Princeton University as an assistant professor in 2014 and was promoted to associate professor with tenure in 2017. [5] In August 2022, she moved to California, where she holds the position of Professor at Stanford University and at the Gladstone Institute of Data Science and Biotechnology. [6] [7]

After graduating from Stanford, Engelhardt worked at the Jet Propulsion Laboratory in the Artificial Intelligence group for two years, working on planning and scheduling for autonomous spacecraft. [8] As a graduate student at Berkeley, she developed statistical models for protein function annotation and statistical frameworks for reasoning about ontologies. [9] [10] During her postdoctoral research, she developed sparse factor analysis models for population structure [11] and Bayesian models for association testing. [12]

In her faculty positions, the bulk of Engelhardt's research has focused on developing latent variable models and exploratory data analysis for genomic data, [13] as well as statistical models for association testing in expression QTLs (eQTLs). [14] As a member of the Genotype-Tissue Expression (GTEx) Consortium, her group was responsible for trans-eQTL discovery and analysis in the GTEx v6 [15] and v8 data. [16]

Since receiving tenure, Engelhardt's research on latent variable models has expanded to include single-cell sequencing, with a particular focus on spatial transcriptomics. [17] She has also worked on Bayesian experimental design using contextual multi-armed bandits, and has adapted this work to the novel species problem in order to inform single-cell data collection for atlas building. [18] Her work has also expanded into machine learning for electronic healthcare records. [19] [20]

Engelhardt's work has been featured in Quanta Magazine. In 2017, she gave a TEDx talk entitled "Not What but Why: Machine Learning for Understanding Genomics". [21]

Honors and awards

Engelhardt's research has been funded by the National Institutes of Health through two R01 grants and a number of other mechanisms. She has been recognized with several awards, including an Alfred P. Sloan Fellowship in Computational Biology, [22] a National Science Foundation CAREER Award, [23] two Chan Zuckerberg Initiative grants for the Human Cell Atlas, [24] and a Fast Grant for her work on COVID-19. [25] In 2021, she was awarded the Overton Prize by the International Society for Computational Biology. [26]

Engelhardt's postdoctoral work was partly funded through an NIH NHGRI K99 grant, [27] and her PhD was partly funded through an NSF Graduate Research Fellowship and the Google Anita Borg Scholarship in 2005. [28] She received SMBE's Walter M. Fitch Prize in 2004. [29]

Service and leadership

Engelhardt served on the Board of Directors (2014–2017) and the Senior Advisory Council (2017–present) for Women in Machine Learning. [30] She served as Diversity & Inclusion Co-chair of the International Conference on Machine Learning (ICML) from 2018 to 2022. [31] In 2019, she was a member of the NIH Advisory Committee to the Director's Working Group on Artificial Intelligence. [32]

References

  1. Barbara Engelhardt publications indexed by Google Scholar
  2. Barbara Engelhardt at the Mathematics Genealogy Project
  3. "Michael I. Jordan's Home Page". people.eecs.berkeley.edu. Retrieved 2021-01-11.
  4. "Stephens Lab". stephenslab.uchicago.edu. Retrieved 2021-01-11.
  5. "Eleven Women Faculty Members Who Have Been Assigned New Duties". Women In Academia Report. 2018-03-08. Retrieved 2021-01-11.
  6. "Barbara Elizabeth Engelhardt's Profile | Stanford Profiles". profiles.stanford.edu. Retrieved 2022-08-27.
  7. "barbara.engelhardt@gladstone.ucsf.edu". gladstone.org. Retrieved 2022-08-27.
  8. "3cs | AIG". sensorwebs.jpl.nasa.gov. Retrieved 2021-01-11.
  9. Engelhardt, Barbara E.; Jordan, Michael I.; Muratore, Kathryn E.; Brenner, Steven E. (2005-10-07). "Protein Molecular Function Prediction by Bayesian Phylogenomics". PLOS Computational Biology. 1 (5): e45. Bibcode:2005PLSCB...1...45E. doi:10.1371/journal.pcbi.0010045. ISSN 1553-7358. PMC 1246806. PMID 16217548.
  10. Engelhardt, Barbara E.; Jordan, Michael I.; Srouji, John R.; Brenner, Steven E. (2011-11-01). "Genome-scale phylogenetic function annotation of large and diverse protein families". Genome Research. 21 (11): 1969–1980. doi:10.1101/gr.104687.109. ISSN 1088-9051. PMC 3205580. PMID 21784873.
  11. Engelhardt, Barbara E.; Stephens, Matthew (2010-09-16). "Analysis of Population Structure: A Unifying Framework and Novel Methods Based on Sparse Factor Analysis". PLOS Genetics. 6 (9): e1001117. doi:10.1371/journal.pgen.1001117. ISSN 1553-7404. PMC 2940725. PMID 20862358.
  12. Mangravite, Lara M.; Engelhardt, Barbara E.; Medina, Marisa W.; Smith, Joshua D.; Brown, Christopher D.; Chasman, Daniel I.; Mecham, Brigham H.; Howie, Bryan; Shim, Heejung; Naidoo, Devesh; Feng, QiPing (October 2013). "A statin-dependent QTL for GATM expression is associated with statin-induced myopathy". Nature. 502 (7471): 377–380. Bibcode:2013Natur.502..377M. doi:10.1038/nature12508. ISSN 1476-4687. PMC 3933266. PMID 23995691.
  13. Gao, Chuan; McDowell, Ian C.; Zhao, Shiwen; Brown, Christopher D.; Engelhardt, Barbara E. (2016-07-28). Zhou, Xianghong Jasmine (ed.). "Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering". PLOS Computational Biology. 12 (7): e1004791. Bibcode:2016PLSCB..12E4791G. doi:10.1371/journal.pcbi.1004791. ISSN 1553-7358. PMC 4965098. PMID 27467526.
  14. Dumitrascu, Bianca; Darnell, Gregory; Ayroles, Julien; Engelhardt, Barbara E. (2019-01-15). Hancock, John (ed.). "Statistical tests for detecting variance effects in quantitative trait studies". Bioinformatics. 35 (2): 200–210. doi:10.1093/bioinformatics/bty565. ISSN 1367-4803. PMC 6330007. PMID 29982387.
  15. Aguet, François; Brown, Andrew A.; Castel, Stephane E.; Davis, Joe R.; He, Yuan; Jo, Brian; Mohammadi, Pejman; Park, YoSon; Parsana, Princy; Segrè, Ayellet V.; Strober, Benjamin J. (October 2017). "Genetic effects on gene expression across human tissues". Nature. 550 (7675): 204–213. Bibcode:2017Natur.550..204A. doi:10.1038/nature24277. ISSN 1476-4687. PMC 5776756. PMID 29022597.
  16. The GTEx Consortium (2020-09-11). "The GTEx Consortium atlas of genetic regulatory effects across human tissues". Science. 369 (6509): 1318–1330. Bibcode:2020Sci...369.1318.. doi:10.1126/science.aaz1776. ISSN 0036-8075. PMC 7737656. PMID 32913098.
  17. Verma, Archit; Engelhardt, Barbara E. (2020-07-21). "A robust nonlinear low-dimensional manifold for single cell RNA-seq data". BMC Bioinformatics. 21 (1): 324. doi:10.1186/s12859-020-03625-z. ISSN 1471-2105. PMC 7374962. PMID 32693778.
  18. Camerlenghi, Federico; Dumitrascu, Bianca; Ferrari, Federico; Engelhardt, Barbara E.; Favaro, Stefano (December 2020). "Nonparametric Bayesian multiarmed bandits for single-cell experiment design". Annals of Applied Statistics. 14 (4): 2003–2019. arXiv:1910.05355. doi:10.1214/20-AOAS1370. ISSN 1932-6157. S2CID 204509422.
  19. Cheng, Li-Fang; Dumitrascu, Bianca; Darnell, Gregory; Chivers, Corey; Draugelis, Michael; Li, Kai; Engelhardt, Barbara E. (2020-07-08). "Sparse multi-output Gaussian processes for online medical time series prediction". BMC Medical Informatics and Decision Making. 20 (1): 152. doi:10.1186/s12911-020-1069-4. ISSN 1472-6947. PMC 7341595. PMID 32641134.
  20. Cheng, Li-Fang; Prasad, Niranjani; Engelhardt, Barbara E. (2019). "An Optimal Policy for Patient Laboratory Tests in Intensive Care Units". Pacific Symposium on Biocomputing. 24: 320–331. arXiv:1808.04679. ISSN 2335-6936. PMC 6417830. PMID 30864333.
  21. "A Statistical Search for Genomic Truths". 27 February 2018.
  22. "Prof. Barbara Engelhardt recipient of an Alfred P. Sloan Foundation Research Fellowship | Computer Science Department at Princeton University". www.cs.princeton.edu. Retrieved 2021-01-11.
  23. "Barbara Engelhardt wins CAREER award for research with high-dimensional genomic data | Computer Science Department at Princeton University". www.cs.princeton.edu. Retrieved 2021-01-11.
  24. "Grants". Chan Zuckerberg Initiative. Retrieved 2021-01-11.
  25. "Fast Grants". fastgrants.org. Retrieved 2021-01-11.
  26. "Overton Prize". www.iscb.org.
  27. "NHGRI supports seven young investigators on research career paths". Genome.gov. Retrieved 2021-01-11.
  28. "2005 Google Anita Borg Memorial Scholarship Winners Announced – News announcements – News from Google – Google". googlepress.blogspot.com. Retrieved 2021-01-11.
  29. The Society for Molecular Biology & Evolution. "The Walter M. Fitch Award". www.smbe.org. Archived from the original on 2020-08-12. Retrieved 2021-01-11.
  30. "Senior Advisory Council". Archived from the original on 2021-01-13. Retrieved 2021-01-11.
  31. "2021 Conference". icml.cc. Retrieved 2021-01-11.
  32. "ACD Working Group on Artificial Intelligence". NIH Advisory Committee to the Director. Retrieved 2021-01-11.