Peter Karp (scientist)

Peter Karp
Born: Peter D. Karp
Alma mater: University of Pennsylvania (BA) [1]; Stanford University (PhD)
Known for
Awards: ISCB Fellow (2012) [5]
Scientific career
Fields: Bioinformatics; Artificial Intelligence [6]
Institutions: SRI International; National Center for Biotechnology Information [1]
Thesis: Hypothesis Formation and Qualitative Reasoning in Molecular Biology (1988)
Academic advisors
Website: sri.com/about/people/peter-karp

Peter D. Karp is director of the Bioinformatics Research Group at SRI International in Menlo Park, California. [1] [6] Karp leads the development of the BioCyc database collection (which includes the highly curated EcoCyc and MetaCyc databases). BioCyc databases combine genome, metabolic pathway, and regulatory information for thousands of organisms.

Education

Karp received his undergraduate degree from the University of Pennsylvania. He received a Ph.D. degree in Computer Science from Stanford University. His dissertation developed qualitative reasoning and machine learning techniques for hypothesis generation in molecular biology. Karp was a postdoctoral fellow at the National Library of Medicine.

Honors and recognition

He was elected a fellow of the International Society for Computational Biology (ISCB) in 2012 for outstanding contributions to the fields of computational biology and bioinformatics. [5] He is also a Fellow of the American Association for the Advancement of Science.

Related Research Articles

Metabolic network modelling: Form of biological modelling

Metabolic network modelling, also known as metabolic network reconstruction or metabolic pathway analysis, allows for an in-depth insight into the molecular mechanisms of a particular organism. In particular, these models correlate the genome with molecular physiology. A reconstruction breaks down metabolic pathways into their respective reactions and enzymes, and analyzes them within the perspective of the entire network. In simplified terms, a reconstruction collects all of the relevant metabolic information of an organism and compiles it in a mathematical model. Validation and analysis of reconstructions can allow identification of key features of metabolism such as growth yield, resource distribution, network robustness, and gene essentiality. This knowledge can then be applied to create novel biotechnology.

In bioinformatics, EcoCyc is a biological database for the bacterium Escherichia coli K-12. The EcoCyc project performs literature-based curation of the E. coli genome, and of E. coli transcriptional regulation, transporters, and metabolic pathways. EcoCyc contains written summaries of E. coli genes, distilled from over 36,000 scientific articles. EcoCyc also serves as a description of the genome and cellular networks of E. coli that scientists can use for computational analyses.

The MetaCyc database is one of the largest databases of metabolic pathways and enzymes currently available. The data are manually curated from the scientific literature and cover all domains of life. MetaCyc has extensive information about chemical compounds, reactions, metabolic pathways and enzymes. The data have been curated from more than 58,000 publications.

The BioCyc database collection is an assortment of organism-specific Pathway/Genome Databases (PGDBs) that provide reference to genome and metabolic pathway information for thousands of organisms. As of June 2021, there were over 17,800 databases within BioCyc. SRI International, based in Menlo Park, California, maintains the BioCyc database family.

Rfam is a database containing information about non-coding RNA (ncRNA) families and other structured RNA elements. It is an annotated, open access database originally developed at the Wellcome Trust Sanger Institute in collaboration with Janelia Farm, and currently hosted at the European Bioinformatics Institute. Rfam is designed to be similar to the Pfam database for annotating protein families.

SUPERFAMILY is a database and search platform of structural and functional annotation for all proteins and genomes. It classifies amino acid sequences into known structural domains, especially into SCOP superfamilies. Domains are functional, structural, and evolutionary units that form proteins. Domains of common ancestry are grouped into superfamilies. The domains and domain superfamilies are defined and described in SCOP. Superfamilies are groups of proteins that have structural evidence to support a common evolutionary ancestor but may not have detectable sequence homology.

Sean Eddy: American professor at Harvard University

Sean Roberts Eddy is Professor of Molecular & Cellular Biology and of Applied Mathematics at Harvard University. From 2006 to 2015 he was based at the Janelia Research Campus in Virginia. His research interests are in bioinformatics, computational biology and biological sequence analysis. As of 2016, his projects include the use of hidden Markov models in HMMER, Infernal, Pfam and Rfam.

Lincoln Stein

Lincoln David Stein is a scientist and Professor in bioinformatics and computational biology at the Ontario Institute for Cancer Research.

Rolf Apweiler

Rolf Apweiler is a director, with Ewan Birney, of the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL).

Alex Bateman

Alexander George Bateman is a computational biologist and Head of Protein Sequence Resources at the European Bioinformatics Institute (EBI), part of the European Molecular Biology Laboratory (EMBL) in Cambridge, UK. He has led the development of the Pfam biological database and introduced the Rfam database of RNA families. He has also been involved in the use of Wikipedia for community-based annotation of biological databases.

Monica Riley was an American scientist who contributed to the discovery of messenger RNA in her Ph.D. work with Arthur Pardee, and was later a pioneer in the exploration and computer representation of the Escherichia coli genome.

Desmond G. Higgins

Desmond Gerard Higgins is a Professor of Bioinformatics at University College Dublin, widely known for CLUSTAL, a series of computer programs for performing multiple sequence alignment. According to Nature, Higgins' papers describing CLUSTAL are among the top ten most highly cited scientific papers of all time.

Julio Collado-Vides is a Guatemalan scientist and Professor of Computational Genomics at the National Autonomous University of Mexico. His research focuses on genomics and bioinformatics.

Donna R. Maglott is a staff scientist at the National Center for Biotechnology Information known for her research on large-scale genomics projects, including the mouse genome and development of databases required for genomics research.

Patricia Clement Babbitt is a Professor and Principal Investigator (PI) in the School of Pharmacy at the University of California, San Francisco (UCSF).

Toby James Gibson is a group leader and biochemist at the European Molecular Biology Laboratory (EMBL) in Heidelberg known for his work on Clustal. According to Nature, Gibson's co-authored papers describing Clustal are among the top ten most highly cited scientific papers of all time.

In molecular biology, MvirDB is a publicly available database that stores information on toxins, virulence factors and antibiotic resistance genes. Its sources for DNA and protein information include Tox-Prot, SCORPION, the PRINTS Virulence Factors, VFDB, TVFac, Islander, ARGO and VIDA. The database provides a BLAST tool that allows users to query a sequence against all DNA and protein sequences in MvirDB. Information on virulence factors can be obtained with the provided browser tool, which returns results as a readable table organized by ascending E-value, with each entry hyperlinked to its related page. MvirDB is implemented in an Oracle 10g relational database.

Christos A. Ouzounis is a computational biologist, a director of research at CERTH, and Professor of Bioinformatics at Aristotle University in Thessaloniki.

Judith Anne Blake is a computational biologist at the Jackson Laboratory and Professor of Mammalian Genetics.

Rita Casadio is a Professor of Biochemistry at the University of Bologna.

References

  1. "Biographical Sketch for Peter D. Karp, Ph.D." www.ai.sri.com.
  2. Karp, Peter D.; et al. (2005). "Expansion of the BioCyc collection of pathway/genome databases to 160 genomes". Nucleic Acids Research. 33 (19): 6083–6089. doi:10.1093/nar/gki892. ISSN 0305-1048. PMC 1266070. PMID 16246909.
  3. Caspi, R.; Foerster, H.; Fulcher, C. A.; Kaipa, P.; Krummenacker, M.; Latendresse, M.; Paley, S.; Rhee, S. Y.; Shearer, A. G.; Tissier, C.; Walk, T. C.; Zhang, P.; Karp, P. D. (2007). "The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases". Nucleic Acids Research. 36 (Database): D623–D631. doi:10.1093/nar/gkm900. ISSN 0305-1048. PMC 2238876. PMID 17965431.
  4. Keseler, I. M.; et al. (2004). "EcoCyc: a comprehensive database resource for Escherichia coli". Nucleic Acids Research. 33 (Database issue): D334–D337. doi:10.1093/nar/gki108. ISSN 1362-4962. PMC 540062. PMID 15608210.
  5. Anon (2018). "ISCB Fellows". iscb.org. International Society for Computational Biology. Archived from the original on 2017-03-20.
  6. Peter Karp publications indexed by Google Scholar.
  7. Karp, Peter Dornin (1988). Hypothesis Formation and Qualitative Reasoning in Molecular Biology. dtic.mil (PhD thesis). Stanford University. doi:10.1609/aimag.v11i4.859. OCLC 20463112. Archived from the original on June 9, 2017.