Steven Salzberg

Steven Salzberg at the Biological Data Science conference (CSHL) in 2018
Born
Steven Lloyd Salzberg

1960 (age 63–64)
Alma mater Yale University
Harvard University
Known for GLIMMER [1]
MUMmer [2]
AMOS assembler [3]
Bowtie [4]
TopHat [5]
Spouse Claudia Pasche [6]
Awards Ben Franklin Award (2013)
Scientific career
Institutions University of Maryland, College Park
The Institute for Genomic Research
Johns Hopkins University
Thesis Learning with nested generalized exemplars  (1989)
Doctoral advisor William Aaron Woods [7]
Doctoral students
Other notable students Olga Troyanskaya [8]
Website salzberg-lab.org

Steven Lloyd Salzberg (born 1960) is an American computational biologist and computer scientist who is a Bloomberg Distinguished Professor of Biomedical Engineering, Computer Science, and Biostatistics at Johns Hopkins University, where he is also Director of the Center for Computational Biology.


Early life and education

Salzberg was born in 1960, one of four children of Herman Salzberg, a Distinguished Professor Emeritus of Psychology, and Adele Salzberg, a retired school teacher. [9] He did his undergraduate studies at Yale University, where he received a Bachelor of Arts degree in English in 1980. In 1981 he returned to Yale, where he received Master of Science and Master of Philosophy degrees in Computer Science in 1982 and 1984, respectively. After several years at a startup company, he enrolled at Harvard University, where he earned a Ph.D. in Computer Science in 1989. [10]

Career

After obtaining his undergraduate degree, Salzberg worked for a local power company in South Carolina, where he gained programming experience on an IBM mainframe, [11] working in COBOL and IBM assembly language. After completing his master's degree in Computer Science, he joined a Boston-based AI startup. [11]

After earning his Ph.D., Salzberg joined Johns Hopkins University as an assistant professor in the Department of Computer Science, and was promoted to associate professor in 1997. From 1998 to 2005, he was the head of the Bioinformatics department at The Institute for Genomic Research, one of the world's largest genome sequencing centers. Salzberg then joined the Department of Computer Science at the University of Maryland, College Park, where he was the Horvitz Professor of Computer Science as well as the Director of the Center for Bioinformatics and Computational Biology. In 2011, Salzberg returned to Johns Hopkins University as a professor in the McKusick-Nathans Institute of Genetic Medicine and in the Department of Medicine. [6] [12] [13]

In 2013, Salzberg won the Benjamin Franklin Award [14] in bioinformatics.

In 2014, he was named a Bloomberg Distinguished Professor at Johns Hopkins University for his accomplishments as an interdisciplinary researcher and excellence in teaching the next generation of scholars. [15] The Bloomberg Distinguished Professorships were established in 2013 by a gift from Michael Bloomberg. [16] Salzberg holds joint appointments in the Johns Hopkins Whiting School of Engineering, Johns Hopkins School of Medicine, and the Johns Hopkins Bloomberg School of Public Health.

Research

Salzberg has been a prominent scientist in the field of bioinformatics and computational biology since the 1990s. He has made many contributions to gene-finding algorithms, notably the GLIMMER [17] program for bacterial gene finding as well as several related programs for finding genes in animals, plants, and other organisms. He has also been a leader in genome assembly research and has led the assembly of dozens of genomes, both large and small. He was a participant in the Human Genome Project [18] as well as many other genome projects, including the malaria genome (Plasmodium falciparum) and the genome of the model plant Arabidopsis thaliana. In 2001–2002, he and his colleagues sequenced the anthrax strain that was used in the 2001 anthrax attacks. They published their results in the journal Science in 2002. [19] These findings helped the FBI track the source of the attacks to a single vial at Fort Detrick in Frederick, Maryland.

Salzberg together with David Lipman and Lone Simonsen started the Influenza Genome Sequencing Project in 2003, a project to sequence and make available the genomes of thousands of influenza virus isolates. [20] [21]

Soon after the advent of next-generation sequencing (NGS) in the mid-2000s, Salzberg's research lab and his collaborators developed a suite of highly efficient, accurate programs for alignment of NGS sequences to large genomes and for assembly of sequences from RNA-Seq experiments. These include the "Tuxedo" suite, comprising the Bowtie, TopHat, and Cufflinks programs, which have been cited tens of thousands of times in the years since their publication.

Salzberg has also been a vocal advocate against pseudoscience and has authored editorials and appeared in print media on this topic. Since 2010, he has written a column at Forbes magazine [22] on science, medicine, and pseudoscience, where he has published hundreds of articles that have received tens of millions of views. His work at Forbes won the 2012 Robert P. Balles Prize in Critical Thinking. [23]

In 2014, Salzberg was a charter member of the Cambridge Working Group, which was created to express alarm in the scientific community over the creation of highly transmissible and contagious viruses (also called gain-of-function research) and the likelihood of an accidental lab release. [24]

Publications

Salzberg has authored or co-authored over 300 scientific publications. [25] He has more than 300,000 citations in Google Scholar and an h-index of 159. [26] In 2014 and every year since (through at least 2022), Salzberg was selected for inclusion in HighlyCited.com, a ranking compiled by the Institute for Scientific Information of scientists who are among the top 1% most cited in their subject field during the previous ten years. He was also chosen for this list when it was first created in 2001. The list of highly cited researchers continues under Clarivate, and Salzberg was included in it each year from 2018 through 2023. [27]

Highly cited articles (more than 10,000 citations)

Awards

Related Research Articles

Bioinformatics: Computational analysis of large, complex sets of biological data

Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is often referred to as computational biology, though the distinction between the two terms is often disputed.

Comparative genomics: Field of biological research

Comparative genomics is a branch of biological research that examines genome sequences across a spectrum of species, from bacteria to mice, chimpanzees, and humans. This large-scale holistic approach compares two or more genomes to discover the similarities and differences between them and to study the biology of the individual genomes. Comparison of whole genome sequences provides a highly detailed view of how organisms are related to each other at the gene level. By comparing whole genome sequences, researchers gain insights into genetic relationships between organisms and study evolutionary changes. The major principle of comparative genomics is that common features of two organisms will often be encoded within the DNA that is evolutionarily conserved between them. Comparative genomics therefore provides a powerful tool for studying evolutionary changes among organisms, helping to identify genes that are conserved or common among species, as well as genes that give each organism its unique characteristics. Moreover, these studies can be performed at different levels of the genomes to obtain multiple perspectives about the organisms.

In bioinformatics, GLIMMER (Gene Locator and Interpolated Markov ModelER) is used to find genes in prokaryotic DNA. "It is effective at finding genes in bacteria, archaea, viruses, typically finding 98–99% of all relatively long protein coding genes". GLIMMER was the first system that used the interpolated Markov model to identify coding regions. The GLIMMER software is open source and is maintained by Steven Salzberg, Art Delcher, and their colleagues at the Center for Computational Biology at Johns Hopkins University. The original GLIMMER algorithms and software were designed by Art Delcher, Simon Kasif and Steven Salzberg and applied to bacterial genome annotation in collaboration with Owen White.

Computational genomics refers to the use of computational and statistical analysis to decipher biology from genome sequences and related data, including both DNA and RNA sequence as well as other "post-genomic" data. Because it combines computational and statistical approaches to understanding the function of genes with statistical association analysis, the field is also often referred to as computational and statistical genetics/genomics. As such, computational genomics may be regarded as a subset of bioinformatics and computational biology, but with a focus on using whole genomes to understand the principles of how the DNA of a species controls its biology at the molecular level and beyond. With the current abundance of massive biological datasets, computational studies have become one of the most important means to biological discovery.

Conserved sequence: Similar DNA, RNA or protein sequences within genomes or among species

In evolutionary biology, conserved sequences are identical or similar sequences in nucleic acids or proteins across species, or within a genome, or between donor and receptor taxa. Conservation indicates that a sequence has been maintained by natural selection.

Webb Miller: American bioinformatician

Webb Colby Miller is an American bioinformatician who is professor in the Department of Biology and the Department of Computer Science and Engineering at The Pennsylvania State University.

k-mer: Substrings of length k contained in a biological sequence

In bioinformatics, k-mers are substrings of length $k$ contained within a biological sequence. Primarily used within the context of computational genomics and sequence analysis, in which k-mers are composed of nucleotides, k-mers are used to assemble DNA sequences, improve heterologous gene expression, identify species in metagenomic samples, and create attenuated vaccines. Usually, the term k-mer refers to all of a sequence's subsequences of length $k$, such that the sequence AGAT would have four monomers, three 2-mers, two 3-mers and one 4-mer (AGAT). More generally, a sequence of length $L$ will have $L - k + 1$ k-mers and $n^k$ total possible k-mers, where $n$ is the number of possible monomers.
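
The counts in the AGAT example can be checked with a few lines of Python. This is only a minimal sketch: the function names `kmers` and `possible_kmers` are hypothetical, chosen for this illustration, and do not come from any particular bioinformatics library. It assumes overlapping k-mers are taken at every start position, so a sequence of length L yields L - k + 1 of them, and that an alphabet of n monomers allows n^k distinct k-mers.

```python
from collections import Counter


def kmers(sequence: str, k: int) -> list[str]:
    """Return every overlapping substring of length k, in order of position."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence of length L has L - k + 1 start positions (none if k > L).
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def possible_kmers(alphabet_size: int, k: int) -> int:
    """Number of distinct k-mers over an alphabet of the given size (n^k)."""
    return alphabet_size ** k


if __name__ == "__main__":
    for k in range(1, 5):
        # Prints 4 monomers, 3 2-mers, 2 3-mers and 1 4-mer for AGAT.
        print(k, kmers("AGAT", k))
    # Counter tallies repeated k-mers; here the monomer A occurs twice.
    print(Counter(kmers("AGAT", 1)))
    # With the four DNA nucleotides there are 4**3 = 64 possible 3-mers.
    print(possible_kmers(4, 3))
```

Real tools typically avoid materializing every k-mer as a string and use compact encodings or hashing instead; the list-based version here is meant only to make the counting explicit.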

MUMmer is a bioinformatics software system for sequence alignment. It is based on the suffix tree data structure. It has been used for comparing different genome assemblies to one another, which allows scientists to determine how a genome has changed. The acronym "MUMmer" comes from "Maximal Unique Matches", or MUMs.
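
To make the idea of a maximal unique match concrete, the following Python sketch finds MUMs between two short strings by brute force. It is not MUMmer's algorithm: MUMmer relies on a suffix tree to do this efficiently on whole genomes, whereas this illustration simply tries every pair of positions and is only practical for toy inputs. The names `count_overlapping` and `naive_mums` are hypothetical, and the definition assumed here is that a MUM is a shared substring that cannot be extended to the left or right and that occurs exactly once in each sequence.

```python
def count_overlapping(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text, allowing overlaps."""
    count = 0
    start = text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def naive_mums(a: str, b: str, min_length: int = 1) -> list[tuple[int, int, str]]:
    """Return (position in a, position in b, match) for each maximal unique match."""
    mums = []
    for i in range(len(a)):
        for j in range(len(b)):
            if a[i] != b[j]:
                continue
            # Skip starts that could be extended to the left (not left-maximal).
            if i > 0 and j > 0 and a[i - 1] == b[j - 1]:
                continue
            # Extend to the right as far as the two sequences agree.
            length = 0
            while (i + length < len(a) and j + length < len(b)
                   and a[i + length] == b[j + length]):
                length += 1
            if length < min_length:
                continue
            match = a[i:i + length]
            # Keep the match only if it is unique in both sequences.
            if count_overlapping(a, match) == 1 and count_overlapping(b, match) == 1:
                mums.append((i, j, match))
    return mums


if __name__ == "__main__":
    # The only shared, unique, non-extendable substring of length >= 3 is ATTA.
    print(naive_mums("GATTACA", "CATTAGG", min_length=3))  # [(1, 1, 'ATTA')]
```

The brute-force search takes time roughly quadratic in the sequence lengths (before the uniqueness checks), which is why an index such as a suffix tree is needed at genome scale.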

RNA-Seq: Lab technique in cellular biology

RNA-Seq is a technique that uses next-generation sequencing to reveal the presence and quantity of RNA molecules in a biological sample, providing a snapshot of gene expression in the sample, also known as the transcriptome.

Richard M. Durbin: British computational biologist

Richard Michael Durbin is a British computational biologist and Al-Kindi Professor of Genetics at the University of Cambridge. He also serves as an associate faculty member at the Wellcome Sanger Institute where he was previously a senior group leader.

DNA annotation: The process of describing the structure and function of a genome

In molecular biology and genetics, DNA annotation or genome annotation is the process of describing the structure and function of the components of a genome, by analyzing and interpreting them in order to extract their biological significance and understand the biological processes in which they participate. Among other things, it identifies the locations of genes and all the coding regions in a genome and determines what those genes do.

Bowtie is a software package commonly used for sequence alignment and sequence analysis in bioinformatics. The source code for the package is distributed freely and compiled binaries are available for Linux, macOS and Windows platforms. As of 2017, the Genome Biology paper describing the original Bowtie method has been cited more than 11,000 times. Bowtie is open-source software and is currently maintained by Johns Hopkins University.

Lior Pachter: Computational biologist

Lior Samuel Pachter is a computational biologist. He works at the California Institute of Technology, where he is the Bren Professor of Computational Biology. He has widely varied research interests including genomics, combinatorics, computational geometry, machine learning, scientific computing, and statistics.

TopHat is an open-source bioinformatics tool for the high-throughput alignment of shotgun cDNA sequencing reads generated by transcriptomics technologies. It first aligns reads with Bowtie and then maps them to a reference genome to discover RNA splice sites de novo. TopHat aligns RNA-Seq reads to mammalian-sized genomes.

Ben Langmead is a computational biologist and associate professor in the Computational Biology & Medicine Group at Johns Hopkins University.

Transcriptomics technologies are the techniques used to study an organism's transcriptome, the sum of all of its RNA transcripts. The information content of an organism is recorded in the DNA of its genome and expressed through transcription. Here, mRNA serves as a transient intermediary molecule in the information network, whilst non-coding RNAs perform additional diverse functions. A transcriptome captures a snapshot in time of the total transcripts present in a cell. Transcriptomics technologies provide a broad account of which cellular processes are active and which are dormant. A major challenge in molecular biology is to understand how a single genome gives rise to a variety of cells. Another is how gene expression is regulated.

Owen R. White is a bioinformatician and director of the Institute For Genome Sciences at the University of Maryland School of Medicine, United States. He is known for his work on the bioinformatics tools GLIMMER and MUMmer.

Cole Trapnell: Assistant Professor in the Department of Genome Sciences at the University of Washington

Bruce Colston Trapnell Jr. is an assistant professor in the Department of Genome Sciences at the University of Washington. He was awarded the Overton Prize by the International Society for Computational Biology (ISCB) for “outstanding accomplishment in the early to mid stage of his career” in 2018.

References

  1. Salzberg, S. L.; Delcher, A. L.; Kasif, S.; White, O. (1998). "Microbial gene identification using interpolated Markov models". Nucleic Acids Research. 26 (2): 544–548. doi:10.1093/nar/26.2.544. PMC 147303. PMID 9421513.
  2. Delcher, A. L.; Kasif, S.; Fleischmann, R. D.; Peterson, J.; White, O.; Salzberg, S. L. (1999). "Alignment of whole genomes". Nucleic Acids Research. 27 (11): 2369–2376. doi:10.1093/nar/27.11.2369. PMC 148804. PMID 10325427.
  3. Sommer, D. D.; Delcher, A. L.; Salzberg, S. L.; Pop, M. (2007). "Minimus: A fast, lightweight genome assembler". BMC Bioinformatics. 8: 64. doi:10.1186/1471-2105-8-64. PMC 1821043. PMID 17324286.
  4. Langmead, B.; Trapnell, C.; Pop, M.; Salzberg, S. L. (2009). "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome". Genome Biology. 10 (3): R25. doi:10.1186/gb-2009-10-3-r25. PMC 2690996. PMID 19261174.
  5. Trapnell, C.; Pachter, L.; Salzberg, S. L. (2009). "TopHat: Discovering splice junctions with RNA-Seq". Bioinformatics. 25 (9): 1105–1111. doi:10.1093/bioinformatics/btp120. PMC 2672628. PMID 19289445.
  6. ccb.jhu.edu Brief biosketch
  7. Steven Salzberg at the Mathematics Genealogy Project
  8. Mullins, J.; Morrison Mckay, B. (2011). "International Society for Computational Biology Honors Michael Ashburner and Olga Troyanskaya with Top Bioinformatics/Computational Biology Awards for 2011". PLOS Computational Biology. 7 (6): e1002081. Bibcode:2011PLSCB...7E2081M. doi:10.1371/journal.pcbi.1002081. PMC 3107244.
  9. "Steven Salzberg: brief biography". Salzberg Lab. 2015-02-21. Retrieved 2021-06-03.
  10. Salzberg, Steven Lloyd (1989). Learning with nested generalized exemplars (PhD thesis). Harvard University. ProQuest 303755625.
  11. Fogg, Christina; Kovats, Diane; Shamir, Ron (October 29, 2021). "2020 ISCB accomplishments by a Senior Scientist Award: Steven Salzberg". Oxford Academic. Retrieved 2023-07-26.
  12. Editorial on evolution and the flu, Philadelphia Inquirer
  13. "Bloomberg Distinguished Professorships | Steven Salzberg". Johns Hopkins Office of Research. 9 September 2016.
  14. "Steven Salzberg on Microbial Genomes, Open Access, Flu Shots and Gene Patents".
  15. "With Bloomberg Distinguished Professorships, Johns Hopkins aims to foster cross-specialty collaboration 2014". 2014-02-17.
  16. "Michael R. Bloomberg Commits $350 Million to Johns Hopkins for Transformational Academic Initiative 2013".
  17. Delcher, A.; Harmon, D.; Kasif, S.; White, O.; Salzberg, S. (1999). "Improved microbial gene identification with GLIMMER". Nucleic Acids Research. 27 (23): 4636–4641. doi:10.1093/nar/27.23.4636. PMC 148753. PMID 10556321.
  18. Venter, J. C.; Adams, M.; Myers, E.; Li, P.; Mural, R.; Sutton, G.; Smith, H.; Yandell, M.; Evans, C.; Holt, R. A.; Gocayne, J. D.; Amanatides, P.; Ballew, R. M.; Huson, D. H.; Wortman, J. R.; Zhang, Q.; Kodira, C. D.; Zheng, X. H.; Chen, L.; Skupski, M.; Subramanian, G.; Thomas, P. D.; Zhang, J.; Gabor Miklos, G. L.; Nelson, C.; Broder, S.; Clark, A. G.; Nadeau, J.; McKusick, V. A.; et al. (2001). "The Sequence of the Human Genome". Science. 291 (5507): 1304–1351. Bibcode:2001Sci...291.1304V. doi:10.1126/science.1058040. PMID 11181995.
  19. Read, T. D.; Salzberg, S.; Pop, M.; Shumway, M.; Umayam, L.; Jiang, L.; Holtzapple, E.; Busch, J.; Smith, K.; Schupp, J. M.; Solomon, D.; Keim, P.; Fraser, C. M. (2002). "Comparative Genome Sequencing for Discovery of Novel Polymorphisms in Bacillus anthracis". Science. 296 (5575): 2028–2033. Bibcode:2002Sci...296.2028R. doi:10.1126/science.1071837. PMID 12004073. S2CID 15470665.
  20. Ghedin, E.; Sengamalay, N. A.; Shumway, M.; Zaborsky, J.; Feldblyum, T.; Subbu, V.; Spiro, D. J.; Sitz, J.; Koo, H.; Bolotov, P.; Dernovoy, D.; Tatusova, T.; Bao, Y.; St George, K.; Taylor, J.; Lipman, D. J.; Fraser, C. M.; Taubenberger, J. K.; Salzberg, S. L. (2005). "Large-scale sequencing of human influenza reveals the dynamic nature of viral genome evolution". Nature. 437 (7062): 1162–1166. Bibcode:2005Natur.437.1162G. doi:10.1038/nature04239. PMID 16208317.
  21. Holmes, E. C.; Ghedin, E.; Miller, N.; Taylor, J.; Bao, Y.; St George, K.; Grenfell, B. T.; Salzberg, S. L.; Fraser, C. M.; Lipman, D. J.; Taubenberger, J. K. (2005). "Whole-Genome Analysis of Human Influenza A Virus Reveals Multiple Persistent Lineages and Reassortment among Recent H3N2 Viruses". PLOS Biology. 3 (9): e300. doi:10.1371/journal.pbio.0030300. PMC 1180517. PMID 16026181.
  22. Salzberg's column at Forbes
  23. "Skeptic Authors Steven Salzberg and Joe Nickell to Receive Balles Prize in Critical Thinking". 14 June 2013.
  24. Baker, Nicholson (4 January 2021). "The Lab-Leak Hypothesis For decades, scientists have been hot-wiring viruses in hopes of preventing a pandemic, not causing one. But what if …?". New York magazine. Retrieved 18 January 2021.
  25. "Steven Salzberg, Ph.D., Professor of Biomedical Engineering". Johns Hopkins Medicine. Retrieved 2021-06-03.
  26. "Steven Salzberg". scholar.google.com. Retrieved 2021-06-03.
  27. "Highly Cited Researchers". publons.com. Retrieved 2021-06-03.
  28. "Steven Salzberg, Ph.D., Professor of Biomedical Engineering". Johns Hopkins Medicine. Retrieved 2021-05-04.