Adam C. Siepel

Born Adam C. Siepel, June 24, 1972 (age 52), United States
Known for evolutionarily conserved sequences
Thesis Comparative mammalian genomics: Models of evolution and detection of functional elements (2005)
Doctoral advisor David Haussler
Website siepellab.labsites.cshl.edu

Adam C. Siepel (born 1972) is an American computational biologist known for his research in comparative genomics and population genetics, particularly the development of statistical methods and software tools for identifying evolutionarily conserved sequences. [1] [2] [3] [4] Siepel is currently Chair of the Simons Center for Quantitative Biology and Professor in the Watson School of Biological Sciences at Cold Spring Harbor Laboratory. [5]

Education and career

Siepel completed a B.S. in Agricultural and Biological Engineering at Cornell University in 1994, then worked at Los Alamos National Laboratory until 1996. From 1996 to 2001, he worked as a software developer at the National Center for Genome Resources in Santa Fe, while completing an M.S. in Computer Science at the University of New Mexico. He obtained a Ph.D. in computer science from the University of California, Santa Cruz in 2005. He was on the faculty of Cornell University from 2006 to 2014 and moved to Cold Spring Harbor Laboratory in 2014.

Research

Siepel has worked on various problems at the intersection of computer science, statistics, evolutionary biology, and genomics. At Los Alamos National Laboratory, he developed phylogenetic methods for detecting recombinant strains of HIV, [6] and at the National Center for Genome Resources, he led the development of ISYS, a technology for integrating heterogeneous bioinformatics databases, analysis tools, and visualization programs. [7] Siepel also did theoretical work on algorithms for phylogeny reconstruction based on genome rearrangements, working with Bernard Moret at the University of New Mexico. [8] When Siepel left software development to join David Haussler's laboratory at the University of California, Santa Cruz, he turned to computational problems in comparative genomics. In Haussler's group, he developed several analysis methods based on phylogenetic hidden Markov models, including a widely used program called phastCons for identifying evolutionarily conserved elements in genomic sequences. [9]

At Cornell, Siepel's research group continued to work on the identification and characterization of conserved non-coding sequences. They also studied fast-evolving sequences in both coding [10] and noncoding [11] regions, including human accelerated regions. In recent years, the Siepel laboratory has increasingly focused on human population genetics, developing methods for estimating the times in early human history when major population groups first diverged, [12] for measuring the influence of natural selection on transcription factor binding sites, [13] and for estimating probabilities that mutations across the human genome will have fitness consequences. [14] The group also has an active research program in transcriptional regulation, carried out in close collaboration with John T. Lis's laboratory.

A common theme in Siepel's research is the development of precise mathematical models for the complex processes by which genomes evolve over time. His research group uses these models, together with techniques from computer science and statistics, both to peer into the past and to address questions of practical importance for human health. [15]

Awards and honours

Siepel was a recipient of a Guggenheim Fellowship in 2012. [15] He was also awarded a David and Lucile Packard Fellowship for Science and Engineering in 2007, a Microsoft Research Faculty Fellowship in 2007, and a Sloan Research Fellowship in 2009.

References

  1. Adam C. Siepel publications indexed by Google Scholar
  2. Adam C. Siepel's publications indexed by the Scopus bibliographic database. (subscription required)
  3. Couger, M. B.; Pipes, L.; Squina, F.; Prade, R.; Siepel, A.; Palermo, R.; Katze, M. G.; Mason, C. E.; Blood, P. D. (2014). "Enabling large-scale next-generation sequence assembly with Blacklight". Concurrency and Computation: Practice and Experience. 26 (13): 2157–2166. doi:10.1002/cpe.3231. PMC 4185199. PMID 25294974.
  4. ENCODE Project Consortium; Birney E; Stamatoyannopoulos JA; Dutta A; Guigó R; Gingeras TR; Margulies EH; Weng Z; Snyder M; Dermitzakis ET; et al. (2007). "Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project". Nature. 447 (7146): 799–816. Bibcode:2007Natur.447..799B. doi:10.1038/nature05874. PMC 2212820. PMID 17571346.
  5. Adam Siepel's CV.
  6. Siepel, A. C.; Halpern, A. L.; Macken, C.; Korber, B. T. (1995). "A computer program designed to screen rapidly for HIV type 1 intersubtype recombinant sequences". AIDS Research and Human Retroviruses. 11 (11): 1413–6. doi:10.1089/aid.1995.11.1413. PMID 8573400.
  7. Siepel, A.; Farmer, A.; Tolopko, A.; Zhuang, M.; Mendes, P.; Beavis, W.; Sobral, B. (2001). "ISYS: A decentralized, component-based approach to the integration of heterogeneous bioinformatics resources". Bioinformatics. 17 (1): 83–94. doi:10.1093/bioinformatics/17.1.83. PMID 11222265.
  8. Siepel, A. C. (2003). "An algorithm to enumerate sorting reversals for signed permutations" (PDF). Journal of Computational Biology. 10 (3–4): 575–97. CiteSeerX 10.1.1.114.8797. doi:10.1089/10665270360688200. PMID 12935346.
  9. Siepel, A.; Bejerano, G.; Pedersen, J. S.; Hinrichs, A. S.; Hou, M.; Rosenbloom, K.; Clawson, H.; Spieth, J.; Hillier, L. W.; Richards, S.; Weinstock, G. M.; Wilson, R. K.; Gibbs, R. A.; Kent, W. J.; Miller, W.; Haussler, D. (2005). "Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes". Genome Research. 15 (8): 1034–50. doi:10.1101/gr.3715005. PMC 1182216. PMID 16024819.
  10. Kosiol, C.; Vinař, T. Š.; Da Fonseca, R. R.; Hubisz, M. J.; Bustamante, C. D.; Nielsen, R.; Siepel, A. (2008). "Patterns of Positive Selection in Six Mammalian Genomes". PLOS Genetics. 4 (8): e1000144. doi:10.1371/journal.pgen.1000144. PMC 2483296. PMID 18670650.
  11. Pollard, K. S.; Hubisz, M. J.; Rosenbloom, K. R.; Siepel, A. (2009). "Detection of nonneutral substitution rates on mammalian phylogenies". Genome Research. 20 (1): 110–21. doi:10.1101/gr.097857.109. PMC 2798823. PMID 19858363.
  12. Gronau, I.; Hubisz, M. J.; Gulko, B.; Danko, C. G.; Siepel, A. (2011). "Bayesian inference of ancient human demography from individual genome sequences". Nature Genetics. 43 (10): 1031–4. doi:10.1038/ng.937. PMC 3245873. PMID 21926973.
  13. Arbiza, L.; Gronau, I.; Aksoy, B. A.; Hubisz, M. J.; Gulko, B.; Keinan, A.; Siepel, A. (2013). "Genome-wide inference of natural selection on human transcription factor binding sites". Nature Genetics. 45 (7): 723–729. doi:10.1038/ng.2658. PMC 3932982. PMID 23749186.
  14. Gulko, B.; Hubisz, M. J.; Gronau, I.; Siepel, A. (2015). "A method for calculating probabilities of fitness consequences for point mutations across the human genome". Nature Genetics. 47 (3): 276–283. doi:10.1038/ng.3196. PMC 4342276. PMID 25599402.
  15. Guggenheim profile. Archived April 18, 2012, at the Wayback Machine