Gad Landau

Last updated

Gad M. Landau
Gad M. Landau.jpg
Gad M. Landau
Born (1954-09-24) 24 September 1954 (age 68)
NationalityIsraeli
Alma mater Tel-Aviv University
Known for k-differences problem
incremental sequence alignment
Scientific career
Fields Theoretical computer science
Institutions University of Haifa
NYU Polytechnic School of Engineering
Thesis String matching in erroneous input (1987)
Doctoral advisor Uzi Vishkin

Gad Menahem Landau (born 1954) is an Israeli computer scientist noted for his contributions to combinatorial pattern matching and string algorithms and is the founding department chair of the Computer Science Department at the University of Haifa.

Contents

He has coauthored over 100 peer-reviewed scientific papers. [1] [2]

Academic background

Landau received his Ph.D. in Computer Science from Tel Aviv University in 1987. From 1988 to the present he has held positions as Assistant, Associate, and Research Professor at Polytechnic University in New York (now called NYU Polytechnic School of Engineering, New York University). In 1995, Landau joined the faculty of the University of Haifa, where he founded the Department of Computer Science and was the first department head. In 2006, Landau was promoted to his current position of full Professor at the University of Haifa.

Research

Landau's research interests focus on string algorithms, data structures, computational biology, and parallel computation. He has made several profound contributions to these areas, even in the early days of his scientific career. His Ph.D. thesis, supervised by Prof. Uzi Vishkin, includes the fundamental text-book solution for the k-differences problem, [3] [4] solving one of the major open problems in the area at the time. His solution was the first to combine suffix trees and lowest common ancestor queries, and has since inspired many extensions of this technique to other problems.

The footprints of Landau's research can be found in almost every subarea of string algorithms, including his foundational work on dynamic programming algorithms for the edit distance [5] problem, his numerous papers on modeling digitized images and 2D matching, [6] incremental sequence alignment, [7] [8] [9] and recently, his work on jumbled pattern matching [10] and compressed text [11] [12] [13] algorithms. He was instrumental in the application of pattern matching techniques to the area of computational biology, working on problems in several diverse areas such as DNA and RNA comparison, [14] [15] clustering, [16] haplotype inference, [17] protein secondary structure prediction, [18] and tandem repeats. [19]

Landau's research has been continually funded by the U.S. National Science Foundation, the Israel Science Foundation, and the U.S.-Israel Binational Science Foundation. He received the IBM Faculty award, and was awarded funding from the DFG and Yahoo!. Landau co-chaired the International Symposium on Combinatorial Pattern Matching in both 2001 [20] and 2008. [21] He serves on the editorial board of Journal of Discrete Algorithms, and served as a guest editor for TCS and Discrete Applied Mathematics. He has served on numerous program committees for international conferences, most recently, International Conference on Language and Automata Theory and Applications (LATA), International Symposium on String Processing and Information Retrieval (SPIRE), International Symposium on Algorithms and Computation (ISAAC), Annual Symposium on Combinatorial Pattern Matching (CPM), Workshop on Algorithms in Bioinformatics (WABI), International Workshop on Combinatorial Algorithms (IWOCA), and Brazilian Symposium on Bioinformatics (BSB).

Academic activities

Landau has been an active member on academic committees, including committees that advise and supervise the academic activity in newly founded computer science departments in Israel. He founded several academic projects at the University of Haifa, most notably the Etgar undergraduate program for highly talented high school students throughout the north of Israel. Apart from these, Landau was also involved in community and civic activities, and served as a member of the Haifa city council from 2008 until 2013. [22]

Related Research Articles

<span class="mw-page-title-main">Sequence alignment</span> Process in bioinformatics that identifies equivalent sites within molecular sequences

In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences. Aligned sequences of nucleotide or amino acid residues are typically represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. Sequence alignments are also used for non-biological sequences, such as calculating the distance cost between strings in a natural language or in financial data.

In computational linguistics and computer science, edit distance is a string metric, i.e. a way of quantifying how dissimilar two strings are to one another, that is measured by counting the minimum number of operations required to transform one string into the other. Edit distances find applications in natural language processing, where automatic spelling correction can determine candidate corrections for a misspelled word by selecting words from a dictionary that have a low distance to the word in question. In bioinformatics, it can be used to quantify the similarity of DNA sequences, which can be viewed as strings of the letters A, C, G and T.

<span class="mw-page-title-main">Suffix tree</span> Tree containing all suffixes of a given text

In computer science, a suffix tree is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

<span class="mw-page-title-main">Approximate string matching</span> Finding strings that approximately match a pattern

In computer science, approximate string matching is the technique of finding strings that match a pattern approximately. The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately.

In computational phylogenetics, tree alignment is a computational problem concerned with producing multiple sequence alignments, or alignments of three or more sequences of DNA, RNA, or protein. Sequences are arranged into a phylogenetic tree, modeling the evolutionary relationships between species or taxa. The edit distances between sequences are calculated for each of the tree's internal vertices, such that the sum of all edit distances within the tree is minimized. Tree alignment can be accomplished using one of several algorithms with various trade-offs between manageable tree size and computational effort.

Sequential pattern mining is a topic of data mining concerned with finding statistically relevant patterns between data examples where the values are delivered in a sequence. It is usually presumed that the values are discrete, and thus time series mining is closely related, but usually considered a different activity. Sequential pattern mining is a special case of structured data mining.

Nucleic acid structure prediction is a computational method to determine secondary and tertiary nucleic acid structure from its sequence. Secondary structure can be predicted from one or several nucleic acid sequences. Tertiary structure can be predicted from the sequence, or by comparative modeling.

<span class="mw-page-title-main">Eugene Myers</span> American scientist

Eugene Wimberly "Gene" Myers, Jr. is an American computer scientist and bioinformatician, who is best known for contributing to the early development of the NCBI's BLAST tool for sequence analysis.

<span class="mw-page-title-main">Cartesian tree</span> Binary tree derived from a sequence of numbers

In computer science, a Cartesian tree is a binary tree derived from a sequence of distinct numbers. To construct the Cartesian tree, set its root to be the minimum number in the sequence, and recursively construct its left and right subtrees from the subsequences before and after this number. It is uniquely defined as a min-heap whose symmetric (in-order) traversal returns the original sequence.

Uzi Vishkin is a computer scientist at the University of Maryland, College Park, where he is Professor of Electrical and Computer Engineering at the University of Maryland Institute for Advanced Computer Studies (UMIACS). Uzi Vishkin is known for his work in the field of parallel computing. In 1996, he was inducted as a Fellow of the Association for Computing Machinery, with the following citation: "One of the pioneers of parallel algorithms research, Dr. Vishkin's seminal contributions played a leading role in forming and shaping what thinking in parallel has come to mean in the fundamental theory of Computer Science."

Explicit Multi-Threading (XMT) is a computer science paradigm for building and programming parallel computers designed around the parallel random-access machine (PRAM) parallel computational model. A more direct explanation of XMT starts with the rudimentary abstraction that made serial computing simple: that any single instruction available for execution in a serial program executes immediately. A consequence of this abstraction is a step-by-step (inductive) explication of the instruction available next for execution. The rudimentary parallel abstraction behind XMT, dubbed Immediate Concurrent Execution (ICE) in Vishkin (2011), is that indefinitely many instructions available for concurrent execution execute immediately. A consequence of ICE is a step-by-step (inductive) explication of the instructions available next for concurrent execution. Moving beyond the serial von Neumann computer, the aspiration of XMT is that computer science will again be able to augment mathematical induction with a simple one-line computing abstraction.

In computer science, a compressed suffix array is a compressed data structure for pattern matching. Compressed suffix arrays are a general class of data structure that improve on the suffix array. These data structures enable quick search for an arbitrary string with a comparatively small index.

<i>Phylo</i> (video game) 2010 video game

Phylo is an experimental video game about multiple sequence alignment optimisation. Developed by the McGill Centre for Bioinformatics, it was originally released as a free Flash game in November 2010. Designed as a game with a purpose, players solve pattern-matching puzzles that represent nucleotide sequences of different phylogenetic taxa to optimize alignments over a computer algorithm. By aligning together each nucleotide sequence, represented as differently coloured blocks, players attempt to create the highest point value score for each set of sequences by matching as many colours as possible and minimizing gaps.

In combinatorial mathematics and theoretical computer science, heavy path decomposition is a technique for decomposing a rooted tree into a set of paths. In a heavy path decomposition, each non-leaf node selects one "heavy edge", the edge to the child that has the greatest number of descendants. The selected edges form the paths of the decomposition.

In bioinformatics, alignment-free sequence analysis approaches to molecular sequence and structure data provide alternatives over alignment-based approaches.

<span class="mw-page-title-main">Ron Shamir</span> Israeli professor of computer science (born 1953)

Ron Shamir is an Israeli professor of computer science known for his work in graph theory and in computational biology. He holds the Raymond and Beverly Sackler Chair in Bioinformatics, and is the founder and former head of the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.

In mathematics and computer science, graph edit distance (GED) is a measure of similarity between two graphs. The concept of graph edit distance was first formalized mathematically by Alberto Sanfeliu and King-Sun Fu in 1983. A major application of graph edit distance is in inexact graph matching, such as error-tolerant pattern recognition in machine learning.

Daniel Mier Gusfield is an American computer scientist, Distinguished Professor of Computer Science at the University of California, Davis. Gusfield is known for his research in combinatorial optimization and computational biology.

<span class="mw-page-title-main">Srinivas Aluru</span>

Srinivas Aluru is a professor in the School of Computational Science and Engineering at Georgia Institute of Technology, and co-Executive Director for the Georgia Tech Interdisciplinary Research Institute in Data Engineering and Science. His main areas of research are high performance computing, data science, bioinformatics and systems biology, combinatorial methods in scientific computing, and string algorithms. Aluru is a Fellow of the American Association for the Advancement of Science (AAAS) and the Institute for Electrical and Electronic Engineers (IEEE). He is best known for his research contributions in parallel algorithms and applications, interdisciplinary research in bioinformatics and computational biology, and particularly the intersection of these two fields.

In bioinformatics, a spaced seed is a pattern of relevant and irrelevant positions in a biosequence and a method of approximate string matching that allows for substitutions. They are a straightforward modification to the earliest heuristic-based alignment efforts that allow for minor differences between the sequences of interest. Spaced seeds have been used in homology search., alignment, assembly, and metagenomics. They are usually represented as a sequence of zeroes and ones, where a one indicates relevance and a zero indicates irrelevance at the given position. Some visual representations use pound signs for relevant and dashes or asterisks for irrelevant positions.

References

  1. Gad M. Landau at DBLP Bibliography Server OOjs UI icon edit-ltr-progressive.svg
  2. Gad Landau publications indexed by Microsoft Academic
  3. Landau, Gad M.; Vishkin, Uzi (1986). "Efficient String Matching with k Mismatches". Theor. Comput. Sci. 43: 239–249. doi: 10.1016/0304-3975(86)90178-7 .
  4. Gusfield, Dan (1997). "Chapter 9: More Applications of Suffix Trees, Chapter 12: Refining Core String Edits and Alignments". Algorithms on Strings, Trees, and Sequences – Computer Science and Computational Biology. Cambridge University Press. ISBN   978-0-521-58519-4.
  5. Landau, Gad M.; Vishkin, Uzi (1988). "Fast String Matching with k Differences". J. Comput. Syst. Sci. 37 (1): 63–78. doi: 10.1016/0022-0000(88)90045-1 .
  6. Landau, Gad M.; Vishkin, Uzi (1994). "Pattern Matching in a Digitized Image". Algorithmica. 12 (4/5): 375–408. CiteSeerX   10.1.1.55.9322 . doi:10.1007/BF01185433. S2CID   3352884.
  7. Landau, Gad M.; Myers, Eugene W.; Schmidt, Jeanette P. (1998). "Incremental String Comparison". SIAM J. Comput. 27 (2): 557–582. CiteSeerX   10.1.1.38.1766 . doi:10.1137/S0097539794264810.
  8. Landau, Gad M.; Ziv-Ukelson, Michal (2001). "On the Common Substring Alignment Problem". J. Algorithms. 41 (2): 338–359. CiteSeerX   10.1.1.149.775 . doi:10.1006/jagm.2001.1191.
  9. Landau, Gad M.; Schieber, Baruch; Ziv-Ukelson, Michal (2003). "Sparse LCS Common Substring Alignment Matrices". Inf. Process. Lett. 88 (6): 259–270. doi:10.1016/j.ipl.2003.09.006.
  10. Gagie, Travis; Hermelin, Danny; Landau, Gad M.; Weimann, Oren (2013). "Binary Jumbled Pattern Matching on Trees and Tree-Like Structures". Algorithms – ESA 2013. Lecture Notes in Computer Science. Vol. 8125. pp. 517–528. arXiv: 1301.6127 . doi:10.1007/978-3-642-40450-4_44. ISBN   978-3-642-40449-8.
  11. Hermelin, Danny; Landau, Gad M.; Landau, Shir; Weimann, Oren (2013). "Unified Compression-Based Acceleration of Edit-Distance Computation". Algorithmica. 65 (2): 339–353. arXiv: 1004.1194 . doi:10.1007/s00453-011-9590-6. S2CID   1257530.
  12. Crochemore, Maxime; Landau, Gad M.; Ziv-Ukelson, Michal (2003). "A Subquadratic Sequence Alignment Algorithm for Unrestricted Scoring Matrices". SIAM J. Comput. 32 (6): 1654–1673. CiteSeerX   10.1.1.57.8562 . doi:10.1137/S0097539702402007. S2CID   2661452.
  13. Bille, Philip; Gortz, Inge Li; Landau, Gad M.; Weimann, Oren (2013). "Tree Compression with Top Trees". Automata, Languages, and Programming. pp. 160–171. arXiv: 1304.5702 . doi:10.1007/978-3-642-39206-1_14. ISBN   978-3-642-39205-4. S2CID   6231735.{{cite book}}: |journal= ignored (help)
  14. Backofen, Rolf; Chen, Shihyen; Hermelin, Danny; Landau, Gad M.; Roytberg, Mikhail A.; Weimann, Oren; Zhang, Kaizhong (2007). "Locality and Gaps in RNA Comparison". Journal of Computational Biology. 14 (8): 1074–1087. CiteSeerX   10.1.1.230.7750 . doi:10.1089/cmb.2007.0062. PMID   17985988.
  15. Amit, Mika; Backofen, Rolf; Heyne, Steffen; Landau, Gad M.; Mohl, Mathias; Otto, Christina; Will, Sebastian (2014). "Local Exact Pattern Matching for Non-Fixed RNA Structures". IEEE/ACM Transactions on Computational Biology and Bioinformatics. 11 (1): 219–230. CiteSeerX   10.1.1.641.139 . doi:10.1109/TCBB.2013.2297113. PMID   26355520. S2CID   779878.
  16. Eres, Revital; Landau, Gad M.; Parida, Laxmi (2003). "A Combinatorial Approach to Automatic Discovery of Cluster-Patterns". Algorithms in Bioinformatics. Lecture Notes in Computer Science. Vol. 2812. pp. 139–150. doi:10.1007/978-3-540-39763-2_11. ISBN   978-3-540-20076-5.
  17. Fellows, Michael R.; Hartman, Tzvika; Hermelin, Danny; Landau, Gad M.; Rosamond, Frances A.; Rozenberg, Liat (2011). "Haplotype Inference Constrained by Plausible Haplotype Data". IEEE/ACM Transactions on Computational Biology and Bioinformatics. 8 (6): 1692–1699. CiteSeerX   10.1.1.502.7164 . doi:10.1109/TCBB.2010.72. PMID   20733241. S2CID   6947773.
  18. Backofen, Rolf; Landau, Gad M.; Mohl, Mathias; Tsur, Dekel; Weimann, Oren (2011). "Fast RNA structure alignment for crossing input structures". J. Discrete Algorithms. 9 (1): 2–11. doi: 10.1016/j.jda.2010.07.004 .
  19. Landau, Gad M.; Schmidt, Jeanette P.; Sokol, Dina (2001). "An Algorithm for Approximate Tandem Repeats". Journal of Computational Biology. 8 (1): 1–18. CiteSeerX   10.1.1.24.3741 . doi:10.1089/106652701300099038. PMID   11339903.
  20. Amir, Amihood; Landau, Gad M., eds. (2001). Combinatorial Pattern Matching, 12th Annual Symposium, Proceedings. Springer.
  21. Ferragina, Paolo; Landau, Gad M., eds. (2008). Combinatorial Pattern Matching, 19th Annual Symposium, Proceedings. Springer.
  22. he:Special:PermanentLink/15964007