Gonzalo Navarro

Gonzalo Navarro
Gonzalo Navarro
Born	June 9, 1969 (age 55); Argentina
Alma mater	University of Chile ; National University of La Plata ; Escuela Superior Latinoamericana de Informática
	Scientific career
Fields	Computer science ; Algorithms ; Data structures ; Data compression ; Text searching
Institutions	University of Chile
Thesis	Approximate Text Searching (1998)
Doctoral advisor	Ricardo Baeza-Yates
Website	users.dcc.uchile.cl/~gnavarro/

Last updated November 19, 2024

Gonzalo Navarro Badino (born June 9, 1969) is a full professor of computer science at the University of Chile and ACM Fellow,^[1] whose interests include algorithms and data structures, data compression and text searching. He also participates in the Center for Biotechnology and Bioengineering (CeBiB) and the Millennium Institute for Foundational Research on Data (IMFD).. He obtained his PhD at the University of Chile in 1998 under the supervision of Ricardo Baeza-Yates with the thesis Approximate Text Searching,^[2] then worked as a post-doctoral researcher with Esko Ukkonen and Maxime Crochemore.

He is one of the most prolific and highly cited researchers in Latin America, having authored the books Flexible Pattern Matching in Strings^[3] and Compact Data Structures,^[4] around 25 book chapters, over 160 journal articles and over 240 conference papers. He is editor in chief of the ACM Journal of Experimental Algorithmics (JEA) and a member of the editorial board of Information Systems, and has been guest editor of special issues of ACM SIGSPATIAL, the Journal of Discrete Algorithms, Information Systems and Algorithmica.

He created the Workshop on Compression, Text and Algorithms (WCTA) in 2005 and co-created the conference SISAP in 2008; has chaired or co-chaired SPIRE 2001, SCCC 2004, SPIRE 2005, SIGIR 2005 (posters), IFIP TCS 2006, SISAP 2008, SISAP 2012, LATIN 2016, SPIRE 2018 and CPM 2018; served on the steering committees of SPIRE, LATIN and SISAP; and has given around 50 invited talks, including 12 plenary talks and 5 tutorials in international conferences.

Education

He studied for his Licenciate in Informatics (1989–1992) (5 years plus thesis) from Latin American School of Informatics (ESLAI, Argentina). His thesis was: “A Study on Control Structures”. His advisor was Prof. Jorge Aguirre (ESLAI and Universidad de Buenos Aires, Argentina).

He studied for his Licenciate in Informatics (1986–1993) (5 years plus thesis), at the Faculty of Exact Sciences, Universidad Nacional de La Plata (UNLP, Argentina). His thesis was: “MediaCore: A Multimedia Interface Composition Toolkit”, Advisor: Prof. Jorge Sanz (IBM Argentina and Almaden Research Center).

He received a MSc. in computer science (1994–1995), from Faculty of Physics and Mathematical Sciences, Universidad de Chile with Prof. Ricardo Baeza-Yates (Universidad de Chile) as his advisor. His thesis was: “A Language for Queries on Structure and Contents of Textual Databases”.

He received his PhD in computer science (1995–1998), from Faculty of Physics and Mathematical Sciences, Universidad de Chile under advisor: Prof. Ricardo Baeza-Yates (Universidad de Chile). His thesis was: “Approximate Text Searching”.

Awards and distinctions

2022: ACM Fellow ^[1]
2018: ACM Distinguished Member^[5]
2016: Article "On compressing and indexing repetitive sequences",^[6] with Sebastian Kreft, included in the Virtual Special Issue "40th Anniversary of Theoretical Computer Science -- Top Cited Articles: 1975–2014", which collects the most cited articles of each year.
2016: Highest Cited Paper Award of Elsevier, for the articles "On compressing and indexing repetitive sequences"^[6] and "Colored range queries and document retrieval",^[7] which are among the 5 most cited papers in Theoretical Computer Science. Similar award for the article "DACs: Bringing direct access to variable-length codes",^[8] among the 5 most cited in Information Processing and Management, and "Improved Compressed Indexes for Full-Text Document Retrieval",^[9] among the 5 most cited in Journal of Discrete Algorithms.
2009: Included in the book "70 Stories of success in Innovation and Science",^[10] published by the Ministry of Economy and several government research funding agencies, Chile, 2009.
2008: Award Scopus Chile 2008 in Computer Science, Mathematics and Engineering, awarded by Elsevier to researchers with high scientific productivity, with the support of Conicyt (Chile)
1996: First prize in the III CLEI-UNESCO Contest of Latin American Computer Science MSc. Theses.

SPIRE 2001

Navarro organized the 13th International Symposium on String Processing and Information Retrieval (SPIRE 2001), with the support of Ricardo Baeza-Yates. The conference took place for three days on a boat heading to the Laguna San Rafael in Chilean Patagonia.

Related Research Articles

Information retrieval (IR) in computing and information science is the task of identifying and retrieving information system resources that are relevant to an information need. The information need can be specified in the form of a search query. In the case of document retrieval, queries can be based on full-text or other content-based indexing. Information retrieval is the science of searching for information in a document, searching for documents themselves, and also searching for the metadata that describes data, and for databases of texts, images or sounds.

In computer science, string-searching algorithms, sometimes called string-matching algorithms, are an important class of string algorithms that try to find a place where one or several strings are found within a larger string or text.

In computer science, a trie, also called digital tree or prefix tree, is a type of search tree: specifically, a k-ary tree data structure used for locating specific keys from within a set. These keys are most often strings, with links between nodes defined not by the entire key, but by individual characters. In order to access a key, the trie is traversed depth-first, following the links between nodes, which represent each character in the key.

In linguistics, Heaps' law is an empirical law which describes the number of distinct words in a document as a function of the document length. It can be formulated as

In computer science, a suffix tree is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used in, among others, full-text indices, data-compression algorithms, and the field of bibliometrics.

The bitap algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is "approximately equal" to a given pattern, where approximate equality is defined in terms of Levenshtein distance – if the substring and pattern are within a given distance k of each other, then the algorithm considers them equal. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. Then it is able to do most of the work with bitwise operations, which are extremely fast.

Ricardo A. Baeza-Yates is a Chilean computer scientist that currently is the Director of Research of the Institute for Experiential AI at Northeastern University in the Silicon Valley campus. He is also part-time professor at Universitat Pompeu Fabra in Barcelona and Universidad de Chile in Santiago. He is an expert member of the Global Partnership on Artificial Intelligence, a member of the Association for Computing Machinery's US Technology Policy Committee as well as IEEE's Ethics Committee.

In computer science, an inverted index is a database index storing a mapping from content, such as words or numbers, to its locations in a table, or in a document or a set of documents. The purpose of an inverted index is to allow fast full-text searches, at a cost of increased processing when a document is added to the database. The inverted file may be the database file itself, rather than its index. It is the most popular data structure used in document retrieval systems, used on a large scale for example in search engines. Additionally, several significant general-purpose mainframe-based database management systems have used inverted list architectures, including ADABAS, DATACOM/DB, and Model 204.

In computer science, approximate string matching is the technique of finding strings that match a pattern approximately. The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately.

Search engine indexing is the collecting, parsing, and storing of data to facilitate fast and accurate information retrieval. Index design incorporates interdisciplinary concepts from linguistics, cognitive psychology, mathematics, informatics, and computer science. An alternate name for the process, in the context of search engines designed to find web pages on the Internet, is web indexing.

A web query or web search query is a query that a user enters into a web search engine to satisfy their information needs. Web search queries are distinctive in that they are often plain text and boolean search directives are rarely used. They vary greatly from standard query languages, which are governed by strict syntax rules as command languages with keyword or positional parameters.

In computer science, the all nearest smaller values problem is the following task: for each position in a sequence of numbers, search among the previous positions for the last position that contains a smaller value. This problem can be solved efficiently both by parallel and non-parallel algorithms: Berkman, Schieber & Vishkin (1993), who first identified the procedure as a useful subroutine for other parallel programs, developed efficient algorithms to solve it in the Parallel Random Access Machine model; it may also be solved in linear time on a non-parallel computer using a stack-based algorithm. Later researchers have studied algorithms to solve it in other models of parallel computation.

The term compressed data structure arises in the computer science subfields of algorithms, data structures, and theoretical computer science. It refers to a data structure whose operations are roughly as fast as those of a conventional data structure for the problem, but whose size can be substantially smaller. The size of the compressed data structure is typically highly dependent upon the information entropy of the data being represented.

In computer science, a compressed suffix array is a compressed data structure for pattern matching. Compressed suffix arrays are a general class of data structure that improve on the suffix array. These data structures enable quick search for an arbitrary string with a comparatively small index.

An infobox is a digital or physical table used to collect and present a subset of information about its subject, such as a document. It is a structured document containing a set of attribute–value pairs, and in Wikipedia represents a summary of information about the subject of an article. In this way, they are comparable to data tables in some aspects. When presented within the larger document it summarizes, an infobox is often presented in a sidebar format.

In data mining, intention mining or intent mining is the problem of determining a user's intention from logs of his/her behavior in interaction with a computer system, such as in search engines, where there has been research on user intent or query intent prediction since 2002 ; and commercial intents expressed in social media posts.

Gad Menahem Landau is an Israeli computer scientist noted for his contributions to combinatorial pattern matching and string algorithms and is the founding department chair of the Computer Science Department at the University of Haifa.

Martin Farach-Colton is an American computer scientist, known for his work in streaming algorithms, suffix tree construction, pattern matching in compressed data, cache-oblivious algorithms, and lowest common ancestor data structures. He is the Leonard J. Shustek Professor of Computer Science and chair of the Department of Computer Science and Engineering at New York University. Formerly, he was a Distinguished Professor of Computer Science at Rutgers University. He co-founded the storage technology startup company Tokutek.

Jewels of Stringology: Text Algorithms is a book on algorithms for pattern matching in strings and related problems. It was written by Maxime Crochemore and Wojciech Rytter, and published by World Scientific in 2003.

References

1 2 "Global computing association names 57 fellows for outstanding contributions that propel technology today". Association for Computing Machinery. January 18, 2023. Retrieved 2023-01-18.
↑ "Approximate Text Searching" (PDF).
↑ Navarro, Gonzalo; Raffinot, Mathieu (2002). Flexible pattern matching in strings : practical on-line search algorithms for texts and biological sequences. Cambridge: Cambridge University Press. ISBN 0521813077. OCLC 47930721.
↑ Navarro, Gonzalo (2016-09-08). Compact data structures : a practical approach. New York, NY. ISBN 9781107152380. OCLC 952389252.{{cite book}}: CS1 maint: location missing publisher (link)
↑ "ACM Distinguished Member". 2018 ACM Distinguished Members. Association for Computing Machinery.
1 2 Kreft, Sebastian; Navarro, Gonzalo (2013). "On compressing and indexing repetitive sequences". Theoretical Computer Science. 483: 115–113. doi: 10.1016/j.tcs.2012.02.006 .
↑ Gagie, Travis; Kärkkäinen, Juha; Navarro, Gonzalo; Simon J., Puglisi (2013). "Colored range queries and document retrieval". Theoretical Computer Science. 483: 36–50. doi: 10.1016/j.tcs.2012.08.004 .
↑ Brisaboa, Nieves R.; Ladra, Susana; Navarro, Gonzalo (2013). "DACs: Bringing direct access to variable-length codes". Information Processing & Management. 49: 392–404. doi:10.1016/j.ipm.2012.08.003. hdl: 10533/130014 .
↑ Belazzougui, Djamal; Navarro, Gonzalo; Valenzuela, Daniel (2013). "Improved compressed indexes for full-text document retrieval". Journal of Discrete Algorithms. 13: 3–13. doi: 10.1016/j.jda.2012.07.005 .
↑ "70 Historias exitosas de Innovación y Ciencia" (PDF). Ministry of Economy, Government of Chile.

External links

https://users.dcc.uchile.cl/~gnavarro/

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[acm-fellow-1] 1 2 "Global computing association names 57 fellows for outstanding contributions that propel technology today". Association for Computing Machinery. January 18, 2023. Retrieved 2023-01-18.

[:0-2] "Approximate Text Searching" (PDF).

[:1-3] Navarro, Gonzalo; Raffinot, Mathieu (2002). Flexible pattern matching in strings : practical on-line search algorithms for texts and biological sequences. Cambridge: Cambridge University Press. ISBN 0521813077. OCLC 47930721.

[:2-4] Navarro, Gonzalo (2016-09-08). Compact data structures : a practical approach. New York, NY. ISBN 9781107152380. OCLC 952389252.{{cite book}}: CS1 maint: location missing publisher (link)

[5] "ACM Distinguished Member". 2018 ACM Distinguished Members. Association for Computing Machinery.

[Oncompressing-6] 1 2 Kreft, Sebastian; Navarro, Gonzalo (2013). "On compressing and indexing repetitive sequences". Theoretical Computer Science. 483: 115–113. doi: 10.1016/j.tcs.2012.02.006 .

[ColoredRange-7] Gagie, Travis; Kärkkäinen, Juha; Navarro, Gonzalo; Simon J., Puglisi (2013). "Colored range queries and document retrieval". Theoretical Computer Science. 483: 36–50. doi: 10.1016/j.tcs.2012.08.004 .

[DAC-8] Brisaboa, Nieves R.; Ladra, Susana; Navarro, Gonzalo (2013). "DACs: Bringing direct access to variable-length codes". Information Processing & Management. 49: 392–404. doi:10.1016/j.ipm.2012.08.003. hdl: 10533/130014 .

[ImprovedCI-9] Belazzougui, Djamal; Navarro, Gonzalo; Valenzuela, Daniel (2013). "Improved compressed indexes for full-text document retrieval". Journal of Discrete Algorithms. 13: 3–13. doi: 10.1016/j.jda.2012.07.005 .

[10] "70 Historias exitosas de Innovación y Ciencia" (PDF). Ministry of Economy, Government of Chile.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

Authority control databases
International	ISNI VIAF
National	Germany United States Czech Republic Netherlands Israel
Academics	ORCID Mathematics Genealogy Project Association for Computing Machinery ResearcherID zbMATH Google Scholar DBLP MathSciNet
People	DDB
Other	IdRef

Gonzalo Navarro
Born	(1969-06-09) June 9, 1969 (age 55) Argentina
Alma mater	University of Chile National University of La Plata Escuela Superior Latinoamericana de Informática
Scientific career
Fields	Computer science Algorithms Data structures Data compression Text searching
Institutions	University of Chile
Thesis	Approximate Text Searching (1998)
Doctoral advisor	Ricardo Baeza-Yates

Website	users.dcc.uchile.cl/~gnavarro/