This article has multiple issues. Please help improve it or discuss these issues on the talk page . (Learn how and when to remove these template messages)
|
Rasmus Pagh | |
---|---|
Born | 6 February 1975 49) Denmark | (age
Alma mater |
|
Awards | |
Scientific career | |
Fields | |
Institutions | |
Thesis | Hashing, randomness and dictionaries |
Doctoral advisor | Peter Bro Miltersen |
Website | rasmuspagh |
Rasmus Pagh is a Danish computer scientist and a professor of computer science at the University of Copenhagen. His main work is in algorithms and data structures, and he is particularly known for the cuckoo hashing algorithm and for co-founding the Basic Algorithms Research Center, BARC, in Copenhagen.
Rasmus Pagh was born in Copenhagen, [1] but soon after his family moved to Esbjerg in western Denmark. He went to high school at Rødkilde Amtsgymnasium where he participated in the "JP Forsker" science competition, and in the "Georg Mohr" mathematics competition. After graduating in 1994, he went to study mathematics and computer science at Aarhus University. In 1998 he started his PhD with Peter Bro Miltersen and started writing articles about hashing and efficient dictionaries, culminating in his work on cuckoo hashing. Soon after his thesis defence was in the fall of 2002 he became an assistant professor at the recently founded IT University of Copenhagen.
In 2007, Rasmus founded the Scalable Query Evaluation for Reliable Databases (SQERD) project. The project aimed at applying modern algorithmic techniques to problems arising in database management systems in connection with the evaluation of queries. From 2011-2015, he ran the MaDaMS project, which partnered with Demetra A/S, Aarhus University and Apptus AB at finding more efficient approaches to data mining. [2]
Rasmus Pagh was made full professor at ITU with his Inaugural Lecture in 2013. [3] In 2014, he received an ERC Consolidator Grant for a project on Scalable Similarity Search. [4] [5] The project resulted in many new algorithms, including a way to prevent false negatives in high dimensional search. [6] In 2017 Pagh co-founded the Basic Algorithms Research Center, BARC, in Copenhagen [7] with Mikkel Thorup, Thore Husfeldt and Stephen Alstrup. Soon thereafter he took a sabbatical to join the Simons Institute at University of California, Berkeley [8] and become a Google visiting scholar. [9]
In 2019, Rasmus Pagh became an Associate Editor of the SIAM Journal on Computing . [10]
In 2020, Rasmus Pagh received the European Symposium on Algorithms Test-of-Time award for his 2001 work on cuckoo hashing with Flemming Friche Rodler. [11]
In computing, a hash table is a data structure that implements an associative array, also called a dictionary or simply map; an associative array is an abstract data type that maps keys to values. A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored. A map implemented by a hash table is called a hash map.
In computer science, a perfect hash functionh for a set S is a hash function that maps distinct elements in S to a set of m integers, with no collisions. In mathematical terms, it is an injective function.
A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". Elements can be added to the set, but not removed ; the more items added, the larger the probability of false positives.
Linear probing is a scheme in computer programming for resolving collisions in hash tables, data structures for maintaining a collection of key–value pairs and looking up the value associated with a given key. It was invented in 1954 by Gene Amdahl, Elaine M. McGraw, and Arthur Samuel and first analyzed in 1963 by Donald Knuth.
In computer programming, primary clustering is a phenomenon that causes performance degradation in linear-probing hash tables. The phenomenon states that, as elements are added to a linear probing hash table, they have a tendency to cluster together into long runs. If the hash table is at a load factor of for some parameter , then the expected length of the run containing a given element is . This causes insertions and negative queries to take expected time in a linear-probing hash table.
Jeffrey Scott Vitter is a U.S. computer scientist and academic administrator. Born in 1955 in New Orleans, Vitter has served in several senior higher education administration posts. He is a former chancellor of the University of Mississippi. He assumed the chancellor position on January 1, 2016. His formal investiture to the chancellorship took place on November 10, 2016, at the University of Mississippi's Oxford Campus.
Andrei Zary Broder is a distinguished scientist at Google. Previously, he was a research fellow and vice president of computational advertising for Yahoo!, and before that, the vice president of research for AltaVista. He has also worked for IBM Research as a distinguished engineer and was CTO of IBM's Institute for Search and Text Analysis.
Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table, with worst-case constant lookup time. The name derives from the behavior of some species of cuckoo, where the cuckoo chick pushes the other eggs or young out of the nest when it hatches in a variation of the behavior referred to as brood parasitism; analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location in the table.
Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest to a given point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values.
In computer science, a succinct data structure is a data structure which uses an amount of space that is "close" to the information-theoretic lower bound, but still allows for efficient query operations. The concept was originally introduced by Jacobson to encode bit vectors, (unlabeled) trees, and planar graphs. Unlike general lossless data compression algorithms, succinct data structures retain the ability to use them in-place, without decompressing them first. A related notion is that of a compressed data structure, insofar as the size of the stored or encoded data similarly depends upon the specific content of the data itself.
In computer science, locality-sensitive hashing (LSH) is a fuzzy hashing technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.
Kurt Mehlhorn is a German theoretical computer scientist. He has been a vice president of the Max Planck Society and is director of the Max Planck Institute for Computer Science.
WADS, the Algorithms and Data Structures Symposium, is an international academic conference in the field of computer science, focusing on algorithms and data structures. WADS is held every second year, usually in Canada and always in North America. It is held in alternation with its sister conference, the Scandinavian Symposium and Workshops on Algorithm Theory (SWAT), which is usually held in Scandinavia and always in Northern Europe. Historically, the proceedings of both conferences were published by Springer Verlag through their Lecture Notes in Computer Science series. Springer continues to publish WADS proceedings, but starting in 2016, SWAT proceedings are now published by Dagstuhl through their Leibniz International Proceedings in Informatics.
Michael David Mitzenmacher is an American computer scientist working in algorithms. He is Professor of Computer Science at the Harvard John A. Paulson School of Engineering and Applied Sciences and was area dean of computer science July 2010 to June 2013. He also runs My Biased Coin, a blog about theoretical computer science.
In computer science, a family of hash functions is said to be k-independent, k-wise independent or k-universal if selecting a function at random from the family guarantees that the hash codes of any designated k keys are independent random variables. Such families allow good average case performance in randomized algorithms or data structures, even if the input data is chosen by an adversary. The trade-offs between the degree of independence and the efficiency of evaluating the hash function are well studied, and many k-independent families have been proposed.
In computer science, tabulation hashing is a method for constructing universal families of hash functions by combining table lookup with exclusive or operations. It was first studied in the form of Zobrist hashing for computer games; later work by Carter and Wegman extended this method to arbitrary fixed-length keys. Generalizations of tabulation hashing have also been developed that can handle variable-length keys such as text strings.
Mikkel Thorup is a Danish computer scientist working at University of Copenhagen. He completed his undergraduate education at Technical University of Denmark and his doctoral studies at Oxford University in 1993. From 1993 to 1998, he was at University of Copenhagen and from 1998 to 2013 he was at AT&T Labs-Research in New Jersey. Since 2013 he has been at the University of Copenhagen as a Professor and Head of Center for Efficient Algorithms and Data Structures (EADS).
A quotient filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. A query will elicit a reply specifying either that the element is definitely not in the set or that the element is probably in the set. The former result is definitive; i.e., the test does not generate false negatives. But with the latter result there is some probability, ε, of the test returning "element is in the set" when in fact the element is not present in the set. There is a tradeoff between ε, the false positive rate, and storage size; increasing the filter's storage size reduces ε. Other AMQ operations include "insert" and "optionally delete". The more elements are added to the set, the larger the probability of false positives.
A cuckoo filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set, like a Bloom filter does. False positive matches are possible, but false negatives are not – in other words, a query returns either "possibly in set" or "definitely not in set". A cuckoo filter can also delete existing items, which is not supported by Bloom filters. In addition, for applications that store many items and target moderately low false positive rates, cuckoo filters can achieve lower space overhead than space-optimized Bloom filters.
In computer science, a retrieval data structure, also known as static function, is a space-efficient dictionary-like data type composed of a collection of pairs that allows the following operations: