David Mount | |
---|---|
Nationality | American |
Alma mater | Purdue University |
Known for | Computational Geometry |
Scientific career | |
Fields | Computer Science |
Institutions | University of Maryland, College Park |
Doctoral advisor | Christoph Hoffmann |
Website | www |
David Mount is a professor at the University of Maryland, College Park department of computer science whose research is in computational geometry.
Mount received a B.S. in Computer Science at Purdue University in 1977 and received his Ph.D. in Computer Science at Purdue University in 1983 under the advisement of Christoph Hoffmann.
He began teaching at the University of Maryland in 1984 and is Professor in the department of Computer Science there. [1]
As a teacher, he has won the University of Maryland, College of Computer Mathematical and Physical Sciences Dean's Award for Excellence in Teaching in 2005 and 1997 as well as other teaching awards including the Hong Kong Science and Technology, School of Engineering Award for Teaching Excellence Appreciation in 2001.
Mounts's main area of research is computational geometry, which is the branch of algorithms devoted to solving problems of a geometric nature. This field includes problems from classic geometry, like the closest pair of points problem, as well as more recent applied problems, such as computer representation and modeling of curves and surfaces. In particular, Mount has worked on the k-means clustering problem, nearest neighbor search, and point location problem.
Mount has worked on developing practical algorithms for k-means clustering, a problem known to be NP-hard. The most common algorithm used is Lloyd's algorithm, which is heuristic in nature but performs well in practice. He and others later showed [2] how k-d trees could be used to speed up Lloyd's algorithm. They have implemented this algorithm, along with some additional improvements, in the software library Kmeans.
Mount has worked on the nearest neighbor and approximate nearest neighbor search problems. By allowing the algorithm to return an approximate solution to the nearest neighbor query, a significant speedup in space and time complexity can be obtained. One class of approximate algorithms takes as input the error distance, , and forms a data structure that can be stored efficiently (low space complexity) and that returns the -approximate nearest neighbor quickly (low time complexity). In co-authored work with Arya, Netanyahu, R. Silverman and A. Wu, [3] Mount showed that the approximate nearest neighbor problem could be solved efficiently in spaces of low dimension. The data structure described in that paper formed the basis of the ANN open-source library for approximate nearest neighbor searching. [4] In subsequent work, he investigated the computational complexity of approximate nearest neighbor searching. Together with co-authors Arya and Malamatos, he provided efficient space–time tradeoffs for approximate nearest neighbor searching, [5] based on a data structure called the AVD (or approximate Voronoi diagram).
Mount has also worked on point location, which involves preprocessing a planar polygonal subdivision S of size to determine the cell of a subdivision that a query point is in. [6] The paper gives an time to construct a data structure of space that when asked what cell a query point lies in, takes expected time where is the entropy of the probability distribution of which cells the query points lie in.
In addition to the design and analysis of algorithms in computational geometry, Mount has worked on the implementation of efficient algorithms in software libraries such as:
As of December 8, 2009, here is a list of his most cited works (according to Google Scholar) and their main contributions, listed in decreasing order of citations:
Mount was named to the 2022 class of ACM Fellows, "for contributions to algorithms and data structures for geometric data analysis and retrieval". [8]
Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems are also considered to be part of computational geometry. While modern computational geometry is a recent development, it is one of the oldest fields of computing with a history stretching back to antiquity.
In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. Time complexity is commonly estimated by counting the number of elementary operations performed by the algorithm, supposing that each elementary operation takes a fixed amount of time to perform. Thus, the amount of time taken and the number of elementary operations performed by the algorithm are taken to be related by a constant factor.
In quantum computing, a quantum algorithm is an algorithm that runs on a realistic model of quantum computation, the most commonly used model being the quantum circuit model of computation. A classical algorithm is a finite sequence of instructions, or a step-by-step procedure for solving a problem, where each step or instruction can be performed on a classical computer. Similarly, a quantum algorithm is a step-by-step procedure, where each of the steps can be performed on a quantum computer. Although all classical algorithms can also be performed on a quantum computer, the term quantum algorithm is generally reserved for algorithms that seem inherently quantum, or use some essential feature of quantum computation such as quantum superposition or quantum entanglement.
In mathematics, given a non-empty set of objects of finite extension in -dimensional space, for example a set of points, a bounding sphere, enclosing sphere or enclosing ball for that set is a -dimensional solid sphere containing all of these objects.
A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.
In computer science, a k-d tree is a space-partitioning data structure for organizing points in a k-dimensional space. K-dimensional is that which concerns exactly k orthogonal axes or a space of any number of dimensions. k-d trees are a useful data structure for several applications, such as:
MAX-3SAT is a problem in the computational complexity subfield of computer science. It generalises the Boolean satisfiability problem (SAT) which is a decision problem considered in complexity theory. It is defined as:
Nearest neighbor search (NNS), as a form of proximity search, is the optimization problem of finding the point in a given set that is closest to a given point. Closeness is typically expressed in terms of a dissimilarity function: the less similar the objects, the larger the function values.
The closest pair of points problem or closest pair problem is a problem of computational geometry: given points in metric space, find a pair of points with the smallest distance between them. The closest pair problem for points in the Euclidean plane was among the first geometric problems that were treated at the origins of the systematic study of the computational complexity of geometric algorithms.
In computer science, the range searching problem consists of processing a set S of objects, in order to determine which objects from S intersect with a query object, called the range. For example, if S is a set of points corresponding to the coordinates of several cities, find the subset of cities within a given range of latitudes and longitudes.
In computer science, locality-sensitive hashing (LSH) is a fuzzy hashing technique that hashes similar input items into the same "buckets" with high probability. Since similar items end up in the same buckets, this technique can be used for data clustering and nearest neighbor search. It differs from conventional hashing techniques in that hash collisions are maximized, not minimized. Alternatively, the technique can be seen as a way to reduce the dimensionality of high-dimensional data; high-dimensional input items can be reduced to low-dimensional versions while preserving relative distances between items.
The nearest neighbor graph (NNG) is a directed graph defined for a set of points in a metric space, such as the Euclidean distance in the plane. The NNG has a vertex for each point, and a directed edge from p to q whenever q is a nearest neighbor of p, a point whose distance from p is minimum among all the given points other than p itself.
Density-based spatial clustering of applications with noise (DBSCAN) is a data clustering algorithm proposed by Martin Ester, Hans-Peter Kriegel, Jörg Sander, and Xiaowei Xu in 1996. It is a density-based clustering non-parametric algorithm: given a set of points in some space, it groups together points that are closely packed, and marks as outliers points that lie alone in low-density regions . DBSCAN is one of the most commonly used and cited clustering algorithms.
In mathematics, the Johnson–Lindenstrauss lemma is a result named after William B. Johnson and Joram Lindenstrauss concerning low-distortion embeddings of points from high-dimensional into low-dimensional Euclidean space. The lemma states that a set of points in a high-dimensional space can be embedded into a space of much lower dimension in such a way that distances between the points are nearly preserved. In the classical proof of the lemma, the embedding is a random orthogonal projection.
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data. It was presented by Mihael Ankerst, Markus M. Breunig, Hans-Peter Kriegel and Jörg Sander. Its basic idea is similar to DBSCAN, but it addresses one of DBSCAN's major weaknesses: the problem of detecting meaningful clusters in data of varying density. To do so, the points of the database are (linearly) ordered such that spatially closest points become neighbors in the ordering. Additionally, a special distance is stored for each point that represents the density that must be accepted for a cluster so that both points belong to the same cluster. This is represented as a dendrogram.
In computational geometry, an ε-net is the approximation of a general set by a collection of simpler subsets. In probability theory it is the approximation of one probability distribution by another.
In computer science and data mining, MinHash is a technique for quickly estimating how similar two sets are. The scheme was published by Andrei Broder in a 1997 conference, and initially used in the AltaVista search engine to detect duplicate web pages and eliminate them from search results. It has also been applied in large-scale clustering problems, such as clustering documents by the similarity of their sets of words.
In graph theory and computer science, a dense subgraph is a subgraph with many edges per vertex. This is formalized as follows: let G = (V, E) be an undirected graph and let S = (VS, ES) be a subgraph of G. Then the density of S is defined to be:
In computer science, the cell-probe model is a model of computation similar to the random-access machine, except that all operations are free except memory access. This model is useful for proving lower bounds of algorithms for data structure problems.
In computer science, a retrieval data structure, also known as static function, is a space-efficient dictionary-like data type composed of a collection of pairs that allows the following operations: