Lloyd's algorithm

Last updated

In electrical engineering and computer science, Lloyd's algorithm, also known as Voronoi iteration or relaxation, is an algorithm named after Stuart P. Lloyd for finding evenly spaced sets of points in subsets of Euclidean spaces and partitions of these subsets into well-shaped and uniformly sized convex cells. [1] Like the closely related k-means clustering algorithm, it repeatedly finds the centroid of each set in the partition and then re-partitions the input according to which of these centroids is closest. In this setting, the mean operation is an integral over a region of space, and the nearest centroid operation results in Voronoi diagrams.

Contents

Although the algorithm may be applied most directly to the Euclidean plane, similar algorithms may also be applied to higher-dimensional spaces or to spaces with other non-Euclidean metrics. Lloyd's algorithm can be used to construct close approximations to centroidal Voronoi tessellations of the input, [2] which can be used for quantization, dithering, and stippling. Other applications of Lloyd's algorithm include smoothing of triangle meshes in the finite element method.

Example of Lloyd's algorithm. The Voronoi diagram of the current site positions (red) at each iteration is shown. The gray circles denote the centroids of the Voronoi cells.
LloydsMethod1.svg
Iteration 1
LloydsMethod2.svg
Iteration 2
LloydsMethod3.svg
Iteration 3
LloydsMethod15.svg
Iteration 15
In the last image, the sites are very near the centroids of the Voronoi cells. A centroidal Voronoi tessellation has been found.

History

The algorithm was first proposed by Stuart P. Lloyd of Bell Labs in 1957 as a technique for pulse-code modulation. Lloyd's work became widely circulated but remained unpublished until 1982. [1] A similar algorithm was developed independently by Joel Max and published in 1960, [3] which is why the algorithm is sometimes referred as the Lloyd-Max algorithm.

Algorithm description

Lloyd's algorithm starts by an initial placement of some number k of point sites in the input domain. In mesh-smoothing applications, these would be the vertices of the mesh to be smoothed; in other applications they may be placed at random or by intersecting a uniform triangular mesh of the appropriate size with the input domain.

It then repeatedly executes the following relaxation step:

Integration and centroid computation

Because Voronoi diagram construction algorithms can be highly non-trivial, especially for inputs of dimension higher than two, the steps of calculating this diagram and finding the exact centroids of its cells may be replaced by an approximation.

Approximation

A common simplification is to employ a suitable discretization of space like a fine pixel-grid, e.g. the texture buffer in graphics hardware. Cells are materialized as pixels, labeled with their corresponding site-ID. A cell's new center is approximated by averaging the positions of all pixels assigned with the same label. Alternatively, Monte Carlo methods may be used, in which random sample points are generated according to some fixed underlying probability distribution, assigned to the closest site, and averaged to approximate the centroid for each site.

Exact computation

Although embedding in other spaces is also possible, this elaboration assumes Euclidean space using the L2 norm and discusses the two most relevant scenarios, which are two, and respectively three dimensions.

Since a Voronoi cell is of convex shape and always encloses its site, there exist trivial decompositions into easy integratable simplices:

Integration of a cell and computation of its centroid (center of mass) is now given as a weighted combination of its simplices' centroids (in the following called ).

For a 2D cell with n triangular simplices and an accumulated area (where is the area of a triangle simplex), the new cell centroid computes as:

Analogously, for a 3D cell with a volume of (where is the volume of a tetrahedron simplex), the centroid computes as:

Convergence

Each time a relaxation step is performed, the points are left in a slightly more even distribution: closely spaced points move farther apart, and widely spaced points move closer together. In one dimension, this algorithm has been shown to converge to a centroidal Voronoi diagram, also named a centroidal Voronoi tessellation. [4] In higher dimensions, some slightly weaker convergence results are known. [5] [6]

The algorithm converges slowly or, due to limitations in numerical precision, may not converge. Therefore, real-world applications of Lloyd's algorithm typically stop once the distribution is "good enough." One common termination criterion is to stop when the maximum distance moved by any site in an iteration falls below a preset threshold. Convergence can be accelerated by over-relaxing the points, which is done by moving each point ω times the distance to the center of mass, typically using a value slightly less than 2 for ω. [7]

Applications

Lloyd's method was originally used for scalar quantization, but it is clear that the method extends for vector quantization as well. As such, it is extensively used in data compression techniques in information theory. Lloyd's method is used in computer graphics because the resulting distribution has blue noise characteristics (see also Colors of noise), meaning there are few low-frequency components that could be interpreted as artifacts. It is particularly well-suited to picking sample positions for dithering. Lloyd's algorithm is also used to generate dot drawings in the style of stippling. [8] In this application, the centroids can be weighted based on a reference image to produce stipple illustrations matching an input image. [9]

In the finite element method, an input domain with a complex geometry is partitioned into elements with simpler shapes; for instance, two-dimensional domains (either subsets of the Euclidean plane or surfaces in three dimensions) are often partitioned into triangles. It is important for the convergence of the finite element methods that these elements be well shaped; in the case of triangles, often elements that are nearly equilateral triangles are preferred. Lloyd's algorithm can be used to smooth a mesh generated by some other algorithm, moving its vertices and changing the connection pattern among its elements in order to produce triangles that are more closely equilateral. [10] These applications typically use a smaller number of iterations of Lloyd's algorithm, stopping it to convergence, in order to preserve other features of the mesh such as differences in element size in different parts of the mesh. In contrast to a different smoothing method, Laplacian smoothing (in which mesh vertices are moved to the average of their neighbors' positions), Lloyd's algorithm can change the topology of the mesh, leading to more nearly equilateral elements as well as avoiding the problems with tangling that can arise with Laplacian smoothing. However, Laplacian smoothing can be applied more generally to meshes with non-triangular elements.

Different distances

Lloyd's algorithm is usually used in a Euclidean space. The Euclidean distance plays two roles in the algorithm: it is used to define the Voronoi cells, but it also corresponds to the choice of the centroid as the representative point of each cell, since the centroid is the point that minimizes the average squared Euclidean distance to the points in its cell. Alternative distances, and alternative central points than the centroid, may be used instead. For example, Hausner (2001) used a variant of the Manhattan metric (with locally varying orientations) to find a tiling of an image by approximately square tiles whose orientation aligns with features of an image, which he used to simulate the construction of tiled mosaics. [11] In this application, despite varying the metric, Hausner continued to use centroids as the representative points of their Voronoi cells. However, for metrics that differ more significantly from Euclidean, it may be appropriate to choose the minimizer of average squared distance as the representative point, in place of the centroid. [12]

See also

Related Research Articles

<span class="mw-page-title-main">Delaunay triangulation</span> Triangulation method

In mathematics and computational geometry, a Delaunay triangulation (DT), also known as a Delone triangulation, for a given set {pi} of discrete points pi in general position is a triangulation such that no point pi is inside the circumcircle of any triangle in the DT. Delaunay triangulations maximize the minimum of all the angles of the triangles in the triangulation; they tend to avoid sliver triangles. The triangulation is named after Boris Delaunay for his work on this topic from 1934.

Vector quantization (VQ) is a classical quantization technique from signal processing that allows the modeling of probability density functions by the distribution of prototype vectors. Developed in the early 1980s by Robert M. Gray, it was originally used for data compression. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms. In simpler terms, vector quantization chooses a set of points to represent a larger set of points.

Computational geometry is a branch of computer science devoted to the study of algorithms which can be stated in terms of geometry. Some purely geometrical problems arise out of the study of computational geometric algorithms, and such problems are also considered to be part of computational geometry. While modern computational geometry is a recent development, it is one of the oldest fields of computing with a history stretching back to antiquity.

<span class="mw-page-title-main">Voronoi diagram</span> Type of plane partition

In mathematics, a Voronoi diagram is a partition of a plane into regions close to each of a given set of objects. It can be classified also as a tessellation. In the simplest case, these objects are just finitely many points in the plane. For each seed there is a corresponding region, called a Voronoi cell, consisting of all points of the plane closer to that seed than to any other. The Voronoi diagram of a set of points is dual to that set's Delaunay triangulation.

<span class="mw-page-title-main">Discrete geometry</span> Branch of geometry that studies combinatorial properties and constructive methods

Discrete geometry and combinatorial geometry are branches of geometry that study combinatorial properties and constructive methods of discrete geometric objects. Most questions in discrete geometry involve finite or discrete sets of basic geometric objects, such as points, lines, planes, circles, spheres, polygons, and so forth. The subject focuses on the combinatorial properties of these objects, such as how they intersect one another, or how they may be arranged to cover a larger object.

k-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. k-means clustering minimizes within-cluster variances, but not regular Euclidean distances, which would be the more difficult Weber problem: the mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances. For instance, better Euclidean solutions can be found using k-medians and k-medoids.

<span class="mw-page-title-main">Polygonal modeling</span> Object modeling method

In 3D computer graphics, polygonal modeling is an approach for modeling objects by representing or approximating their surfaces using polygon meshes. Polygonal modeling is well suited to scanline rendering and is therefore the method of choice for real-time computer graphics. Alternate methods of representing 3D objects include NURBS surfaces, subdivision surfaces, and equation-based representations used in ray tracers.

<span class="mw-page-title-main">Dual graph</span> Graph representing faces of another graph

In the mathematical discipline of graph theory, the dual graph of a planar graph G is a graph that has a vertex for each face of G. The dual graph has an edge for each pair of faces in G that are separated from each other by an edge, and a self-loop when the same face appears on both sides of an edge. Thus, each edge e of G has a corresponding dual edge, whose endpoints are the dual vertices corresponding to the faces on either side of e. The definition of the dual depends on the choice of embedding of the graph G, so it is a property of plane graphs rather than planar graphs. For planar graphs generally, there may be multiple dual graphs, depending on the choice of planar embedding of the graph.

<span class="mw-page-title-main">Mesh generation</span> Subdivision of space into cells

Mesh generation is the practice of creating a mesh, a subdivision of a continuous geometric space into discrete geometric and topological cells. Often these cells form a simplicial complex. Usually the cells partition the geometric input domain. Mesh cells are used as discrete local approximations of the larger domain. Meshes are created by computer algorithms, often with human guidance through a GUI, depending on the complexity of the domain and the type of mesh desired. A typical goal is to create a mesh that accurately captures the input domain geometry, with high-quality (well-shaped) cells, and without so many cells as to make subsequent calculations intractable. The mesh should also be fine in areas that are important for the subsequent calculations.

<span class="mw-page-title-main">Unstructured grid</span> Unstructured (or irregular) grid is a tessellation of a part of the Euclidean plane

An unstructured grid or irregular grid is a tessellation of a part of the Euclidean plane or Euclidean space by simple shapes, such as triangles or tetrahedra, in an irregular pattern. Grids of this type may be used in finite element analysis when the input to be analyzed has an irregular shape.

<span class="mw-page-title-main">Color quantization</span>

In computer graphics, color quantization or color image quantization is quantization applied to color spaces; it is a process that reduces the number of distinct colors used in an image, usually with the intention that the new image should be as visually similar as possible to the original image. Computer algorithms to perform color quantization on bitmaps have been studied since the 1970s. Color quantization is critical for displaying images with many colors on devices that can only display a limited number of colors, usually due to memory limitations, and enables efficient compression of certain types of images.

<span class="mw-page-title-main">24-cell honeycomb</span>

In four-dimensional Euclidean geometry, the 24-cell honeycomb, or icositetrachoric honeycomb is a regular space-filling tessellation of 4-dimensional Euclidean space by regular 24-cells. It can be represented by Schläfli symbol {3,4,3,3}.

In computational geometry, the Bowyer–Watson algorithm is a method for computing the Delaunay triangulation of a finite set of points in any number of dimensions. The algorithm can be also used to obtain a Voronoi diagram of the points, which is the dual graph of the Delaunay triangulation.

<span class="mw-page-title-main">Centroidal Voronoi tessellation</span> Voronoi tessellation where the generating point of each Voronoi cell is also its centroid

In geometry, a centroidal Voronoi tessellation (CVT) is a special type of Voronoi tessellation in which the generating point of each Voronoi cell is also its centroid. It can be viewed as an optimal partition corresponding to an optimal distribution of generators. A number of algorithms can be used to generate centroidal Voronoi tessellations, including Lloyd's algorithm for K-means clustering or Quasi-Newton methods like BFGS.

<span class="mw-page-title-main">Smallest-circle problem</span> Finding the smallest circle that contains all given points

The smallest-circle problem is a computational geometry problem of computing the smallest circle that contains all of a given set of points in the Euclidean plane. The corresponding problem in n-dimensional space, the smallest bounding sphere problem, is to compute the smallest n-sphere that contains all of a given set of points. The smallest-circle problem was initially proposed by the English mathematician James Joseph Sylvester in 1857.

<span class="mw-page-title-main">Weighted Voronoi diagram</span> Generalization of Voronoi diagrams

In mathematics, a weighted Voronoi diagram in n dimensions is a generalization of a Voronoi diagram. The Voronoi cells in a weighted Voronoi diagram are defined in terms of a distance function. The distance function may specify the usual Euclidean distance, or may be some other, special distance function. In weighted Voronoi diagrams, each site has a weight that influences the distance computation. The idea is that larger weights indicate more important sites, and such sites will get bigger Voronoi cells.

<span class="mw-page-title-main">Power diagram</span> Partition of the Euclidean plane

In computational geometry, a power diagram, also called a Laguerre–Voronoi diagram, Dirichlet cell complex, radical Voronoi tesselation or a sectional Dirichlet tesselation, is a partition of the Euclidean plane into polygonal cells defined from a set of circles. The cell for a given circle C consists of all the points for which the power distance to C is smaller than the power distance to the other circles. The power diagram is a form of generalized Voronoi diagram, and coincides with the Voronoi diagram of the circle centers in the case that all the circles have equal radii.

A mesh is a representation of a larger geometric domain by smaller discrete cells. Meshes are commonly used to compute solutions of partial differential equations and render computer graphics, and to analyze geographical and cartographic data. A mesh partitions space into elements over which the equations can be solved, which then approximates the solution over the larger domain. Element boundaries may be constrained to lie on internal or external boundaries within a model. Higher-quality (better-shaped) elements have better numerical properties, where what constitutes a "better" element depends on the general governing equations and the particular solution to the model instance.

<span class="mw-page-title-main">Farthest-first traversal</span> Sequence of points far from previous points

In computational geometry, the farthest-first traversal of a compact metric space is a sequence of points in the space, where the first point is selected arbitrarily and each successive point is as far as possible from the set of previously-selected points. The same concept can also be applied to a finite set of geometric points, by restricting the selected points to belong to the set or equivalently by considering the finite metric space generated by these points. For a finite metric space or finite set of geometric points, the resulting sequence forms a permutation of the points, also known as the greedy permutation.

References

  1. 1 2 Lloyd, Stuart P. (1982), "Least squares quantization in PCM", IEEE Transactions on Information Theory , 28 (2): 129–137, doi:10.1109/TIT.1982.1056489, S2CID   10833328 .
  2. Du, Qiang; Faber, Vance; Gunzburger, Max (1999), "Centroidal Voronoi tessellations: applications and algorithms", SIAM Review, 41 (4): 637–676, Bibcode:1999SIAMR..41..637D, doi:10.1137/S0036144599352836 .
  3. Max, Joel (1960), "Quantizing for minimum distortion", IRE Transactions on Information Theory , 6 (1): 7–12, doi:10.1109/TIT.1960.1057548 .
  4. Du, Qiang; Emelianenko, Maria; Ju, Lili (2006), "Convergence of the Lloyd algorithm for computing centroidal Voronoi tessellations", SIAM Journal on Numerical Analysis, 44: 102–119, CiteSeerX   10.1.1.591.9903 , doi:10.1137/040617364 .
  5. Sabin, M. J.; Gray, R. M. (1986), "Global convergence and empirical consistency of the generalized Lloyd algorithm", IEEE Transactions on Information Theory , 32 (2): 148–155, doi:10.1109/TIT.1986.1057168 .
  6. Emelianenko, Maria; Ju, Lili; Rand, Alexander (2009), "Nondegeneracy and Weak Global Convergence of the Lloyd Algorithm in Rd", SIAM Journal on Numerical Analysis, 46: 1423–1441, doi:10.1137/070691334 .
  7. Xiao, Xiao. "Over-relaxation Lloyd method for computing centroidal Voronoi tessellations." (2010).
  8. Deussen, Oliver; Hiller, Stefan; van Overveld, Cornelius; Strothotte, Thomas (2000), "Floating points: a method for computing stipple drawings", Computer Graphics Forum, 19 (3): 41–50, CiteSeerX   10.1.1.233.5810 , doi:10.1111/1467-8659.00396, S2CID   142991, Proceedings of Eurographics .
  9. Secord, Adrian (2002), "Weighted Voronoi stippling", Proceedings of the Symposium on Non-Photorealistic Animation and Rendering (NPAR), ACM SIGGRAPH, pp. 37–43, doi:10.1145/508530.508537, S2CID   12153589 .
  10. Du, Qiang; Gunzburger, Max (2002), "Grid generation and optimization based on centroidal Voronoi tessellations", Applied Mathematics and Computation, 133 (2–3): 591–607, CiteSeerX   10.1.1.324.5020 , doi:10.1016/S0096-3003(01)00260-0 .
  11. Hausner, Alejo (2001), "Simulating decorative mosaics", Proceedings of the 28th annual conference on Computer graphics and interactive techniques, ACM SIGGRAPH, pp. 573–580, doi:10.1145/383259.383327, S2CID   7188986 .
  12. Dickerson, Matthew T.; Eppstein, David; Wortman, Kevin A. (2010), "Planar Voronoi diagrams for sums of convex functions, smoothed distance and dilation", Proc. 7th International Symposium on Voronoi Diagrams in Science and Engineering (ISVD 2010), pp. 13–22, arXiv: 0812.0607 , doi:10.1109/ISVD.2010.12, S2CID   15971504 .