Geometric discrepancy

Last updated

Geometric discrepancy theory [1] is a sub-field of discrepancy theory, that deals with balancing geometric sets, such as intervals or rectangles. The general research question in this field is: given a set of points in a geometric space, and a set of objects in the same space, can we color each point in one of two different colors (e.g. black and white), such that each object contains roughly the same number of points of each color?

Contents

Formally, the discrepancy of an object is defined as the difference between the number of white points and the number of black points in that object; the objective is to color the points such that the maximum discrepancy of an object is as small as possible.

Intervals

In the simplest geometric discrepancy setting, the set of objects is the set of all sub-intervals of the real interval [0,1]. In this setting, it is possible to attain discrepancy 1: simply color the points alternately black - white - black - white - etc. Then, the discrepancy of every interval is either 0 or 1.

The problem becomes more challenging when the points are not available in advance, but arrive one by one, and each point should be colored immediately when it arrives. This setting is called the "Online Interval Discrepancy". Jiang, Kulkarni and Singla prove that: [2] :Sec.3.2

Their proof uses a reduction to the problem of Online Tree Balancing, which is a problem of discrepancy in which the set of objects is the set of sub-trees of a complete m-ary tree with height h. For this problem, they prove that, if for a sufficiently large constant C, and m ≥ 100, then there is an online algorithm that attains discrepancy . [2] :Sec.2

Rectangles and boxes

Tusnady asked what is the discrepancy when the set of objects is the set of axes-parallel rectangles contained in the unit square.

When the set of objects is the set of all rectangles (possibly rotated), then:

Matousek [5] studied the d-dimensional extension of Tusnady's problem. Improving previous results by Roth, Schmidt, Beck, Bohus, and Srinivasan, he proved an upper bound of with a simple proof.

Stripes

When the set of objects is the set of stripes—rectangles of the form [a,b]x[0,1] and [0,1]x[a,b], the setting is equivalent to the problem of "two permutations": given two permutations on a set of n elements, we should color each element either black or white, such that the discrepancy in each interval of each permutation is minimized (the two permutations are the order of the x coordinates and the order of the y coordinates of the points).

Jiang, Kulkarni and Singla [2] study the online setting with stochastic point arrival, and prove that:

Convex polytopes

Matousek [5] and Nikolov [4] studied a more general setting, where the set of objects is induced by dilations and translations of a fixed convex polytope. He proved upper and lower bounds on the discrepancy. The results are analogous to the results for rectangles and boxes.

Half-spaces

When the set of objects is the set of half-spaces in the Euclidean d-dimensional space:

Related Research Articles

In theoretical computer science, communication complexity studies the amount of communication required to solve a problem when the input to the problem is distributed among two or more parties. The study of communication complexity was first introduced by Andrew Yao in 1979, while studying the problem of computation distributed among several machines. The problem is usually stated as follows: two parties each receive a -bit string and . The goal is for Alice to compute the value of a certain function, , that depends on both and , with the least amount of communication between them.

In quantum computing, Grover's algorithm, also known as the quantum search algorithm, is a quantum algorithm for unstructured search that finds with high probability the unique input to a black box function that produces a particular output value, using just evaluations of the function, where is the size of the function's domain. It was devised by Lov Grover in 1996.

In Vapnik–Chervonenkis theory, the Vapnik–Chervonenkis (VC) dimension is a measure of the size of a class of sets. The notion can be extended to classes of binary functions. It is defined as the cardinality of the largest set of points that the algorithm can shatter, which means the algorithm can always learn a perfect classifier for any labeling of at least one configuration of those data points. It was originally defined by Vladimir Vapnik and Alexey Chervonenkis.

In mathematics, a low-discrepancy sequence is a sequence with the property that for all values of N, its subsequence x1, ..., xN has a low discrepancy.

<span class="mw-page-title-main">Quasi-Monte Carlo method</span> Numerical integration process

In numerical analysis, the quasi-Monte Carlo method is a method for numerical integration and solving some other problems using low-discrepancy sequences to achieve variance reduction. This is in contrast to the regular Monte Carlo method or Monte Carlo integration, which are based on sequences of pseudorandom numbers.

<span class="mw-page-title-main">Euclidean minimum spanning tree</span> Shortest network connecting points

A Euclidean minimum spanning tree of a finite set of points in the Euclidean plane or higher-dimensional Euclidean space connects the points by a system of line segments with the points as endpoints, minimizing the total length of the segments. In it, any two points can reach each other along a path through the line segments. It can be found as the minimum spanning tree of a complete graph with the points as vertices and the Euclidean distances between points as edge weights.

<span class="mw-page-title-main">Guillotine cutting</span> Process of producing small rectangular items of fixed dimensions

Guillotine cutting is the process of producing small rectangular items of fixed dimensions from a given large rectangular sheet, using only guillotine-cuts. A guillotine-cut is a straight bisecting line going from one edge of an existing rectangle to the opposite edge, similarly to a paper guillotine.

<span class="mw-page-title-main">K-set (geometry)</span> Points separated from others by a line

In discrete geometry, a -set of a finite point set in the Euclidean plane is a subset of elements of that can be strictly separated from the remaining points by a line. More generally, in Euclidean space of higher dimensions, a -set of a finite point set is a subset of elements that can be separated from the remaining points by a hyperplane. In particular, when , the line or hyperplane that separates a -set from the rest of is a halving line or halving plane.

<span class="mw-page-title-main">Range searching</span>

In computer science, the range searching problem consists of processing a set S of objects, in order to determine which objects from S intersect with a query object, called the range. For example, if S is a set of points corresponding to the coordinates of several cities, find the subset of cities within a given range of latitudes and longitudes.

<span class="mw-page-title-main">Boxicity</span> Smallest dimension where a graph can be represented as an intersection graph of boxes

In graph theory, boxicity is a graph invariant, introduced by Fred S. Roberts in 1969.

Discrepancy of hypergraphs is an area of discrepancy theory that studies the discrepancy of general set systems.

<span class="mw-page-title-main">Random geometric graph</span> In graph theory, the mathematically simplest spatial network

In graph theory, a random geometric graph (RGG) is the mathematically simplest spatial network, namely an undirected graph constructed by randomly placing N nodes in some metric space and connecting two nodes by a link if and only if their distance is in a given range, e.g. smaller than a certain neighborhood radius, r.

<span class="mw-page-title-main">Decision tree model</span> Model of computational complexity

In computational complexity the decision tree model is the model of computation in which an algorithm is considered to be basically a decision tree, i.e., a sequence of queries or tests that are done adaptively, so the outcome of previous tests can influence the tests performed next.

In discrete geometry, the Erdős distinct distances problem states that every set of points in the plane has a nearly-linear number of distinct distances. It was posed by Paul Erdős in 1946 and almost proven by Larry Guth and Nets Katz in 2015.

A geometric separator is a line that partitions a collection of geometric shapes into two subsets, such that proportion of shapes in each subset is bounded, and the number of shapes that do not belong to any subset is small.

In computational geometry, a maximum disjoint set (MDS) is a largest set of non-overlapping geometric shapes selected from a given set of candidate shapes.

In geometry, a partition of a polygon is a set of primitive units, which do not overlap and whose union equals the polygon. A polygon partition problem is a problem of finding a partition which is minimal in some sense, for example a partition with a smallest number of units or with units of smallest total side-length.

The geometric set cover problem is the special case of the set cover problem in geometric settings. The input is a range space where is a universe of points in and is a family of subsets of called ranges, defined by the intersection of and geometric shapes such as disks and axis-parallel rectangles. The goal is to select a minimum-size subset of ranges such that every point in the universe is covered by some range in .

Online fair division is a class of fair division problems in which the resources, or the people to whom they should be allocated, or both, are not all available when the allocation decision is made. Some situations in which not all resources are available include:

Discrepancy of permutations is a sub-field of discrepancy theory, that deals with balancing intervals induced by permutations of elements. There is a set of n elements, and there are m different permutations on this set. The general research question is: can we color each element in one of two different colors, such that in each permutation, every interval contains roughly the same number of elements of each color?

References

  1. Matoušek, Jiří (1999). Geometric Discrepancy: An Illustrated Guide. Springer. ISBN   3-540-65528-X.
  2. 1 2 3 4 Jiang, Haotian; Kulkarni, Janardhan; Singla, Sahil (2019-10-02), Online Geometric Discrepancy for Stochastic Arrivals with Applications to Envy Minimization, doi:10.48550/arXiv.1910.01073 , retrieved 2024-07-21
  3. 1 2 Beck, József (1981-12-01). "Balanced two-colorings of finite sets in the square I". Combinatorica. 1 (4): 327–335. doi:10.1007/BF02579453. ISSN   1439-6912.
  4. 1 2 Nikolov, Aleksandar (January 2017). "Tighter Bounds for the Discrepancy of Boxes and Polytopes". Mathematika. 63 (3): 1091–1113. doi:10.1112/S0025579317000250. ISSN   0025-5793.
  5. 1 2 3 Alexander, R. (1990-06-01). "Geometric methods in the study of irregularities of distribution". Combinatorica. 10 (2): 115–136. doi:10.1007/BF02123006. ISSN   1439-6912.
  6. Matoušek, J. (1995-06-01). "Tight upper bounds for the discrepancy of half-spaces". Discrete & Computational Geometry. 13 (3): 593–601. doi:10.1007/BF02574066. ISSN   1432-0444.