Alignments of random points

80 4-point near-alignments of 137 random points

The study of alignments of random points in a plane seeks to discover subsets of points that occupy an approximately straight line within a larger set of points that are randomly placed in a planar region. Studies have shown that such near-alignments occur by chance with greater frequency than one might intuitively expect.

This has been put forward as a demonstration that ley lines and other similar mysterious alignments, believed by some to be phenomena of deep significance, might exist due to chance alone, as opposed to the supernatural or anthropological explanations put forward by their proponents. The topic has also been studied in the fields of computer vision and astronomy.

A number of studies have examined the mathematics of alignment of random points on the plane. [1] [2] [3] [4] In all of these, the width of the line (the allowed displacement of the positions of the points from a perfect straight line) is important. It allows for the fact that real-world features are not mathematical points, and that their positions need not line up exactly for them to be considered in alignment. Alfred Watkins, in his classic work on ley lines, The Old Straight Track, used the width of a pencil line on a map as the threshold for the tolerance of what might be regarded as an alignment. For example, using a 1 mm pencil line to draw alignments on a 1:50,000 scale Ordnance Survey map, the corresponding width on the ground would be 50 m. [5]

Estimate of probability of chance alignments

Contrary to intuition, finding alignments between randomly placed points on a landscape gets progressively easier as the geographic area to be considered increases. One way of understanding this phenomenon is to see that the increase in the number of possible combinations of sets of points in that area overwhelms the decrease in the probability that any given set of points in that area lines up.

One definition which expresses the generally accepted meaning of "alignment" is:

A set of points, chosen from a given set of landmark points, all of which lie within at least one straight path of a given width

More precisely, a path of width $w$ may be defined as the set of all points within a distance of $w/2$ of a straight line on a plane, or a great circle on a sphere, or in general any geodesic on any other kind of manifold. Note that, in general, any given set of points that are aligned in this way will lie within a large number of infinitesimally different straight paths. Therefore, only the existence of at least one straight path is necessary to determine whether a set of points is an alignment. For this reason, it is easier to count the sets of points, rather than the paths themselves. The number of alignments found is very sensitive to the allowed width $w$, increasing approximately in proportion to $w^{k-2}$, where $k$ is the number of points in an alignment.
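The definition above can be tested directly on a small point set. The following is a minimal sketch, assuming Euclidean coordinates on a plane (not a sphere or other manifold); the function names `min_strip_width` and `is_alignment` are illustrative and do not come from the cited studies. It relies on the fact that the narrowest strip enclosing a finite point set has one edge passing through two of the points, so trying the direction of every pair is sufficient.

```python
import itertools
import math


def min_strip_width(points):
    """Width of the narrowest straight strip that contains every point.

    Brute force: for the direction defined by each pair of points, measure
    the spread of perpendicular offsets, and keep the smallest spread.
    """
    pts = list(points)
    if len(pts) < 3:
        return 0.0  # one or two points always fit on a single line
    best = math.inf
    for (x1, y1), (x2, y2) in itertools.combinations(pts, 2):
        dx, dy = x2 - x1, y2 - y1
        length = math.hypot(dx, dy)
        if length == 0:
            continue  # coincident points define no direction
        # Signed perpendicular offset of each point from the line through the pair.
        offsets = [((x - x1) * dy - (y - y1) * dx) / length for x, y in pts]
        best = min(best, max(offsets) - min(offsets))
    # If every point coincides, no direction was tried and the width is zero.
    return 0.0 if best == math.inf else best


def is_alignment(points, w):
    """True if all points lie within at least one straight path of width w."""
    return min_strip_width(points) <= w
```

This runs in time proportional to the cube of the number of points in the set, which is acceptable for the small subsets (a handful of points) considered here.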

The following is a very approximate order-of-magnitude estimate of the likelihood of alignments, assuming a plane covered with uniformly distributed "significant" points.

Consider a set of $n$ points in a compact area with approximate diameter $L$ and area approximately $L^2$. Consider a valid line to be one where every point is within distance $w/2$ of the line (that is, lies on a track of width $w$, where $w \ll L$).

Consider all the unordered sets of $k$ points from the $n$ points, of which there are:

$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}$$

(see factorial and binomial coefficient for notation).

To make a rough estimate of the probability that any given subset of $k$ points is approximately collinear in the way defined above, consider the line between the "leftmost" and "rightmost" two points in that subset (for some arbitrary left/right axis: we can choose top and bottom for the exceptional vertical case). A straight line can trivially be drawn through those two points. For each of the remaining $k-2$ points in the subset, the probability that the point is "near enough" to the line is roughly $w/L$, which can be seen by considering the ratio of the area of the line tolerance zone (roughly $wL$) and the overall area (roughly $L^2$).

So, based on the approximate estimates above, we can estimate the expected number of $k$-point alignments in the overall set to be very roughly equal to

$$\frac{n!}{(n-k)!\,k!}\left(\frac{w}{L}\right)^{k-2}$$

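The rough estimate (the number of $k$-point subsets multiplied by $(w/L)^{k-2}$) is easy to evaluate numerically. The sketch below is illustrative only: the function name `rough_expected_alignments` and the sample figures in the usage lines are assumptions chosen for demonstration, not values from the cited studies.

```python
from math import comb


def rough_expected_alignments(n, k, w, L):
    """Order-of-magnitude estimate: C(n, k) * (w / L) ** (k - 2).

    n: number of points, k: points per alignment,
    w: allowed path width, L: approximate diameter of the region
    (w and L in the same units).
    """
    if k < 2:
        raise ValueError("an alignment needs at least two points")
    return comb(n, k) * (w / L) ** (k - 2)


# Hypothetical figures: 100 landmarks on a 20 km square, 50 m path width.
small_map = rough_expected_alignments(n=100, k=4, w=50, L=20_000)
# Same density on a 40 km square: four times the area, four times the points.
large_map = rough_expected_alignments(n=400, k=4, w=50, L=40_000)
print(small_map, large_map, large_map / small_map)
```

Comparing the two calls shows the estimate growing by far more than the factor of four by which the area grows.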
Among other things, this can be used to show that, contrary to intuition, the number of $k$-point lines expected from random chance in a plane covered with points at a given density, for a given line width, increases much faster than linearly with the size of the area considered. The combinatorial explosion in the number of possible combinations of points more than makes up for the increased difficulty of any given combination lining up.
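The growth rate can be made explicit with a short calculation. This is a sketch under two added assumptions: a fixed point density $\rho$ (a symbol introduced here), so that $n \approx \rho L^2$, and $n \gg k$, so that the binomial coefficient is approximately $n^k / k!$. Substituting into the rough estimate gives

$$\frac{n^k}{k!}\left(\frac{w}{L}\right)^{k-2} \approx \frac{(\rho L^2)^k}{k!}\cdot\frac{w^{k-2}}{L^{k-2}} = \frac{\rho^k\, w^{k-2}}{k!}\, L^{k+2}$$

Writing $A = L^2$ for the area, the expected number of $k$-point alignments therefore scales roughly as $A^{(k+2)/2}$: approximately $A^{2.5}$ for 3-point alignments and $A^{3}$ for 4-point alignments, compared with the linear growth in $A$ that intuition might suggest.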

More precise estimate of expected number of alignments

Using a similar, but more careful analysis, we can find a more precise expression for the number of 3-point alignments of maximum width $w$ and maximum length $d$ expected by chance among $n$ points placed randomly on a square of side $L$ as [2]

$$\mu = \frac{\pi}{3}\,\frac{w}{L}\left(\frac{d}{L}\right)^{3} n(n-1)(n-2)$$

If we set $d \approx L$ and $k = 3$, we can see that this makes the same prediction as the rough estimate above, up to a constant factor.

If edge effects (alignments lost over the boundaries of the square) are included, then the expression becomes

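A generalisation to $k$-point alignments (ignoring edge effects) is [3]

$$\mu = \frac{\pi\, n(n-1)(n-2)\cdots(n-(k-1))}{k\,(k-2)!}\left(\frac{w}{L}\right)^{k-2}\left(\frac{d}{L}\right)^{k}$$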

which has asymptotic scaling properties roughly similar to those of the crude approximation in the previous section, for the same reason: the combinatorial explosion for large $n$ overwhelms the effects of the other variables.
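As a numerical sketch of the expressions without edge effects, the code below assumes the 3-point form $\mu = \frac{\pi}{3}\frac{w}{L}\left(\frac{d}{L}\right)^3 n(n-1)(n-2)$ and the $k$-point form $\mu = \frac{\pi\, n(n-1)\cdots(n-(k-1))}{k\,(k-2)!}\left(\frac{w}{L}\right)^{k-2}\left(\frac{d}{L}\right)^k$. The function names are illustrative, and the edge-effect correction is not implemented.

```python
import math


def expected_three_point(n, w, d, L):
    """Assumed 3-point form: (pi/3) (w/L) (d/L)^3 n (n-1) (n-2)."""
    return (math.pi / 3) * (w / L) * (d / L) ** 3 * n * (n - 1) * (n - 2)


def expected_k_point(n, k, w, d, L):
    """Assumed k-point form, ignoring edge effects (requires k >= 3)."""
    if k < 3:
        raise ValueError("k must be at least 3")
    # Falling factorial n (n-1) ... (n-(k-1)).
    falling = math.prod(n - i for i in range(k))
    prefactor = math.pi * falling / (k * math.factorial(k - 2))
    return prefactor * (w / L) ** (k - 2) * (d / L) ** k


def rough_estimate(n, k, w, L):
    """Crude estimate from the previous section, for comparison."""
    return math.comb(n, k) * (w / L) ** (k - 2)


# With d set equal to L and k = 3, the ratio of the two estimates is a constant.
ratio = expected_three_point(100, 50, 20_000, 20_000) / rough_estimate(100, 3, 50, 20_000)
print(ratio, 2 * math.pi)
```

Under these assumed forms, setting $k = 3$ in the general expression reproduces the 3-point one, and for $d = L$ the 3-point expression differs from the crude estimate by the constant factor $2\pi$, independent of $n$, $w$ and $L$.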

Computer simulation of alignments

607 4-point alignments of 269 random points

Computer simulations show that points on a plane tend to form alignments similar to those found by ley hunters in numbers consistent with the order-of-magnitude estimates above, suggesting that ley lines may also be generated by chance. This phenomenon occurs regardless of whether the points are generated pseudo-randomly by computer, or from data sets of mundane features such as pizza restaurants or telephone booths.

On a map with a width of tens of kilometers, it is easy to find alignments of 4 to 8 points even in relatively small sets of features with w = 50 m. Choosing larger areas, denser sets of features, or larger values of w makes it easy to find alignments of 20 or more points.
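A simulation of this kind can be sketched in a few lines. The version below is a minimal illustration, not the method of the cited studies: it places points uniformly at random on a square, counts 3-point alignments only (to keep the brute-force search fast), and all function names and parameter values are assumptions. Three points fit in a strip of width $w$ exactly when the triangle's shortest altitude, which is the altitude onto its longest side, is at most $w$.

```python
import itertools
import math
import random


def triangle_width(p, q, r):
    """Narrowest strip containing three points: the altitude onto the longest side."""
    # Twice the triangle's area, from the cross product of two edge vectors.
    area2 = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))
    longest = max(math.dist(p, q), math.dist(q, r), math.dist(p, r))
    return area2 / longest if longest > 0 else 0.0


def count_three_point_alignments(points, w):
    """Number of unordered triples that fit within some straight path of width w."""
    return sum(
        1
        for p, q, r in itertools.combinations(points, 3)
        if triangle_width(p, q, r) <= w
    )


def simulate(n, w, side, trials, seed=0):
    """Average count of 3-point alignments over several random point sets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
        total += count_three_point_alignments(pts, w)
    return total / trials


# Hypothetical run: 30 points on a unit square, path width 1% of the side.
average = simulate(n=30, w=0.01, side=1.0, trials=20)
rough = math.comb(30, 3) * 0.01
print(average, rough)
```

Because this test accepts any strip of width $w$, and not only strips centred on the line through the two end points, the simulated count can be expected to exceed the rough estimate by a modest constant factor while remaining of the same order of magnitude.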

References

  1. Kendall, David G.; Kendall, Wilfrid S. (1980). "Alignments in Two-Dimensional Random Sets of Points". Advances in Applied Probability. 12 (2): 380–424. doi:10.2307/1426603. ISSN 0001-8678.
  2. Edmunds, M. G.; George, G. H. (April 1981). "Random alignment of quasars". Nature. 290 (5806): 481–483. doi:10.1038/290481a0. ISSN 1476-4687.
  3. George, G. H. (2003-08-03). "Ph.D. Thesis of Glyn George: The Alignment and Clustering of Quasars". Retrieved 2017-02-17.
  4. Lezama, José; Gioi, Rafael Grompone von; Morel, Jean-Michel; Randall, Grégory (2014-03-06). A Contrario 2D Point Alignment Detection. Retrieved 2023-09-29.
  5. Watkins, Alfred (1988). The Old Straight Track: Its Mounds, Beacons, Moats, Sites and Mark Stones. Abacus. ISBN 9780349137070.