Parameter space

The parameter space is the space of all possible parameter values that define a particular mathematical model. It is also sometimes called weight space, particularly in machine learning, where the parameters of a neural network are called weights,[3][4] and it is often a subset of finite-dimensional Euclidean space.

In statistics, parameter spaces are particularly useful for describing parametric families of probability distributions. They also form the background for parameter estimation. In the case of extremum estimators for parametric models, a certain objective function is maximized or minimized over the parameter space.[1] Theorems of existence and consistency of such estimators require some assumptions about the topology of the parameter space. For instance, compactness of the parameter space, together with continuity of the objective function, suffices for the existence of an extremum estimator.[1]
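
As an illustration (not drawn from the sources cited above), the following Python sketch implements a simple extremum estimator: the Bernoulli log-likelihood is maximized over a discretized version of the compact parameter space [0, 1]. The data, grid resolution, and seed are hypothetical choices made for the example.

```python
import numpy as np

# Illustrative extremum estimator for a Bernoulli model.
# The parameter space is the compact interval [0, 1]; the objective is the
# log-likelihood, which is continuous in p on (0, 1). Compactness plus
# continuity guarantees that a maximizer exists.

rng = np.random.default_rng(0)
data = rng.binomial(1, 0.3, size=200)          # observations from p = 0.3

p_grid = np.linspace(0.001, 0.999, 999)        # discretized parameter space
log_lik = (data.sum() * np.log(p_grid)
           + (len(data) - data.sum()) * np.log(1 - p_grid))

p_hat = p_grid[np.argmax(log_lik)]             # the extremum estimator
print(f"maximum-likelihood estimate: {p_hat:.3f}")
```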

Parameters are sometimes analyzed to see how they affect the behavior of a statistical model. In that context, they can be viewed as inputs of a function, in which case the parameter space is the domain of that function. The ranges of values of the parameters may form the axes of a plot, and particular outcomes of the model may be plotted against these axes to illustrate how different regions of the parameter space produce different types of behavior in the model.
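
As a hypothetical illustration of how regions of a parameter space correspond to qualitatively different behavior, the sketch below classifies a damped harmonic oscillator, x'' + 2ζωx' + ω²x = 0, by where its damping ratio ζ lies in the parameter space; the regime depends only on ζ, a standard fact about this equation.

```python
import numpy as np

# Hypothetical example: a damped oscillator x'' + 2*zeta*omega*x' + omega**2*x = 0.
# Its qualitative behaviour depends only on where the damping ratio zeta lies
# in the parameter space: zeta < 1 gives oscillatory (underdamped) motion,
# zeta > 1 gives non-oscillatory (overdamped) decay.

for zeta in np.linspace(0.0, 2.0, 5):   # sweep one axis of the parameter space
    if zeta < 1:
        regime = "underdamped (oscillatory)"
    elif zeta == 1:
        regime = "critically damped"
    else:
        regime = "overdamped (non-oscillatory)"
    print(f"zeta = {zeta:.1f}: {regime}")
```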

Examples

A standard example is the logistic map, x_{n+1} = r x_n (1 − x_n), whose parameter r ranges over the interval [0, 4]; that interval is the parameter space. For some values of r, this function ends up cycling around a few values or becomes fixed on one value. These long-term values can be plotted against r in a bifurcation diagram to show the different behaviours of the function for different values of r.
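
A minimal sketch of such a bifurcation diagram, assuming numpy and matplotlib are available; the plotted range of r, the transient length, and the number of recorded iterations are arbitrary illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

# Bifurcation diagram of the logistic map x_{n+1} = r * x_n * (1 - x_n),
# plotting long-term values of x against the parameter r.

r_values = np.linspace(2.5, 4.0, 2000)     # a slice of the parameter space
x = 0.5 * np.ones_like(r_values)           # one trajectory per value of r

for _ in range(500):                       # discard transient iterations
    x = r_values * x * (1 - x)

for _ in range(100):                       # record the long-term values
    x = r_values * x * (1 - x)
    plt.plot(r_values, x, ",k", alpha=0.25)

plt.xlabel("r (parameter)")
plt.ylabel("long-term values of x")
plt.show()
```
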
In complex dynamics, the parameter space is the complex plane. The famous Mandelbrot set is a subset of this parameter space, consisting of the points c for which the sequence obtained by repeatedly applying the map z ↦ z² + c, starting from z = 0, remains bounded. The remaining points, which are not in the set, give an unbounded sequence (it tends to infinity) under the same iteration.
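
The membership criterion can be tested numerically. In the sketch below, boundedness is approximated by checking whether the orbit escapes the disk of radius 2 within a fixed number of iterations; the escape radius of 2 is exact for the Mandelbrot set (once |z| > 2 the orbit provably diverges), while the iteration cap is an illustrative cutoff.

```python
# Membership test for the Mandelbrot set: a point c of the parameter space
# (the complex plane) belongs to the set if iterating z -> z**2 + c from
# z = 0 stays bounded.

def in_mandelbrot(c: complex, max_iter: int = 100) -> bool:
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:          # escape radius: the orbit is unbounded
            return False
    return True

print(in_mandelbrot(-1.0))      # True: the orbit cycles 0, -1, 0, -1, ...
print(in_mandelbrot(1.0))       # False: the orbit 0, 1, 2, 5, 26, ... escapes
```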

History

Parameter space contributed to the liberation of geometry from the confines of three-dimensional space. For instance, the parameter space of spheres in three dimensions has four dimensions: three for the coordinates of the sphere's center and one more for its radius. According to Dirk Struik, it was the book Neue Geometrie des Raumes (1849) by Julius Plücker that showed

...geometry need not solely be based on points as basic elements. Lines, planes, circles, spheres can all be used as the elements (Raumelemente) on which a geometry can be based. This fertile conception threw new light on both synthetic and algebraic geometry and created new forms of duality. The number of dimensions of a particular form of geometry could now be any positive number, depending on the number of parameters necessary to define the "element".[5]: 165

The requirement for higher dimensions is illustrated by Plücker's line geometry. Struik writes

[Plücker's] geometry of lines in three-space could be considered as a four-dimensional geometry, or, as Klein has stressed, as the geometry of a four-dimensional quadric in a five-dimensional space.[5]: 168

Thus the Klein quadric describes the parameters of lines in space.

Related Research Articles

In mathematics, analytic geometry, also known as coordinate geometry or Cartesian geometry, is the study of geometry using a coordinate system. This contrasts with synthetic geometry.

In mathematics, an equation is a mathematical formula that expresses the equality of two expressions, by connecting them with the equals sign =. The word equation and its cognates in other languages may have subtly different meanings; for example, in French an équation is defined as containing one or more variables, while in English, any well-formed formula consisting of two expressions related with an equals sign is an equation.

In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule, the quantity of interest and its result are distinguished. For example, the sample mean is a commonly used estimator of the population mean.

Supervised learning

Supervised learning (SL) is a paradigm in machine learning where input objects and a desired output value train a model. The training data is processed, building a function that maps new data to expected output values. An optimal scenario will allow the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. This statistical quality of an algorithm is measured through the so-called generalization error.

A parameter, generally, is any characteristic that can help in defining or classifying a particular system. That is, a parameter is an element of a system that is useful, or critical, when identifying the system, or when evaluating its performance, status, condition, etc.

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.
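
A minimal sketch of maximum likelihood estimation, assuming numpy and scipy are available: the mean and standard deviation of a normal model are estimated by minimizing the negative log-likelihood over the parameter space {(μ, σ) : σ > 0}. The simulated data and starting point are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
data = rng.normal(loc=2.0, scale=0.5, size=500)

def neg_log_lik(theta):
    # Negative log-likelihood of a normal distribution with parameters (mu, sigma).
    mu, sigma = theta
    return (0.5 * len(data) * np.log(2 * np.pi * sigma**2)
            + np.sum((data - mu) ** 2) / (2 * sigma**2))

result = minimize(neg_log_lik, x0=[0.0, 1.0],
                  bounds=[(None, None), (1e-6, None)])  # keep sigma positive
mu_hat, sigma_hat = result.x
print(f"mu_hat = {mu_hat:.3f}, sigma_hat = {sigma_hat:.3f}")
```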

In mathematics, a quadric or quadric surface (quadric hypersurface in higher dimensions), is a generalization of conic sections (ellipses, parabolas, and hyperbolas). It is a hypersurface (of dimension D) in a (D + 1)-dimensional space, and it is defined as the zero set of an irreducible polynomial of degree two in D + 1 variables; for example, D = 1 in the case of conic sections. When the defining polynomial is not absolutely irreducible, the zero set is generally not considered a quadric, although it is often called a degenerate quadric or a reducible quadric.

In mathematics, conformal geometry is the study of the set of angle-preserving (conformal) transformations on a space.

In Vapnik–Chervonenkis theory, the Vapnik–Chervonenkis (VC) dimension is a measure of the size of a class of sets. The notion can be extended to classes of binary functions. It is defined as the cardinality of the largest set of points that the algorithm can shatter, which means the algorithm can always learn a perfect classifier for any labeling of at least one configuration of those data points. It was originally defined by Vladimir Vapnik and Alexey Chervonenkis.
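
As an illustration (not tied to any particular source), the brute-force sketch below checks shattering for the class of closed intervals on the real line, whose VC dimension is 2: a labeling is realizable exactly when no 0-labeled point lies between two 1-labeled points.

```python
from itertools import product

def shatters(points):
    # Check whether the class of intervals [a, b] shatters the given points:
    # every one of the 2^n labelings must be realizable by some interval.
    points = sorted(points)
    for labels in product([0, 1], repeat=len(points)):
        ones = [p for p, y in zip(points, labels) if y == 1]
        if not ones:
            continue   # the empty interval realizes the all-zero labeling
        lo, hi = min(ones), max(ones)
        # Realizable iff every point inside [lo, hi] is labeled 1.
        if any(y == 0 and lo <= p <= hi for p, y in zip(points, labels)):
            return False
    return True

print(shatters([0.0, 1.0]))        # True: any two points can be shattered
print(shatters([0.0, 1.0, 2.0]))   # False: the labeling (1, 0, 1) is not realizable
```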

Nonlinear dimensionality reduction

Nonlinear dimensionality reduction, also known as manifold learning, is any of various related techniques that aim to project high-dimensional data, potentially lying on non-linear manifolds that cannot be adequately captured by linear decomposition methods, onto lower-dimensional latent manifolds, with the goal of either visualizing the data in the low-dimensional space or learning the mapping itself. These techniques can be understood as generalizations of linear decomposition methods used for dimensionality reduction, such as singular value decomposition and principal component analysis.

In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observation belongs. Formally a mixture model corresponds to the mixture distribution that represents the probability distribution of observations in the overall population. However, while problems associated with "mixture distributions" relate to deriving the properties of the overall population from those of the sub-populations, "mixture models" are used to make statistical inferences about the properties of the sub-populations given only observations on the pooled population, without sub-population identity information. Mixture models are used for clustering, under the name model-based clustering, and also for density estimation.
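
One standard way to fit such a model is the expectation–maximization (EM) algorithm. The following is a minimal, assumption-laden sketch for a two-component univariate Gaussian mixture; the simulated data, initial values, and fixed iteration count are illustrative choices, and convergence checks are omitted.

```python
import numpy as np

rng = np.random.default_rng(2)
data = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 700)])

# Parameters: mixing weight w, component means mu, standard deviations sigma.
w, mu, sigma = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])

def pdf(x, m, s):
    # Normal density, used to evaluate each component's likelihood.
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(100):
    # E-step: posterior responsibility of component 0 for each observation.
    p0 = w * pdf(data, mu[0], sigma[0])
    p1 = (1 - w) * pdf(data, mu[1], sigma[1])
    r = p0 / (p0 + p1)
    # M-step: re-estimate the parameters from the responsibilities.
    w = r.mean()
    mu = np.array([np.average(data, weights=r),
                   np.average(data, weights=1 - r)])
    sigma = np.sqrt(np.array([np.average((data - mu[0]) ** 2, weights=r),
                              np.average((data - mu[1]) ** 2, weights=1 - r)]))

print(f"weights: {w:.2f}/{1 - w:.2f}, means: {mu.round(2)}, sigmas: {sigma.round(2)}")
```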

In statistics, a parametric model or parametric family or finite-dimensional model is a particular class of statistical models. Specifically, a parametric model is a family of probability distributions that has a finite number of parameters.

Robust statistics are statistics that maintain their properties even if the underlying distributional assumptions are incorrect. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods like a t-test work poorly.
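
A small numerical illustration of this robustness, using hypothetical data: a single gross outlier shifts the sample mean substantially but barely moves the sample median.

```python
import numpy as np

data = np.array([9.8, 10.1, 10.0, 9.9, 10.2])
contaminated = np.append(data, 1000.0)           # a single gross outlier

print(np.mean(data), np.mean(contaminated))      # the mean is pulled far from 10
print(np.median(data), np.median(contaminated))  # the median stays near 10
```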

Three-dimensional space

In geometry, a three-dimensional space is a mathematical space in which three values (coordinates) are required to determine the position of a point. Most commonly, it is the three-dimensional Euclidean space, that is, the Euclidean space of dimension three, which models physical space. More general three-dimensional spaces are called 3-manifolds. The term may also refer colloquially to a subset of space, a three-dimensional region, a solid figure.

Surface (mathematics)

In mathematics, a surface is a mathematical model of the common concept of a surface. It is a generalization of a plane, but, unlike a plane, it may be curved; this is analogous to a curve generalizing a straight line.

Structural risk minimization (SRM) is an inductive principle of use in machine learning. Commonly in machine learning, a generalized model must be selected from a finite data set, with the consequent problem of overfitting – the model becoming too strongly tailored to the particularities of the training set and generalizing poorly to new data. The SRM principle addresses this problem by balancing the model's complexity against its success at fitting the training data. This principle was first set out in a 1974 book by Vladimir Vapnik and Alexey Chervonenkis and uses the VC dimension.

Lie sphere geometry

Lie sphere geometry is a geometrical theory of planar or spatial geometry in which the fundamental concept is the circle or sphere. It was introduced by Sophus Lie in the nineteenth century. The main idea which leads to Lie sphere geometry is that lines should be regarded as circles of infinite radius and that points in the plane should be regarded as circles of zero radius.

In statistics and econometrics, extremum estimators are a wide class of estimators for parametric models that are calculated through maximization of a certain objective function, which depends on the data. The general theory of extremum estimators was developed by Amemiya (1985).

Bayesian optimization is a sequential design strategy for global optimization of black-box functions that does not assume any functional form. It is usually employed to optimize expensive-to-evaluate functions. With the rise of artificial intelligence innovation in the 21st century, Bayesian optimization has found prominent use in machine learning problems for optimizing hyperparameter values.

In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process, which must be configured before the process starts.
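
A minimal sketch of hyperparameter tuning by grid search, with everything (data, model, candidate grid) hypothetical: the ridge penalty of a linear regression is chosen by evaluating each candidate value on a held-out validation split.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
y = X @ rng.normal(size=10) + rng.normal(scale=0.5, size=200)
X_tr, X_val, y_tr, y_val = X[:150], X[150:], y[:150], y[150:]

def ridge_fit(X, y, lam):
    # Closed-form ridge regression: (X'X + lam*I)^{-1} X'y.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

best_lam, best_err = None, np.inf
for lam in [0.01, 0.1, 1.0, 10.0, 100.0]:        # the hyperparameter grid
    w = ridge_fit(X_tr, y_tr, lam)
    err = np.mean((X_val @ w - y_val) ** 2)      # validation error
    if err < best_err:
        best_lam, best_err = lam, err

print(f"selected lambda = {best_lam} (validation MSE {best_err:.3f})")
```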

References

  1. Hayashi, Fumio (2000). Econometrics. Princeton University Press. p. 446. ISBN 0-691-01018-8.
  2. Gasperino, J.; Rom, W. N. (2004). "Gender and lung cancer". Clinical Lung Cancer. 5 (6): 353–359. doi:10.3816/CLC.2004.n.013. PMID 15217534.
  3. Navon, Aviv; Shamsian, Aviv; Achituve, Idan; Fetaya, Ethan; Chechik, Gal; Maron, Haggai (2023). "Equivariant Architectures for Learning in Deep Weight Spaces". Proceedings of the 40th International Conference on Machine Learning. PMLR: 25790–25816.
  4. Hecht-Nielsen, Robert (1990). "On the Algebraic Structure of Feedforward Network Weight Spaces". In Eckmiller, Rolf (ed.). Advanced Neural Computers. Amsterdam: North-Holland. pp. 129–135. ISBN 978-0-444-88400-8.
  5. Struik, Dirk (1967). A Concise History of Mathematics (3rd ed.). Dover Books.