Moving least squares

Moving least squares is a method of reconstructing continuous functions from a set of unorganized point samples via the calculation of a weighted least squares measure biased towards the region around the point at which the reconstructed value is requested.

In computer graphics, the moving least squares method is useful for reconstructing a surface from a set of points. Often it is used to create a 3D surface from a point cloud through either downsampling or upsampling.
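
The core of such point-cloud pipelines is a local, weighted fit around each evaluation point. The following is an illustrative sketch of one common local step, not the article's algorithm: nearby points are weighted with a Gaussian, a plane is fit to them, and the query point is projected onto that plane. The weight scale h, the name mls_project, and the synthetic data are assumptions made for the example.

    # Illustrative sketch of a local MLS-style projection step used in
    # point-cloud surface reconstruction: weight nearby points with a
    # Gaussian, fit a plane to them, and project the query onto that plane.
    # The weight scale h and the synthetic data are assumptions.
    import numpy as np

    def mls_project(q, cloud, h=0.2):
        """Project query point q onto a plane fit to nearby cloud points."""
        d2 = np.sum((cloud - q) ** 2, axis=1)
        w = np.exp(-d2 / h ** 2)              # weights decay with distance
        centroid = (w[:, None] * cloud).sum(axis=0) / w.sum()
        # Weighted covariance; its smallest eigenvector approximates the
        # surface normal for a locally flat neighbourhood.
        diff = cloud - centroid
        eigvals, eigvecs = np.linalg.eigh((w[:, None] * diff).T @ diff)
        normal = eigvecs[:, 0]
        # Remove the component of (q - centroid) along the normal.
        return q - np.dot(q - centroid, normal) * normal

    # Noisy samples of the unit sphere; projection pulls a point back
    # towards the sampled surface (the printed norm is close to 1).
    rng = np.random.default_rng(1)
    pts = rng.normal(size=(2000, 3))
    pts /= np.linalg.norm(pts, axis=1, keepdims=True)
    pts += 0.01 * rng.normal(size=pts.shape)
    print(np.linalg.norm(mls_project(np.array([0.0, 0.0, 1.05]), pts)))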

In numerical analysis, moving least squares methods have also been used and generalized to solve PDEs on curved surfaces and other geometries for which it is difficult to obtain a discretization. [1] [2] [3] This includes numerical methods developed for solving scalar parabolic PDEs [1] [3] and vector-valued hydrodynamic PDEs [2] on curved surfaces.

In machine learning, moving least squares methods have also been used to develop model classes and learning methods. These include function regression methods [4] and neural-network approaches to function and operator regression, such as GMLS-Nets. [5]

Definition

Figure (Moving Least Squares2.png): a 2D example. The circles are the samples and the polygon is a linear interpolation; the blue curve is a smooth approximation of order 3.

Consider a function $f\colon \mathbb{R}^n \to \mathbb{R}$ and a set of sample points $S = \{ (x_i, f_i) \mid f(x_i) = f_i \}$. Then, the moving least squares approximation of degree $m$ at the point $x$ is $\tilde{p}(x)$, where $\tilde{p}$ minimizes the weighted least-squares error

$$\sum_{i \in I} \bigl( p(x_i) - f_i \bigr)^2 \, \theta\bigl( \lVert x - x_i \rVert \bigr)$$

over all polynomials $p$ of degree $m$ in $\mathbb{R}^n$. Here $\theta(s)$ is the weight function, and it tends to zero as $s \to \infty$.

In the example, $\theta(s) = e^{-s^2}$. The smooth interpolator of "order 3" is a quadratic interpolator.
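
As a concrete illustration, the following is a minimal sketch (not from the article) of evaluating a moving least squares approximation in one dimension with a quadratic basis, matching the "order 3" example above. The Gaussian weight with length scale h, the sample data, and the name mls_eval are assumptions made for the example.

    # Minimal MLS sketch: at each query point x, fit the polynomial p that
    # minimizes sum_i (p(x_i) - f_i)^2 * theta(|x - x_i|) with a Gaussian
    # weight theta(s) = exp(-(s/h)^2), then return p(x). The scale h and
    # the sample data below are illustrative assumptions.
    import numpy as np

    def mls_eval(x_query, x_samples, f_samples, h=0.5, degree=2):
        """Evaluate the moving least squares approximation at one point."""
        # Polynomial basis at the sample points: columns [1, x, x^2, ...].
        B = np.vander(x_samples, degree + 1, increasing=True)
        # Weights centred on the query point; they decay with distance.
        w = np.exp(-((x_samples - x_query) / h) ** 2)
        # Weighted normal equations (B^T W B) c = B^T W f for the local
        # polynomial coefficients c.
        BtW = B.T * w
        c = np.linalg.solve(BtW @ B, BtW @ f_samples)
        # The MLS value is this local polynomial evaluated at the query point.
        return np.polynomial.polynomial.polyval(x_query, c)

    # Noisy samples of a smooth function, reconstructed at a few points.
    rng = np.random.default_rng(0)
    xs = np.sort(rng.uniform(0.0, 2.0 * np.pi, 40))
    fs = np.sin(xs) + 0.05 * rng.normal(size=xs.size)
    print([round(float(mls_eval(x, xs, fs)), 3) for x in (1.0, 2.0, 3.0)])

Because the weights follow the query point, a new local polynomial is solved for at every evaluation location, which is what makes the reconstruction smooth rather than piecewise.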

See also

Meshfree methods
Closest point method
Least-squares support-vector machines

References

  1. Liang, Jian; Zhao, Hongkai (January 2013). "Solving Partial Differential Equations on Point Clouds". SIAM Journal on Scientific Computing. 35 (3): A1461–A1486. doi:10.1137/120869730.
  2. Gross, B. J.; Trask, N.; Kuberry, P.; Atzberger, P. J. (15 May 2020). "Meshfree methods on manifolds for hydrodynamic flows on curved surfaces: A Generalized Moving Least-Squares (GMLS) approach". Journal of Computational Physics. 409: 109340. doi:10.1016/j.jcp.2020.109340.
  3. Gross, B. J.; Kuberry, P.; Atzberger, P. J. (15 March 2022). "First-passage time statistics on surfaces of general shape: Surface PDE solvers using Generalized Moving Least Squares (GMLS)". Journal of Computational Physics. 453: 110932. doi:10.1016/j.jcp.2021.110932. ISSN 0021-9991.
  4. Wang, Hong-Yan; Xiang, Dao-Hong; Zhou, Ding-Xuan (1 March 2010). "Moving least-square method in learning theory". Journal of Approximation Theory. 162 (3): 599–614. doi:10.1016/j.jat.2009.12.002. ISSN 0021-9045.
  5. Trask, Nathaniel; Patel, Ravi G.; Gross, Ben J.; Atzberger, Paul J. (13 September 2019). "GMLS-Nets: A framework for learning from unstructured data". arXiv:1909.05371.