Moving least squares


Moving least squares is a method of reconstructing continuous functions from a set of unorganized point samples via the calculation of a weighted least squares measure biased towards the region around the point at which the reconstructed value is requested.


In computer graphics, the moving least squares method is useful for reconstructing a surface from a set of points. Often it is used to create a 3D surface from a point cloud through either downsampling or upsampling.
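
As an illustrative sketch (not drawn from the cited references), one simple way to smooth a point cloud in a moving least squares manner is to project each point onto a plane fitted to its neighbourhood by weighted least squares; the function name `mls_smooth` and the length scale `h` below are hypothetical choices.

```python
import numpy as np

def mls_smooth(points, h=0.5):
    """Project each point of a cloud onto a locally fitted plane.

    points : (N, 3) array of sample positions.
    h      : length scale of the Gaussian weight; neighbours farther than a
             few h contribute almost nothing.
    Returns an (N, 3) array of smoothed positions.
    """
    smoothed = np.empty_like(points)
    for i, p in enumerate(points):
        d2 = np.sum((points - p) ** 2, axis=1)
        w = np.exp(-d2 / h**2)                 # weight tends to zero with distance
        centroid = (w[:, None] * points).sum(axis=0) / w.sum()
        # Weighted covariance; its smallest eigenvector approximates the local normal.
        diff = points - centroid
        cov = (w[:, None] * diff).T @ diff
        eigvals, eigvecs = np.linalg.eigh(cov)
        normal = eigvecs[:, 0]                 # eigenvector of the smallest eigenvalue
        # Project the point onto the fitted plane through the weighted centroid.
        smoothed[i] = p - np.dot(p - centroid, normal) * normal
    return smoothed

# Noisy samples of the unit sphere, smoothed toward the surface.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
pts += 0.05 * rng.normal(size=pts.shape)
print(mls_smooth(pts).shape)
```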

In numerical analysis, moving least squares methods have also been used and generalized to solve PDEs on curved surfaces and other geometries where standard discretizations are difficult to obtain. [1] [2] [3] This includes numerical methods developed for solving scalar parabolic PDEs [1] [3] and vector-valued hydrodynamic PDEs [2] on curved surfaces.

In machine learning, moving least squares methods have also been used to develop model classes and learning methods. This includes function regression methods [4] and neural network function and operator regression approaches, such as GMLS-Nets. [5]

Definition

Figure: a 2D example. The circles are the samples and the polygon is a linear interpolation; the blue curve is a smooth approximation of order 3.

Consider a function $f: \mathbb{R}^n \to \mathbb{R}$ and a set of sample points $S = \{ (x_i, f_i) \mid f(x_i) = f_i \}$. Then, the moving least squares approximation of degree $m$ at the point $x$ is $\tilde{p}(x)$, where $\tilde{p}$ minimizes the weighted least-squares error

$$\sum_{i} \left( p(x_i) - f_i \right)^2 \, \theta(\lVert x - x_i \rVert)$$

over all polynomials $p$ of degree $m$ in $\mathbb{R}^n$. Here $\theta(s)$ is the weight function, and it tends to zero as $s \to \infty$.

In the example, $\theta(s) = e^{-s^2}$. The smooth interpolator of "order 3" is a quadratic interpolator.
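
The definition can be turned into a short numerical sketch. The following Python/NumPy code (the function name `mls_approx` and all parameter values are illustrative) fits, for each evaluation point, a polynomial of degree $m$ by least squares with the Gaussian weight $\theta(s) = e^{-s^2}$ from the example, and evaluates the fitted polynomial at that point.

```python
import numpy as np

def mls_approx(x_eval, x_samples, f_samples, degree=2, theta=lambda s: np.exp(-s**2)):
    """Moving least squares approximation at the points x_eval.

    For every evaluation point, a polynomial of the given degree is fitted to
    the samples by least squares with weights theta(|x - x_i|), and the fitted
    polynomial is evaluated at that point.
    """
    x_eval = np.atleast_1d(x_eval)
    result = np.empty(len(x_eval), dtype=float)
    for k, x in enumerate(x_eval):
        w = theta(np.abs(x - x_samples))           # weights, decaying with distance
        # Polynomial basis 1, t, t^2, ... centred at the evaluation point.
        A = np.vander(x_samples - x, degree + 1, increasing=True)
        sw = np.sqrt(w)
        coeffs, *_ = np.linalg.lstsq(sw[:, None] * A, sw * f_samples, rcond=None)
        result[k] = coeffs[0]                      # value of the local polynomial at x
    return result

# Scattered samples of sin(x), reconstructed on a finer grid.
xi = np.sort(np.random.default_rng(1).uniform(0, 2 * np.pi, 15))
fi = np.sin(xi)
xq = np.linspace(0, 2 * np.pi, 50)
print(np.max(np.abs(mls_approx(xq, xi, fi) - np.sin(xq))))
```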


Related Research Articles

In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed statistical model, the observed data is most probable. The point in the parameter space that maximizes the likelihood function is called the maximum likelihood estimate. The logic of maximum likelihood is both intuitive and flexible, and as such the method has become a dominant means of statistical inference.
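
As a brief illustration (not part of the article), the sketch below maximizes the log-likelihood of an exponential model over a parameter grid and compares the result with the closed-form maximum likelihood estimate, the reciprocal of the sample mean; all names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=1 / 2.5, size=1000)   # true rate lambda = 2.5

def log_likelihood(lam, x):
    """Log-likelihood of i.i.d. exponential data with rate lam."""
    return len(x) * np.log(lam) - lam * np.sum(x)

# Crude numerical maximization over a grid of candidate rates.
grid = np.linspace(0.1, 10.0, 10000)
lam_hat = grid[np.argmax([log_likelihood(l, data) for l in grid])]

print("grid-search MLE:", lam_hat)
print("closed-form MLE (1 / sample mean):", 1 / data.mean())
```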

In statistics, the mean squared error (MSE) or mean squared deviation (MSD) of an estimator measures the average of the squares of the errors—that is, the average squared difference between the estimated values and the actual value. MSE is a risk function, corresponding to the expected value of the squared error loss. The fact that MSE is almost always strictly positive is because of randomness or because the estimator does not account for information that could produce a more accurate estimate. In machine learning, specifically empirical risk minimization, MSE may refer to the empirical risk, as an estimate of the true MSE.
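
A short illustrative sketch of the definition: the MSE of a set of predictions is the mean of the squared differences from the corresponding true values.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: average of the squared prediction errors."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.3]))   # (0.01 + 0.01 + 0.09) / 3 = 0.0366...
```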

In probability theory and statistics, a Gaussian process is a stochastic process (a collection of random variables indexed by time or space) such that every finite collection of those random variables has a multivariate normal distribution. The distribution of a Gaussian process is the joint distribution of all those random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space.
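
The defining property can be illustrated by drawing a sample path: at any finite set of inputs, the function values are jointly Gaussian with covariance given by a kernel. The squared-exponential kernel below is an illustrative choice, not implied by the text.

```python
import numpy as np

def rbf_kernel(x1, x2, length_scale=0.5):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    return np.exp(-(x1[:, None] - x2[None, :]) ** 2 / (2 * length_scale ** 2))

x = np.linspace(0, 5, 100)
K = rbf_kernel(x, x) + 1e-10 * np.eye(len(x))    # small jitter for numerical stability
# A finite collection of GP values is multivariate normal with covariance K.
sample = np.random.default_rng(0).multivariate_normal(np.zeros(len(x)), K)
print(sample[:5])
```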

<span class="mw-page-title-main">Expectation–maximization algorithm</span> Iterative method for finding maximum likelihood estimates in statistical models

In statistics, an expectation–maximization (EM) algorithm is an iterative method to find (local) maximum likelihood or maximum a posteriori (MAP) estimates of parameters in statistical models, where the model depends on unobserved latent variables. The EM iteration alternates between performing an expectation (E) step, which creates a function for the expectation of the log-likelihood evaluated using the current estimate for the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found on the E step. These parameter estimates are then used to determine the distribution of the latent variables in the next E step. It can be used, for example, to estimate a mixture of Gaussians, or to solve the multiple linear regression problem.
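
A minimal sketch (illustrative, not taken from any reference here) of EM for a two-component one-dimensional Gaussian mixture: the E step computes responsibilities under the current parameters, and the M step re-estimates the mixture weights, means, and variances from them.

```python
import numpy as np

def gaussian_pdf(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def em_gmm(x, n_iter=100):
    """EM for a 2-component 1-D Gaussian mixture; returns (weights, means, variances)."""
    pi = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])            # crude initialization
    var = np.array([x.var(), x.var()])
    for _ in range(n_iter):
        # E step: responsibility of each component for each data point.
        dens = pi * np.stack([gaussian_pdf(x, mu[k], var[k]) for k in range(2)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M step: update parameters from the weighted sufficient statistics.
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(3, 1.0, 700)])
print(em_gmm(data))
```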

In numerical analysis and computational statistics, rejection sampling is a basic technique used to generate observations from a distribution. It is also commonly called the acceptance-rejection method or "accept-reject algorithm" and is a type of exact simulation method. The method works for any distribution in $\mathbb{R}^m$ with a density.
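
A short sketch of the method, with an illustrative target: to draw from a density f using a proposal g satisfying f ≤ M·g, a proposal y is accepted with probability f(y)/(M·g(y)). Here the target is the Beta(2, 5) density and the proposal is uniform on [0, 1].

```python
import numpy as np

def target_density(x):
    """Beta(2, 5) density, used here as the distribution to sample from."""
    return 30.0 * x * (1.0 - x) ** 4

def rejection_sample(n, rng=np.random.default_rng(0)):
    """Sample from target_density using a Uniform(0, 1) proposal."""
    M = 2.5                                    # upper bound: the density's maximum is about 2.46
    samples = []
    while len(samples) < n:
        y = rng.uniform()                      # proposal draw
        u = rng.uniform()                      # acceptance test
        if u <= target_density(y) / M:         # accept with probability f(y) / (M * g(y))
            samples.append(y)
    return np.array(samples)

s = rejection_sample(10000)
print(s.mean())                                # Beta(2, 5) mean is 2 / 7, about 0.286
```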

In statistics, a generalized linear model (GLM) is a flexible generalization of ordinary linear regression. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a link function and by allowing the magnitude of the variance of each measurement to be a function of its predicted value.

Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees. For regression tasks, the mean or average prediction of the individual trees is returned. Random decision forests correct for decision trees' habit of overfitting to their training set.
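
A brief usage sketch, assuming scikit-learn is available (the dataset is synthetic and illustrative): a forest of trees is fitted to training data, and regression predictions are averaged over the trees.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, size=(500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)

# 100 trees, each grown on a bootstrap sample; predictions are averaged.
forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[np.pi / 2]]))           # should be close to sin(pi/2) = 1
```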

<span class="mw-page-title-main">Ordinary least squares</span> Method for estimating the unknown parameters in a linear regression model

In statistics, ordinary least squares (OLS) is a type of linear least squares method for choosing the unknown parameters in a linear regression model by the principle of least squares: minimizing the sum of the squares of the differences between the observed dependent variable in the input dataset and the output of the (linear) function of the independent variable. Some sources consider OLS to be linear regression.
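
A minimal sketch of the least-squares principle with NumPy (synthetic data, illustrative names): the coefficients that minimize the sum of squared residuals solve a linear least-squares problem.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=100)   # true intercept 2, slope 3

# Design matrix with a column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print(beta)                                            # approximately [2, 3]
```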

In statistics, the method of moments is a method of estimation of population parameters. The same principle is used to derive higher moments like skewness and kurtosis.
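
As an illustrative sketch: for a gamma distribution with shape k and scale θ, the mean is kθ and the variance is kθ², so matching the first two sample moments gives k = mean²/variance and θ = variance/mean.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.gamma(shape=3.0, scale=2.0, size=100000)

mean, var = data.mean(), data.var()
k_hat = mean ** 2 / var        # matches E[X]   = k * theta
theta_hat = var / mean         # matches Var[X] = k * theta**2
print(k_hat, theta_hat)        # approximately 3 and 2
```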

In statistics, Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable Y has a Poisson distribution, and assumes the logarithm of its expected value can be modeled by a linear combination of unknown parameters. A Poisson regression model is sometimes known as a log-linear model, especially when used to model contingency tables.

In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average. Both non-linear least squares and maximum likelihood estimation are special cases of M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new types of M-estimators. However, M-estimators are not inherently robust, as is clear from the fact that they include maximum likelihood estimators, which are in general not robust. The statistical procedure of evaluating an M-estimator on a data set is called M-estimation.

<span class="mw-page-title-main">Meshfree methods</span> Methods in numerical analysis not requiring knowledge of neighboring points

In the field of numerical analysis, meshfree methods are those that do not require connection between nodes of the simulation domain, i.e. a mesh, but are rather based on interaction of each node with all its neighbors. As a consequence, original extensive properties such as mass or kinetic energy are no longer assigned to mesh elements but rather to the single nodes. Meshfree methods enable the simulation of some otherwise difficult types of problems, at the cost of extra computing time and programming effort. The absence of a mesh allows Lagrangian simulations, in which the nodes can move according to the velocity field.

Stochastic approximation methods are a family of iterative methods typically used for root-finding problems or for optimization problems. The recursive update rules of stochastic approximation methods can be used, among other things, for solving linear systems when the collected data is corrupted by noise, or for approximating extreme values of functions which cannot be computed directly, but only estimated via noisy observations.
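
An illustrative sketch of the classical Robbins–Monro scheme, one of the stochastic approximation methods described above: to find the root of a function observed only with noise, the iterate is updated against each noisy observation with a decreasing step size.

```python
import numpy as np

def robbins_monro(noisy_f, x0, n_iter=10000, rng=np.random.default_rng(0)):
    """Find x with f(x) = 0 when only noisy evaluations of f are available."""
    x = x0
    for n in range(1, n_iter + 1):
        a_n = 1.0 / n                        # step sizes sum to infinity, squares sum finitely
        x = x - a_n * noisy_f(x, rng)
    return x

# f(x) = x - 3 observed with additive Gaussian noise; the root is x = 3.
noisy = lambda x, rng: (x - 3.0) + rng.normal(scale=1.0)
print(robbins_monro(noisy, x0=0.0))
```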

Miniaturizing components has always been a primary goal in the semiconductor industry because it cuts production cost and lets companies build smaller computers and other devices. Miniaturization, however, has increased dissipated power per unit area and made it a key limiting factor in integrated circuit performance. Temperature increase becomes relevant for relatively small cross-section wires, where it may affect normal semiconductor behavior. Besides, since the generation of heat is proportional to the frequency of operation for switching circuits, fast computers have larger heat generation than slow ones, an undesired effect for chip manufacturers. This article summarizes physical concepts that describe the generation and conduction of heat in an integrated circuit, and presents numerical methods that model heat transfer from a macroscopic point of view.

The closest point method (CPM) is an embedding method for solving partial differential equations on surfaces. The closest point method uses standard numerical approaches such as finite differences, finite element or spectral methods in order to solve the embedding partial differential equation (PDE) which is equal to the original PDE on the surface. The solution is computed in a band surrounding the surface in order to be computationally efficient. In order to extend the data off the surface, the closest point method uses a closest point representation. This representation extends function values to be constant along directions normal to the surface.

In statistics, the class of vector generalized linear models (VGLMs) was proposed to enlarge the scope of models catered for by generalized linear models (GLMs). In particular, VGLMs allow for response variables outside the classical exponential family and for more than one parameter. Each parameter can be transformed by a link function. The VGLM framework is also large enough to naturally accommodate multiple responses; these are several independent responses each coming from a particular statistical distribution with possibly different parameter values.

In the study of artificial neural networks (ANNs), the neural tangent kernel (NTK) is a kernel that describes the evolution of deep artificial neural networks during their training by gradient descent. It allows ANNs to be studied using theoretical tools from kernel methods.

In mathematics, Anderson acceleration, also called Anderson mixing, is a method for the acceleration of the convergence rate of fixed-point iterations. Introduced by Donald G. Anderson, this technique can be used to find the solution to fixed point equations often arising in the field of computational science.
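
A compact illustrative sketch of Anderson mixing for a fixed-point iteration x = g(x), following the common residual-difference least-squares formulation; the function name and window size m are arbitrary choices.

```python
import numpy as np

def anderson_accelerate(g, x0, m=3, tol=1e-10, max_iter=50):
    """Anderson acceleration for the fixed-point iteration x = g(x)."""
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    gx = g(x)
    f = gx - x                                  # residual of the fixed-point map
    G_hist, F_hist = [gx], [f]
    x = gx                                      # plain Picard step to start
    for _ in range(max_iter):
        gx = g(x)
        f = gx - x
        G_hist.append(gx); F_hist.append(f)
        if np.linalg.norm(f) < tol:
            return gx
        # Keep at most the m+1 most recent entries.
        G_hist, F_hist = G_hist[-(m + 1):], F_hist[-(m + 1):]
        # Differences of residuals and of map values, stored as columns.
        dF = np.column_stack([F_hist[i + 1] - F_hist[i] for i in range(len(F_hist) - 1)])
        dG = np.column_stack([G_hist[i + 1] - G_hist[i] for i in range(len(G_hist) - 1)])
        gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
        x = gx - dG @ gamma                     # accelerated iterate
    return x

# Example: x = cos(x) has fixed point approximately 0.739085.
print(anderson_accelerate(np.cos, np.array([0.0])))
```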

Probabilistic numerics is an active field of study at the intersection of applied mathematics, statistics, and machine learning centering on the concept of uncertainty in computation. In probabilistic numerics, tasks in numerical analysis such as finding numerical solutions for integration, linear algebra, optimization, simulation, and differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.

<span class="mw-page-title-main">Deep backward stochastic differential equation method</span>

The deep backward stochastic differential equation method is a numerical method that combines deep learning with backward stochastic differential equations (BSDEs). This method is particularly useful for solving high-dimensional problems in financial derivatives pricing and risk management. By leveraging the powerful function approximation capabilities of deep neural networks, deep BSDE addresses the computational challenges faced by traditional numerical methods in high-dimensional settings.

References

  1. Liang, Jian; Zhao, Hongkai (January 2013). "Solving Partial Differential Equations on Point Clouds". SIAM Journal on Scientific Computing. 35 (3): A1461–A1486. Bibcode:2013SJSC...35A1461L. doi:10.1137/120869730. S2CID 9984491.
  2. Gross, B. J.; Trask, N.; Kuberry, P.; Atzberger, P. J. (15 May 2020). "Meshfree methods on manifolds for hydrodynamic flows on curved surfaces: A Generalized Moving Least-Squares (GMLS) approach". Journal of Computational Physics. 409: 109340. arXiv:1905.10469. Bibcode:2020JCoPh.40909340G. doi:10.1016/j.jcp.2020.109340. S2CID 166228451.
  3. Gross, B. J.; Kuberry, P.; Atzberger, P. J. (15 March 2022). "First-passage time statistics on surfaces of general shape: Surface PDE solvers using Generalized Moving Least Squares (GMLS)". Journal of Computational Physics. 453: 110932. arXiv:2102.02421. Bibcode:2022JCoPh.45310932G. doi:10.1016/j.jcp.2021.110932. ISSN 0021-9991. S2CID 231802303.
  4. Wang, Hong-Yan; Xiang, Dao-Hong; Zhou, Ding-Xuan (1 March 2010). "Moving least-square method in learning theory". Journal of Approximation Theory. 162 (3): 599–614. doi:10.1016/j.jat.2009.12.002. ISSN 0021-9045.
  5. Trask, Nathaniel; Patel, Ravi G.; Gross, Ben J.; Atzberger, Paul J. (13 September 2019). "GMLS-Nets: A framework for learning from unstructured data". arXiv:1909.05371 [cs.LG].