S-estimator

The goal of S-estimators is to provide a simple high-breakdown regression estimator that shares the flexibility and nice asymptotic properties of M-estimators. The name "S-estimators" was chosen because they are based on estimators of scale.

We will consider estimators of scale defined by a function $\rho$, which satisfies the following conditions:

R1 – $\rho$ is symmetric, continuously differentiable and $\rho(0) = 0$;

R2 – there exists $c > 0$ such that $\rho$ is strictly increasing on $[0, c]$ and constant on $[c, \infty)$.

For any sample $\{r_1, \ldots, r_n\}$ of real numbers, we define the scale estimate $s(r_1, \ldots, r_n)$ as the solution of

$$\frac{1}{n}\sum_{i=1}^{n} \rho\left(\frac{r_i}{s}\right) = K,$$

where $K$ is the expectation value of $\rho$ for a standard normal distribution. (If there are several solutions to the above equation, we take the smallest one for $s$; if there is no solution, we put $s(r_1, \ldots, r_n) = 0$.)
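As a concrete illustration, the sketch below solves the equation of scale numerically. It assumes Tukey's biweight as the function $\rho$ (a common choice satisfying R1 and R2); the tuning constant c = 1.547, the root-bracketing approach, and the helper names `rho_biweight` and `m_scale` are illustrative assumptions, not prescribed by the source.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq
from scipy.stats import norm

C = 1.547  # biweight tuning constant; gives roughly a 50% breakdown point

def rho_biweight(u, c=C):
    """Tukey biweight rho: symmetric, smooth, rho(0) = 0 (R1), strictly
    increasing on [0, c] and constant on [c, infinity) (R2)."""
    z = np.minimum((np.asarray(u, float) / c) ** 2, 1.0)
    return (c ** 2 / 6.0) * (1.0 - (1.0 - z) ** 3)

# K = E[rho(Z)] for Z ~ N(0, 1), obtained by numerical integration.
K, _ = quad(lambda z: rho_biweight(z) * norm.pdf(z), -np.inf, np.inf)

def m_scale(r, rho=rho_biweight, K=K):
    """Solve (1/n) * sum_i rho(r_i / s) = K for s > 0 by root bracketing;
    returns 0 when no positive solution exists, per the convention above.
    (A root requires more than about half the residuals to be nonzero.)"""
    r = np.asarray(r, float)
    if not np.any(r):
        return 0.0
    g = lambda s: np.mean(rho(r / s)) - K   # decreasing in s
    s0 = np.median(np.abs(r[r != 0]))       # crude bracketing scale
    return brentq(g, s0 / 1e6, s0 * 1e6)
```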

Definition:

Let $(x_1, y_1), \ldots, (x_n, y_n)$ be a sample of regression data with $p$-dimensional $x_i$. For each candidate vector $\theta$, we obtain residuals $r_i(\theta) = y_i - x_i^{T}\theta$ and evaluate their scale $s(r_1(\theta), \ldots, r_n(\theta))$ by solving the equation of scale above, where $\rho$ satisfies R1 and R2. The S-estimator $\hat{\theta}$ is defined by

$$\hat{\theta} = \operatorname*{arg\,min}_{\theta} \; s(r_1(\theta), \ldots, r_n(\theta)),$$

and the final scale estimator is then

$$\hat{\sigma} = s(r_1(\hat{\theta}), \ldots, r_n(\hat{\theta})).$$ [1]
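Reusing `m_scale` from the sketch above, an S-estimate can then be approximated by minimizing the residual scale over $\theta$. This is only a minimal sketch: the OLS starting point and the Nelder–Mead search are illustrative choices, the objective is non-convex so a local search can stop at a local minimum, and practical implementations instead use randomized resampling algorithms such as FAST-S.

```python
from scipy.optimize import minimize

def s_estimator(X, y):
    """Sketch: minimize s(r_1(theta), ..., r_n(theta)) over theta.
    X is the n-by-p design matrix (include a column of ones for an
    intercept); m_scale is the scale solver defined above."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    theta0, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS starting point
    res = minimize(lambda th: m_scale(y - X @ th), theta0,
                   method="Nelder-Mead")
    theta_hat = res.x
    sigma_hat = m_scale(y - X @ theta_hat)           # final scale estimator
    return theta_hat, sigma_hat
```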

References

  1. Rousseeuw, P. J. and Yohai, V. J. (1984). "Robust Regression by Means of S-estimators". In J. Franke, W. Härdle and R. D. Martin (eds.), Robust and Nonlinear Time Series Analysis, Lecture Notes in Statistics 26, Springer, pp. 256–272.