Shrinkage Fields (image restoration)

Last updated December 14, 2019

Shrinkage fields is a random field-based machine learning technique that aims to perform high-quality image restoration (denoising and deblurring) with low computational overhead.

Method

The restored image $x$ is predicted from a corrupted observation $y$ after training on a set of sample images $S$.

A shrinkage (mapping) function $f_{\pi_i}(v) = \sum_{j=1}^{M} \pi_{i,j} \exp\left(-\frac{\gamma}{2}\left(v - \mu_j\right)^2\right)$ is modeled directly as a linear combination of radial basis function (Gaussian) kernels, where $\pi_{i,j}$ are the kernel weights, $\gamma$ is the shared precision parameter, $\mu_j$ are the (equidistant) kernel positions, and $M$ is the number of Gaussian kernels.
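The kernel sum above can be evaluated element-wise in a few lines. The following is a minimal sketch, not the reference implementation; the function name `shrinkage`, the choice of $M = 5$ centers on $[-2, 2]$, and the weight values are illustrative assumptions.

```python
import numpy as np

def shrinkage(v, weights, centers, gamma):
    """Evaluate f_pi(v) = sum_j pi_j * exp(-gamma/2 * (v - mu_j)^2) element-wise."""
    v = np.asarray(v, dtype=float)
    # Broadcast the inputs against the M kernel centers: shape v.shape + (M,)
    diff = v[..., np.newaxis] - centers
    basis = np.exp(-0.5 * gamma * diff ** 2)
    # Weighted sum over the kernel axis
    return basis @ weights

# Hypothetical setup: M = 5 equidistant kernel positions on [-2, 2].
centers = np.linspace(-2.0, 2.0, 5)
# Only the kernel centered at mu = 0 is active in this toy example.
weights = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
values = shrinkage([0.0, 1.0], weights, centers, gamma=2.0)
print(values)
```

With a single active kernel at $\mu = 0$ and $\gamma = 2$, the function reduces to $\exp(-v^2)$, which makes the output easy to check by hand. In a trained model the weights $\pi_{i,j}$ are learned, so the resulting curve need not be monotonic.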

Because the shrinkage function is modeled directly, the optimization procedure reduces to a single quadratic minimization per iteration, denoted as the prediction of a shrinkage field $g_{\Theta}(x) = \mathcal{F}^{-1}\left[\frac{\mathcal{F}\left(\lambda K^{T} y + \sum_{i=1}^{N} F_i^{T} f_{\pi_i}\left(F_i x\right)\right)}{\lambda \check{K}^{*} \circ \check{K} + \sum_{i=1}^{N} \check{F}_i^{*} \circ \check{F}_i}\right] = \Omega^{-1}\eta$, where $\mathcal{F}$ denotes the discrete Fourier transform, $F_i x$ is the 2D convolution $f_i \otimes x$ of the image with the point spread function filter $f_i$, $\check{F}_i$ is the optical transfer function defined as the discrete Fourier transform of $f_i$, and $\check{F}_i^{*}$ is the complex conjugate of $\check{F}_i$. $K$ and $\check{K}$ are defined in the same way for the blur kernel $k$, and $\circ$ and the fraction denote element-wise multiplication and division.
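One prediction step can be sketched with NumPy's FFT routines. This is a minimal illustration under stated assumptions, not the reference implementation: it assumes circular (periodic) boundary conditions so that every convolution becomes an element-wise product in the Fourier domain, and the names `psf2otf` and `predict`, the argument layout, and the toy filter are all hypothetical.

```python
import numpy as np

def psf2otf(psf, shape):
    """Zero-pad a small filter to `shape`, center it at the origin, take its 2D DFT."""
    padded = np.zeros(shape)
    padded[:psf.shape[0], :psf.shape[1]] = psf
    # Shift so the filter's center pixel sits at index (0, 0) (circular convolution).
    shift = (-(psf.shape[0] // 2), -(psf.shape[1] // 2))
    padded = np.roll(padded, shift, axis=(0, 1))
    return np.fft.fft2(padded)

def predict(x, y, lam, filters, shrinkages, blur=None):
    """One prediction g_Theta(x) = Omega^{-1} eta, assuming circular boundaries."""
    # Without a blur kernel (pure denoising), K is the identity, so its OTF is all ones.
    K = np.ones(x.shape, dtype=complex) if blur is None else psf2otf(blur, x.shape)
    X = np.fft.fft2(x)
    numer = lam * np.conj(K) * np.fft.fft2(y)   # F(lambda K^T y)
    denom = lam * np.abs(K) ** 2                # lambda K^* o K
    for f, shrink in zip(filters, shrinkages):
        Fi = psf2otf(f, x.shape)
        response = np.real(np.fft.ifft2(Fi * X))               # F_i x
        numer += np.conj(Fi) * np.fft.fft2(shrink(response))   # F(F_i^T f_pi_i(F_i x))
        denom += np.abs(Fi) ** 2                                # F_i^* o F_i
    # Element-wise division in the Fourier domain solves the quadratic problem.
    return np.real(np.fft.ifft2(numer / denom))

# Toy check: one horizontal-difference filter with an identity "shrinkage".
rng = np.random.default_rng(0)
y = rng.random((16, 16))
dx = np.array([[1.0, -1.0]])
out = predict(y, y, lam=1.0, filters=[dx], shrinkages=[lambda v: v])
print(np.allclose(out, y))
```

With an identity shrinkage and $x = y$, the numerator equals the denominator times $\mathcal{F}(y)$, so the step should reproduce its input up to floating-point error. A learned shrinkage function is what makes the step change the image. The cost is dominated by the FFTs, which is where the $O(D\log D)$ runtime comes from.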

The estimate $\hat{x}_t$ is computed as $\hat{x}_t = g_{\Theta_t}\left(\hat{x}_{t-1}\right)$ for each iteration $t$, with the initial case $\hat{x}_0 = y$. This forms a cascade of Gaussian conditional random fields, also called a cascade of shrinkage fields (CSF). Loss minimization is used to learn the model parameters $\Theta_t = \left\lbrace \lambda_t, \pi_{ti}, f_{ti} \right\rbrace_{i=1}^{N}$.
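The recursion above is a plain loop over stages. The sketch below assumes each stage is available as a callable that takes the previous estimate and the observation; the name `run_cascade` and the toy stage are hypothetical and stand in for trained predictions $g_{\Theta_t}$.

```python
def run_cascade(y, stages):
    """Apply x_t = g_t(x_{t-1}, y) for t = 1..T, starting from x_0 = y."""
    x = y  # initial case: the corrupted observation itself
    for g in stages:
        x = g(x, y)
    return x

# Toy stage (an assumption for illustration only): halve the current estimate.
def halve(x, y):
    return 0.5 * x

result = run_cascade(8.0, [halve, halve, halve])
print(result)
```

Each stage has its own parameters $\Theta_t$, so the list of stages would in practice hold a different trained predictor per entry rather than the same function repeated.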

The learning objective function is defined as $J\left(\Theta_t\right) = \sum_{s=1}^{S} l\left(\hat{x}_t^{(s)}; x_{gt}^{(s)}\right)$, where $l$ is a differentiable loss function and $S$ here denotes the number of training samples. The objective is greedily minimized, one stage at a time, using the training data $\left\lbrace x_{gt}^{(s)}, y^{(s)}, k^{(s)} \right\rbrace_{s=1}^{S}$ (ground-truth image, corrupted observation, and blur kernel) and the current predictions $\hat{x}_t^{(s)}$.
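The objective is a sum of per-sample losses. The sketch below assumes the negative peak signal-to-noise ratio (PSNR) as one possible differentiable choice for $l$; the text above only requires differentiability, and the names `neg_psnr` and `objective` and the peak value of 255 are assumptions.

```python
import numpy as np

def neg_psnr(x_hat, x_gt, peak=255.0):
    """Negative PSNR in dB; lower is better. Undefined when x_hat equals x_gt exactly."""
    diff = np.asarray(x_hat, dtype=float) - np.asarray(x_gt, dtype=float)
    mse = np.mean(diff ** 2)
    return -10.0 * np.log10(peak ** 2 / mse)

def objective(predictions, ground_truths, loss=neg_psnr):
    """J = sum over training samples s of loss(x_hat^(s), x_gt^(s))."""
    return sum(loss(xh, xg) for xh, xg in zip(predictions, ground_truths))

# Toy data: a flat prediction that is off by 25.5 gray levels everywhere.
x_gt = np.zeros((4, 4))
x_hat = np.full((4, 4), 25.5)
J = objective([x_hat, x_hat], [x_gt, x_gt])
print(J)
```

In the toy data the mean squared error is $25.5^2$, so the ratio inside the logarithm is exactly 100 and each sample contributes $-20$ dB. Training would minimize $J$ with respect to $\Theta_t$ using a gradient-based optimizer, which this sketch does not include.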

Performance

Preliminary tests by the author suggest that $\text{RTF}_5$ [1] obtains slightly better denoising performance than $\text{CSF}_{7\times 7}^{\lbrace 3,4,5 \rbrace}$, followed by $\text{CSF}_{5\times 5}^{5}$, $\text{CSF}_{7\times 7}^{2}$, $\text{CSF}_{5\times 5}^{\lbrace 3,4 \rbrace}$, and BM3D. Here the subscript gives the filter size and the superscript the number of cascade stages.

BM3D denoising speed falls between that of $\text{CSF}_{5\times 5}^{4}$ and $\text{CSF}_{7\times 7}^{4}$, with RTF being an order of magnitude slower.

Advantages

Results are comparable to those obtained by BM3D (a reference method in state-of-the-art denoising since its introduction in 2007)
Minimal runtime compared to other high-performance methods (potentially applicable on embedded devices)
Parallelizable (e.g., a GPU implementation is possible)
Predictability: $O(D\log D)$ runtime, where $D$ is the number of pixels
Fast training, even on a CPU

Implementations

A reference implementation, shrinkage-fields, has been written in MATLAB and released under the BSD 2-Clause license.

References

[1] Jancsary, Jeremy; Nowozin, Sebastian; Sharp, Toby; Rother, Carsten (10 April 2012). Regression Tree Fields – An Efficient, Non-parametric Approach to Image Labeling Problems. IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). Providence, RI, USA: IEEE Computer Society. doi:10.1109/CVPR.2012.6247950.

Schmidt, Uwe; Roth, Stefan (2014). Shrinkage Fields for Effective Image Restoration (PDF). Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on. Columbus, OH, USA: IEEE. doi:10.1109/CVPR.2014.349. ISBN 978-1-4799-5118-5.
