Total variation denoising

Example of application of the Rudin et al. total variation denoising technique to an image corrupted by Gaussian noise. This example was created using demo_tv.m by Guy Gilboa; see external links.

In signal processing, particularly image processing, total variation denoising, also known as total variation regularization or total variation filtering, is a noise removal process (filter). It is based on the principle that signals with excessive and possibly spurious detail have high total variation, that is, the integral of the absolute image gradient is high. According to this principle, reducing the total variation of the signal—subject to it being a close match to the original signal—removes unwanted detail whilst preserving important details such as edges. The concept was pioneered by L. I. Rudin, S. Osher, and E. Fatemi in 1992 and so is today known as the ROF model. [1]


This noise removal technique has advantages over simple techniques such as linear smoothing or median filtering which reduce noise but at the same time smooth away edges to a greater or lesser degree. By contrast, total variation denoising is a remarkably effective edge-preserving filter, i.e., simultaneously preserving edges whilst smoothing away noise in flat regions, even at low signal-to-noise ratios. [2]

1D signal series

Application of 1D total-variation denoising to a signal obtained from a single-molecule experiment. Gray is the original signal, black is the denoised signal.

For a digital signal $y_n$, we can, for example, define the total variation as

$$V(y) = \sum_n |y_{n+1} - y_n|.$$

Given an input signal $x_n$, the goal of total variation denoising is to find an approximation, call it $y_n$, that has smaller total variation than $x_n$ but is "close" to $x_n$. One measure of closeness is the sum of square errors:

$$E(x, y) = \frac{1}{2} \sum_n (x_n - y_n)^2.$$

So the total-variation denoising problem amounts to minimizing the following discrete functional over the signal $y_n$:

$$E(x, y) + \lambda V(y).$$

By differentiating this functional with respect to $y_n$, we can derive a corresponding Euler–Lagrange equation, which can be numerically integrated with the original signal $x_n$ as initial condition. This was the original approach. [1] Alternatively, since this is a convex functional, techniques from convex optimization can be used to minimize it and find the solution $y_n$. [3]
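
As a concrete illustration of the convex-optimization route, the following Python sketch minimizes the discrete functional above by projected gradient descent on its dual problem. The function name tv_denoise_1d, the step size tau, the iteration count, and the test signal are illustrative choices, not part of the original formulation.

    import numpy as np

    def tv_denoise_1d(x, lam, n_iter=500, tau=0.249):
        """Minimize 0.5 * ||y - x||^2 + lam * sum_n |y[n+1] - y[n]|
        by projected gradient descent on the dual problem."""
        x = np.asarray(x, dtype=float)
        u = np.zeros(x.size - 1)          # dual variable, one entry per difference
        for _ in range(n_iter):
            # current primal estimate y = x - D^T u (D is the forward-difference operator)
            y = x - np.concatenate(([-u[0]], u[:-1] - u[1:], [u[-1]]))
            # gradient step on the dual, then project onto the box [-lam, lam]
            u = np.clip(u + tau * np.diff(y), -lam, lam)
        return x - np.concatenate(([-u[0]], u[:-1] - u[1:], [u[-1]]))

    # usage: denoise a noisy piecewise-constant signal
    rng = np.random.default_rng(0)
    clean = np.repeat([0.0, 1.0, 0.3], 100)
    noisy = clean + 0.1 * rng.standard_normal(clean.size)
    denoised = tv_denoise_1d(noisy, lam=0.5)

The step size is kept below 1/4 because the squared operator norm of the forward-difference operator is at most 4, which guarantees convergence of the projected gradient iteration.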

Regularization properties

The regularization parameter $\lambda$ plays a critical role in the denoising process. When $\lambda = 0$, there is no smoothing and the result is the same as minimizing the sum of squares. As $\lambda \to \infty$, however, the total variation term plays an increasingly strong role, which forces the result to have smaller total variation, at the expense of being less like the input (noisy) signal. Thus, the choice of regularization parameter is critical to achieving just the right amount of noise removal.
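
As a rough demonstration of this trade-off, and assuming scikit-image is available, the snippet below applies its denoise_tv_chambolle routine with a small and a large regularization weight; that weight plays the role of $\lambda$ here (larger values enforce a stronger total-variation penalty), although the exact scaling convention differs from the functional above.

    import numpy as np
    from skimage.restoration import denoise_tv_chambolle

    rng = np.random.default_rng(1)
    clean = np.repeat([0.0, 1.0, 0.3], 100)            # piecewise-constant steps
    noisy = clean + 0.1 * rng.standard_normal(clean.size)

    light = denoise_tv_chambolle(noisy, weight=0.01)   # near zero: stays close to the noisy input
    heavy = denoise_tv_chambolle(noisy, weight=1.0)    # large weight: flatter result, lower total variation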

2D signal images

We now consider 2D signals y, such as images. The total-variation norm proposed by the 1992 article is

$$V(y) = \sum_{i,j} \sqrt{|y_{i+1,j} - y_{i,j}|^2 + |y_{i,j+1} - y_{i,j}|^2}$$

and is isotropic and not differentiable. A variation that is sometimes used, since it may sometimes be easier to minimize, is an anisotropic version

$$V_{\text{aniso}}(y) = \sum_{i,j} \left( |y_{i+1,j} - y_{i,j}| + |y_{i,j+1} - y_{i,j}| \right).$$
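
A minimal numpy sketch of the two discrete norms is given below; the function names are illustrative, and the boundary convention (summing only where both forward differences exist in the isotropic case) is one of several possibilities.

    import numpy as np

    def tv_isotropic(y):
        # square root of the sum of squared forward differences, per interior pixel
        d0 = np.diff(y, axis=0)[:, :-1]    # y[i+1, j] - y[i, j]
        d1 = np.diff(y, axis=1)[:-1, :]    # y[i, j+1] - y[i, j]
        return np.sum(np.sqrt(d0**2 + d1**2))

    def tv_anisotropic(y):
        # absolute forward differences in each direction, summed separately
        return np.sum(np.abs(np.diff(y, axis=0))) + np.sum(np.abs(np.diff(y, axis=1)))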

The standard total-variation denoising problem is still of the form

$$\min_y \left[ E(x, y) + \lambda V(y) \right],$$

where E is the 2D L2 norm. In contrast to the 1D case, solving this minimization problem is non-trivial. One algorithm that solves it is the primal-dual method. [4]

Due in part to much research in compressed sensing in the mid-2000s, there are many algorithms, such as the split-Bregman method, that solve variants of this problem.
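
For instance, scikit-image ships an implementation based on Chambolle's dual projection algorithm [4]; assuming that library is installed, a 2D denoising run looks like the following sketch (the test image and the weight value are arbitrary).

    import numpy as np
    from skimage import data, util
    from skimage.restoration import denoise_tv_chambolle

    image = util.img_as_float(data.camera())                        # standard grayscale test image
    noisy = image + 0.1 * np.random.default_rng(2).standard_normal(image.shape)
    denoised = denoise_tv_chambolle(noisy, weight=0.1)              # isotropic TV denoising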

Rudin–Osher–Fatemi PDE

Suppose that we are given a noisy image $f$ and wish to compute a denoised image $u$ over a 2D space. ROF showed that the minimization problem we are looking to solve is: [5]

$$\min_{u \in \operatorname{BV}(\Omega)} \; \|u\|_{\operatorname{TV}(\Omega)} + \frac{\lambda}{2} \int_\Omega (f - u)^2 \, dx,$$

where $\operatorname{BV}(\Omega)$ is the set of functions with bounded variation over the domain $\Omega$, $\operatorname{TV}(\Omega)$ is the total variation over the domain, and $\lambda$ is a penalty term. When $u$ is smooth, the total variation is equivalent to the integral of the gradient magnitude:

$$\|u\|_{\operatorname{TV}(\Omega)} = \int_\Omega \|\nabla u\| \, dx,$$

where $\|\cdot\|$ is the Euclidean norm. Then the objective function of the minimization problem becomes:

$$\min_{u \in \operatorname{BV}(\Omega)} \; \int_\Omega \left[ \|\nabla u\| + \frac{\lambda}{2} (f - u)^2 \right] dx.$$

From this functional, the Euler–Lagrange equation for minimization – assuming no time-dependence – gives us the nonlinear elliptic partial differential equation:

$$\begin{cases} \nabla \cdot \left( \dfrac{\nabla u}{\|\nabla u\|} \right) + \lambda (f - u) = 0, & u \in \Omega, \\ \dfrac{\partial u}{\partial n} = 0, & u \in \partial\Omega. \end{cases}$$

For some numerical algorithms, it is preferable to instead solve the time-dependent version of the ROF equation:

$$\frac{\partial u}{\partial t} = \nabla \cdot \left( \frac{\nabla u}{\|\nabla u\|} \right) + \lambda (f - u).$$
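
A minimal explicit-Euler sketch of this gradient flow is shown below, with the gradient magnitude regularized by a small constant to avoid division by zero. The function name, step size, and parameter values are illustrative, and the crude central-difference discretization is not the scheme used in the original paper.

    import numpy as np

    def rof_gradient_flow(f, lam=1.0, dt=0.01, n_steps=200, eps=0.1):
        """Integrate u_t = div(grad u / |grad u|) + lam * (f - u) by explicit Euler,
        starting from u = f. eps regularizes |grad u| and limits the stiffness."""
        u = f.astype(float).copy()
        for _ in range(n_steps):
            u0, u1 = np.gradient(u)                          # central differences along each axis
            mag = np.sqrt(u0**2 + u1**2 + eps**2)
            div = np.gradient(u0 / mag, axis=0) + np.gradient(u1 / mag, axis=1)
            u += dt * (div + lam * (f - u))
        return u

Explicit time-stepping of this stiff, nonlinear diffusion requires a small dt; in practice semi-implicit schemes or the convex-optimization methods above are preferred.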

Applications

The Rudin–Osher–Fatemi model was a pivotal component in producing the first image of a black hole. [6]


References

  1. Rudin, L. I.; Osher, S.; Fatemi, E. (1992). "Nonlinear total variation based noise removal algorithms". Physica D. 60 (1–4): 259–268. Bibcode:1992PhyD...60..259R. CiteSeerX 10.1.1.117.1675. doi:10.1016/0167-2789(92)90242-f.
  2. Strong, D.; Chan, T. (2003). "Edge-preserving and scale-dependent properties of total variation regularization". Inverse Problems. 19 (6): S165–S187. Bibcode:2003InvPr..19S.165S. doi:10.1088/0266-5611/19/6/059. S2CID 250761777.
  3. Little, M. A.; Jones, Nick S. (2010). "Sparse Bayesian Step-Filtering for High-Throughput Analysis of Molecular Machine Dynamics" (PDF). ICASSP 2010 Proceedings. 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
  4. Chambolle, A. (2004). "An algorithm for total variation minimization and applications". Journal of Mathematical Imaging and Vision. 20: 89–97. CiteSeerX 10.1.1.160.5226. doi:10.1023/B:JMIV.0000011325.36760.1e. S2CID 207622122.
  5. Getreuer, Pascal (2012). "Rudin–Osher–Fatemi Total Variation Denoising using Split Bregman" (PDF).
  6. "Rudin–Osher–Fatemi Model Captures Infinity and Beyond". IPAM. 2019-04-15. Retrieved 2019-08-04.