Block-matching and 3D filtering

Last updated
Left: original crop from raw image taken at ISO800, Middle: Denoised using bm3d-gpu (sigma=10, twostep), Right: Denoised using darktable 2.4.0 profiled denoise (non-local means and wavelets blend) Denoising comparison - noisy, bm3d, wavelets + non-local denoise.jpg
Left: original crop from raw image taken at ISO800, Middle: Denoised using bm3d-gpu (sigma=10, twostep), Right: Denoised using darktable 2.4.0 profiled denoise (non-local means and wavelets blend)

Block-matching and 3D filtering (BM3D) is a 3-D block-matching algorithm used primarily for noise reduction in images. [1]




Image fragments are grouped together based on similarity, but unlike standard k-means clustering and such cluster analysis methods, the image fragments are not necessarily disjoint. This block-matching algorithm is less computationally demanding and is useful later on in the aggregation step. Fragments do however have the same size. A fragment is grouped if its dissimilarity with a reference fragment falls below a specified threshold. This grouping technique is called block-matching, it is typically used to group similar groups across different frames of a digital video, BM3D on the other hand may group macroblocks within a single frame. All image fragments in a group are then stacked to form 3D cylinder-like shapes.

Collaborative filtering

Filtering is done on every fragments group. A [ clarification needed ] dimensional linear transform is applied, followed by a transform-domain shrinkage such as Wiener filtering, then the linear transform is inverted to reproduce all (filtered) fragments.


The image is transformed back into its two-dimensional form. All overlapping image fragments are weight-averaged to ensures that they are filtered for noise yet retain their distinct signal.


Color images

RGB images can be processed much like grayscale ones. A luminance-chrominance transformation should be applied to the RGB image. The grouping is then completed on the luminance channel which contains most of the useful information and a higher SNR. This approach works because the noise in the chrominance channels is strongly correlated to that of the luminance channel, and it saves approximately one-third of the computing time because grouping takes up approximately half of the required computing time.


The BM3D algorithm has been extended (IDD-BM3D) to perform decoupled deblurring and denoising using the Nash equilibrium balance of the two objective functions. [2]

Convolutional neural network

An approach that integrates a convolutional neural network has been proposed and shows better results (albeit with a slower runtime). [3] MATLAB code has been released for research purpose. [4]


Related Research Articles

Digital image processing is the use of a digital computer to process digital images through an algorithm. As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics ; third, the demand for a wide range of applications in environment, agriculture, military, industry and medical science has increased.


In mathematics, deconvolution is the operation inverse to convolution. Both operation are used in signal processing and image processing. For example, convolution can be used to apply a filter, and it may be possible to recover the original signal using deconvolution.

Noise reduction is the process of removing noise from a signal. Noise reduction techniques exist for audio and images. Noise reduction algorithms may distort the signal to some degree.

Discrete wavelet transform transform in numerical harmonic analysis

In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is any wavelet transform for which the wavelets are discretely sampled. As with other wavelet transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.

Template matching is a technique in digital image processing for finding small parts of an image which match a template image. It can be used in manufacturing as a part of quality control, a way to navigate a mobile robot, or as a way to detect edges in images.

Gabor filter Linear filter used for texture analysis

In image processing, a Gabor filter, named after Dennis Gabor, is a linear filter used for texture analysis, which essentially means that it analyzes whether there is any specific frequency content in the image in specific directions in a localized region around the point or region of analysis. Frequency and orientation representations of Gabor filters are claimed by many contemporary vision scientists to be similar to those of the human visual system. They have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.

Dot crawl

Dot crawl is a visual defect of color analog video standards when signals are transmitted as composite video, as in terrestrial broadcast television. It consists of moving checkerboard patterns which appear along horizontal color transitions. It results from intermodulation or crosstalk between chrominance and luminance components of the signal, which are imperfectly multiplexed in the frequency domain.

Image scaling Changing the resolution of a digital image

In computer graphics and digital imaging, imagescaling refers to the resizing of a digital image. In video technology, the magnification of digital material is known as upscaling or resolution enhancement.

A demosaicing algorithm is a digital image process used to reconstruct a full color image from the incomplete color samples output from an image sensor overlaid with a color filter array (CFA). It is also known as CFA interpolation or color reconstruction.

Block-matching algorithm

A Block Matching Algorithm is a way of locating matching macroblocks in a sequence of digital video frames for the purposes of motion estimation. The underlying supposition behind motion estimation is that the patterns corresponding to objects and background in a frame of video sequence move within the frame to form corresponding objects on the subsequent frame. This can be used to discover temporal redundancy in the video sequence, increasing the effectiveness of inter-frame video compression by defining the contents of a macroblock by reference to the contents of a known macroblock which is minimally different.

Bilateral filter

A bilateral filter is a non-linear, edge-preserving, and noise-reducing smoothing filter for images. It replaces the intensity of each pixel with a weighted average of intensity values from nearby pixels. This weight can be based on a Gaussian distribution. Crucially, the weights depend not only on Euclidean distance of pixels, but also on the radiometric differences. This preserves sharp edges.

Contourlets form a multiresolution directional tight frame designed to efficiently approximate images made of smooth regions separated by smooth boundaries. The contourlet transform has a fast implementation based on a Laplacian pyramid decomposition followed by directional filterbanks applied on each bandpass subband.

Edge-preserving smoothing is an image processing technique that smooths away noise or textures while retaining sharp edges. Examples are the median, bilateral, guided, and anisotropic diffusion filters.

Total variation denoising

In signal processing, total variation denoising, also known as total variation regularization, is a process, most often used in digital image processing, that has applications in noise removal. It is based on the principle that signals with excessive and possibly spurious detail have high total variation, that is, the integral of the absolute gradient of the signal is high. According to this principle, reducing the total variation of the signal—subject to it being a close match to the original signal—removes unwanted detail whilst preserving important details such as edges. The concept was pioneered by Rudin, Osher, and Fatemi in 1992 and so is today known as the ROF model.

In digital image and video processing, a color layout descriptor (CLD) is designed to capture the spatial distribution of color in an image. The feature extraction process consists of two parts: grid based representative color selection and discrete cosine transform with quantization.

In image Noise reduction, local pixel grouping is the algorithm to remove noise from images using principal component analysis (PCA).

HCL color space

HCL (Hue-Chroma-Luminance) or Lch refers to any of the many cylindrical color space models that are designed to accord with human perception of color with the three parameters. Lch has been adopted by information visualization practitioners to present data without the bias implicit in using varying saturation. They are, in general, designed to have characteristics of both cylindrical translations of the RGB color space, such as HSL and HSV, and the L*a*b* color space. Some conflicting definitions of the terms are:

Shrinkage fields is a random field-based machine learning technique that aims to perform high quality image restoration using low computational overhead.

In image processing, line detection is an algorithm that takes a collection of n edge points and finds all the lines on which these edge points lie. The most popular line detectors are the Hough transform and convolution-based techniques.

Video Super Resolution is the process of generating high-resolution video frames from the given low-resolution ones. The main goal is to restore more fine details, while saving coarse ones.


  1. Dabov, Kostadin; Foi, Alessandro; Katkovnik, Vladimir; Egiazarian, Karen (16 July 2007). "Image denoising by sparse 3D transform-domain collaborative filtering". IEEE Transactions on Image Processing. 16 (8): 2080–2095. Bibcode:2007ITIP...16.2080D. CiteSeerX . doi:10.1109/TIP.2007.901238.
  2. Danielyan, Aram; Katkovnik, Vladimir; Egiazarian, Karen (30 June 2011). "BM3D Frames and Variational Image Deblurring". IEEE Transactions on Image Processing. 21 (4): 1715–28. arXiv: 1106.6180 . Bibcode:2012ITIP...21.1715D. doi:10.1109/TIP.2011.2176954. PMID   22128008.
  3. Ahn, Byeongyong; Ik Cho, Nam (3 April 2017). "Block-Matching Convolutional Neural Network for Image Denoising". arXiv: 1704.00524 [Vision and Pattern Recognition Computer Vision and Pattern Recognition].
  4. "BMCNN-ISPL". Seoul National University. Retrieved 3 January 2018.
  5. "LASIP - Legal Notice". Tampere University of Technology (TUT). Retrieved 2 January 2018.
  6. Lebrun, Marc (8 August 2012). "An Analysis and Implementation of the BM3D Image Denoising Method". Image Processing on Line. 2: 175–213. doi: 10.5201/ipol.2012.l-bm3d .