In image processing, line detection is an algorithm that takes a collection of n edge points and finds all the lines on which these edge points lie. [1] The most popular line detectors are the Hough transform and convolution-based techniques. [2]
The Hough transform [3] can be used to detect lines; its output is a parametric description of the lines in an image, for example ρ = r cos(θ) + c sin(θ). [1] A line in a row-and-column image space can be defined by ρ, the distance from the origin to the line along a perpendicular to the line, and θ, the angle of that perpendicular measured in degrees clockwise from the positive row axis. A line in the image therefore corresponds to a point in the Hough space. [4] The Hough space for lines thus has two dimensions, θ and ρ, and a line is represented by a single point corresponding to a unique pair of these parameters. The Hough transform can then be implemented by choosing a discrete set of values of ρ and θ to use. For each edge pixel (r, c) in the image, compute r cos(θ) + c sin(θ) for each value of θ, and increment the corresponding position in the (ρ, θ) accumulator array. At the end, the cells of the (ρ, θ) array with the highest counts correspond to the strongest lines in the image.
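As an illustration, the following MATLAB fragment is a minimal sketch of the accumulator procedure just described (it is not the program used later in this article; the demo image cameraman.tif ships with the Image Processing Toolbox, and the variable names are illustrative):

% Minimal Hough accumulator sketch on a binary edge image E.
E = edge(imread('cameraman.tif'), 'canny');   % binary edge image (toolbox demo image)
thetas = -90:89;                              % discrete set of angles, in degrees
[r, c] = find(E);                             % row/column coordinates of edge pixels
rhoMax = ceil(hypot(size(E, 1), size(E, 2))); % largest possible |rho|
acc = zeros(2*rhoMax + 1, numel(thetas));     % the (rho, theta) accumulator array
for k = 1:numel(r)
    for t = 1:numel(thetas)
        rho = round(r(k)*cosd(thetas(t)) + c(k)*sind(thetas(t)));
        acc(rho + rhoMax + 1, t) = acc(rho + rhoMax + 1, t) + 1;
    end
end
% The highest-count cells of acc correspond to the strongest lines.
[~, idx] = max(acc(:));
[rhoIdx, thetaIdx] = ind2sub(size(acc), idx);
bestRho = rhoIdx - rhoMax - 1;                % rho of the strongest line
bestTheta = thetas(thetaIdx);                 % theta of the strongest line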
In a convolution-based technique, the line detector operator consists of a set of convolution masks tuned to detect the presence of lines of a particular width n and orientation θ. Below are four convolution masks to detect horizontal, vertical, oblique (+45 degrees), and oblique (−45 degrees) lines in an image.
(a) Horizontal (R1)
−1 −1 −1
 2  2  2
−1 −1 −1

(b) Vertical (R3)
−1  2 −1
−1  2 −1
−1  2 −1

(c) Oblique, +45 degrees (R2)
−1 −1  2
−1  2 −1
 2 −1 −1

(d) Oblique, −45 degrees (R4)
 2 −1 −1
−1  2 −1
−1 −1  2
In practice, the masks are run over the image and their responses are combined according to the following equation:
R(x, y) = max(|R1(x, y)|, |R2(x, y)|, |R3(x, y)|, |R4(x, y)|)
If R(x, y) > T, where T is a chosen nonnegative threshold, a line discontinuity is declared at the point (x, y).
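A minimal MATLAB sketch of this four-mask detector follows (the threshold value and the demo image cameraman.tif are illustrative assumptions, not taken from this article):

% Four-mask line detector sketch; masks are those listed above.
I  = im2double(imread('cameraman.tif'));
R1 = imfilter(I, [-1 -1 -1;  2  2  2; -1 -1 -1]);   % horizontal
R2 = imfilter(I, [-1 -1  2; -1  2 -1;  2 -1 -1]);   % oblique, +45 degrees
R3 = imfilter(I, [-1  2 -1; -1  2 -1; -1  2 -1]);   % vertical
R4 = imfilter(I, [ 2 -1 -1; -1  2 -1; -1 -1  2]);   % oblique, -45 degrees
R  = max(max(abs(R1), abs(R2)), max(abs(R3), abs(R4)));  % R(x, y) as defined above
T  = 0.5;                                           % illustrative threshold
lines = R > T;                                      % points declared to lie on lines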
As can be seen below, if the mask is overlaid on the image (here, one containing a horizontal line), the coincident values are multiplied and the results summed, producing one pixel of the convolved image. For example, (−1)(0) + (−1)(0) + (−1)(0) + (2)(1) + (2)(1) + (2)(1) + (−1)(0) + (−1)(0) + (−1)(0) = 6 is the value at the second row, second column of the convolved image, counting from its upper-left corner. [1] (page 82)
Mask             Horizontal line       Convolved image
−1 −1 −1         0 0 0 0               -  -  -  -
 2  2  2    *    1 1 1 1         =     -  6  6  -
−1 −1 −1         0 0 0 0               -  -  -  -

Mask             Vertical line         Convolved image
−1 −1 −1         0 0 1 0               -  -  -  -
 2  2  2    *    0 0 1 0         =     -  0  0  -
−1 −1 −1         0 0 1 0               -  -  -  -
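The same two results can be checked in MATLAB with conv2 (a minimal sketch; the matrices are exactly those of the example above, and this mask is symmetric under 180-degree rotation, so convolution and correlation coincide):

% Worked example from the tables above.
R1 = [-1 -1 -1; 2 2 2; -1 -1 -1];   % horizontal-line mask
H  = [0 0 0 0; 1 1 1 1; 0 0 0 0];   % image containing a horizontal line
V  = [0 0 1 0; 0 0 1 0; 0 0 1 0];   % image containing a vertical line
conv2(H, R1, 'valid')               % returns [6 6], the interior of the convolved image
conv2(V, R1, 'valid')               % returns [0 0], no horizontal-line response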
The masks above are tuned for light lines against a dark background and would give a large negative response to dark lines against a light background. [5]
The code below was used to detect only the vertical lines in an image using MATLAB. As can be seen in the resulting figure, only the vertical lines were detected.
clear all
clc
% This MATLAB program will only detect vertical lines in an image
building = imread('building.jpg');          % load the image "building"
if size(building, 3) == 3                   % convert to grayscale if needed,
    building = rgb2gray(building);          % since imgradient expects a 2-D image
end
tol = 5;    % tolerance in the angle, to account for noise or edges that may
            % look vertical but whose computed angle is not exactly vertical
[~, angle] = imgradient(building);          % gradient direction at each pixel
out = (angle >= 180 - tol | angle <= -180 + tol);   % keep near-vertical edges
out_filter = bwareaopen(out, 50);           % filter out regions smaller than 50 pixels
figure, imshow(building), title('Original Image');
figure, imshow(out_filter), title('Detected Lines');
A 3D projection is a design technique used to display a three-dimensional (3D) object on a two-dimensional (2D) surface. These projections rely on visual perspective and aspect analysis to project a complex object for viewing capability on a simpler plane.
Edge detection includes a variety of mathematical methods that aim at identifying edges, defined as curves in a digital image at which the image brightness changes sharply or, more formally, has discontinuities. The same problem of finding discontinuities in one-dimensional signals is known as step detection and the problem of finding signal discontinuities over time is known as change detection. Edge detection is a fundamental tool in image processing, machine vision and computer vision, particularly in the areas of feature detection and feature extraction.
The Hough transform is a feature extraction technique used in image analysis, computer vision, pattern recognition, and digital image processing. The purpose of the technique is to find imperfect instances of objects within a certain class of shapes by a voting procedure. This voting procedure is carried out in a parameter space, from which object candidates are obtained as local maxima in a so-called accumulator space that is explicitly constructed by the algorithm for computing the Hough transform.
The Sobel operator, sometimes called the Sobel–Feldman operator or Sobel filter, is used in image processing and computer vision, particularly within edge detection algorithms where it creates an image emphasising edges. It is named after Irwin Sobel and Gary M. Feldman, colleagues at the Stanford Artificial Intelligence Laboratory (SAIL). Sobel and Feldman presented the idea of an "Isotropic 3 × 3 Image Gradient Operator" at a talk at SAIL in 1968. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel–Feldman operator is either the corresponding gradient vector or the norm of this vector. The Sobel–Feldman operator is based on convolving the image with a small, separable, and integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computations. On the other hand, the gradient approximation that it produces is relatively crude, in particular for high-frequency variations in the image.
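As a concrete illustration of that convolution, the following minimal MATLAB sketch applies the standard Sobel kernels to a toolbox demo image (cameraman.tif is an assumption for illustration; imfilter's 'conv' option makes it perform convolution rather than correlation):

% Minimal sketch of the Sobel operator.
I  = im2double(imread('cameraman.tif'));    % grayscale demo image
Sx = [-1 0 1; -2 0 2; -1 0 1];              % horizontal-derivative kernel
Sy = Sx';                                   % vertical-derivative kernel
Gx = imfilter(I, Sx, 'replicate', 'conv');  % horizontal gradient approximation
Gy = imfilter(I, Sy, 'replicate', 'conv');  % vertical gradient approximation
G  = hypot(Gx, Gy);                         % norm of the gradient vector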
The Roberts cross operator is used in image processing and computer vision for edge detection. It was one of the first edge detectors and was initially proposed by Lawrence Roberts in 1963. As a differential operator, the idea behind the Roberts cross operator is to approximate the gradient of an image through discrete differentiation which is achieved by computing the sum of the squares of the differences between diagonally adjacent pixels.
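A minimal MATLAB sketch of this computation follows (the 2 × 2 kernels are the standard Roberts cross pair; the demo image is again an assumption for illustration):

% Minimal sketch of the Roberts cross operator.
I  = im2double(imread('cameraman.tif'));
D1 = imfilter(I, [1 0; 0 -1]);     % difference along one diagonal
D2 = imfilter(I, [0 1; -1 0]);     % difference along the other diagonal
G  = D1.^2 + D2.^2;                % sum of squares of the diagonal differences,
                                   % as described above (often followed by a square root)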
The Canny edge detector is an edge detection operator that uses a multi-stage algorithm to detect a wide range of edges in images. It was developed by John F. Canny in 1986. Canny also produced a computational theory of edge detection explaining why the technique works.
Tomographic reconstruction is a type of multidimensional inverse problem where the challenge is to yield an estimate of a specific system from a finite number of projections. The mathematical basis for tomographic imaging was laid down by Johann Radon. A notable example of its application is computed tomography (CT), where cross-sectional images of patients are obtained in a non-invasive manner. Recent developments have seen the Radon transform and its inverse used for tasks related to realistic object insertion required for testing and evaluating computed tomography use in airport security.
The scale-invariant feature transform (SIFT) is a computer vision algorithm to detect, describe, and match local features in images, invented by David Lowe in 1999. Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, individual identification of wildlife and match moving.
In image processing, a Gabor filter, named after Dennis Gabor, who first proposed it as a 1D filter, is a linear filter used for texture analysis. It was first generalized to 2D by Gösta Granlund, by adding a reference direction. Texture analysis here essentially means that the filter analyzes whether there is any specific frequency content in the image in specific directions in a localized region around the point or region of analysis. Frequency and orientation representations of Gabor filters are claimed by many contemporary vision scientists to be similar to those of the human visual system. They have been found to be particularly appropriate for texture representation and discrimination. In the spatial domain, a 2D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave.
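For reference, a commonly used real form of the 2D Gabor filter (the symbols below are the conventional ones, not defined elsewhere in this article) is
g(x, y) = exp(−(x′² + γ² y′²) / (2σ²)) cos(2π x′/λ + ψ),
with x′ = x cos θ + y sin θ and y′ = −x sin θ + y cos θ, where σ is the standard deviation of the Gaussian envelope, λ and ψ are the wavelength and phase offset of the sinusoid, θ is the orientation, and γ is the spatial aspect ratio.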
Optical resolution describes the ability of an imaging system to resolve detail, in the object that is being imaged. An imaging system may have many individual components, including one or more lenses, and/or recording and display components. Each of these contributes to the optical resolution of the system; the environment in which the imaging is done often is a further important factor.
The Prewitt operator is used in image processing, particularly within edge detection algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Prewitt operator is either the corresponding gradient vector or the norm of this vector. The Prewitt operator is based on convolving the image with a small, separable, and integer-valued filter in the horizontal and vertical directions and is therefore relatively inexpensive in terms of computation, like the Sobel and Kayyali operators. On the other hand, the gradient approximation it produces is relatively crude, in particular for high-frequency variations in the image. The Prewitt operator was developed by Judith M. S. Prewitt.
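The separability mentioned above can be made explicit in a minimal MATLAB sketch (the demo image is an illustrative assumption):

% Minimal sketch of the Prewitt operator, built from its separable parts.
I  = im2double(imread('cameraman.tif'));
Px = [1; 1; 1] * [-1 0 1];                  % averaging column times differencing row
Py = Px';                                   % transpose gives the other direction
Gx = imfilter(I, Px, 'replicate', 'conv');
Gy = imfilter(I, Py, 'replicate', 'conv');
G  = hypot(Gx, Gy);                         % norm of the gradient vector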
In computer vision and image processing, a feature is a piece of information about the content of an image; typically about whether a certain region of the image has certain properties. Features may be specific structures in the image such as points, edges or objects. Features may also be the result of a general neighborhood operation or feature detection applied to the image. Other examples of features are related to motion in image sequences, or to shapes defined in terms of curves or boundaries between different image regions.
An image gradient is a directional change in the intensity or color in an image. The gradient of the image is one of the fundamental building blocks in image processing. For example, the Canny edge detector uses image gradient for edge detection. In graphics software for digital image editing, the term gradient or color gradient is also used for a gradual blend of color which can be considered as an even gradation from low to high values, and seen from black to white in the images to the right. Another name for this is color progression.
The generalized Hough transform (GHT), introduced by Dana H. Ballard in 1981, is the modification of the Hough transform using the principle of template matching. The Hough transform was initially developed to detect analytically defined shapes. In these cases, we have knowledge of the shape and aim to find out its location and orientation in the image. This modification enables the Hough transform to be used to detect an arbitrary object described with its model.
In computer vision, speeded up robust features (SURF) is a local feature detector and descriptor, with patented applications. It can be used for tasks such as object recognition, image registration, classification, or 3D reconstruction. It is partly inspired by the scale-invariant feature transform (SIFT) descriptor. The standard version of SURF is several times faster than SIFT and claimed by its authors to be more robust against different image transformations than SIFT.
The Viola–Jones object detection framework is a machine learning object detection framework proposed in 2001 by Paul Viola and Michael Jones. It was motivated primarily by the problem of face detection, although it can be adapted to the detection of other object classes.
The Deriche edge detector is an edge detection operator developed by Rachid Deriche in 1987. It is a multistep algorithm used to obtain an optimal result of edge detection in a discrete two-dimensional image. This algorithm is based on John F. Canny's work related to edge detection and his criteria for optimal edge detection: good detection quality, accurate edge localization, and a single response per edge.
In image processing, a kernel, convolution matrix, or mask is a small matrix used for blurring, sharpening, embossing, edge detection, and more. This is accomplished by doing a convolution between the kernel and an image. Or more simply, when each pixel in the output image is a function of the nearby pixels in the input image, the kernel is that function.
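A minimal sketch of that idea (the image and averaging kernel are illustrative):

% Blurring as a convolution between a kernel and an image.
I = magic(5);                       % any small grayscale image
K = ones(3) / 9;                    % 3 x 3 averaging (box-blur) kernel
B = conv2(I, K, 'same');            % each interior output pixel becomes the
                                    % mean of its 3 x 3 input neighborhood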
Chessboards arise frequently in computer vision theory and practice because their highly structured geometry is well-suited for algorithmic detection and processing. The appearance of chessboards in computer vision can be divided into two main areas: camera calibration and feature extraction. This article provides a unified discussion of the role that chessboards play in the canonical methods from these two areas, including references to the seminal literature, examples, and pointers to software implementations.
In image analysis, the generalized structure tensor (GST) is an extension of the Cartesian structure tensor to curvilinear coordinates. It is mainly used to detect and to represent the "direction" parameters of curves, just as the Cartesian structure tensor detects and represents the direction in Cartesian coordinates. Curve families generated by pairs of locally orthogonal functions have been the best studied.