Slab method

Last updated February 08, 2024

In computer graphics, the slab method is an algorithm used to solve the ray-box intersection problem for an axis-aligned bounding box (AABB), i.e. to determine the intersection points between a ray and the box. Because it is efficient and admits a branch-free implementation, it is widely used in computer graphics applications.^[1]^[2]

Algorithm

The idea behind the algorithm is to clip the ray with the planes containing the six faces of the box. Each pair of parallel planes defines a slab, and the volume contained in the box is the intersection of the three slabs. Therefore, the portion of the ray within the box (if any, i.e. if the ray actually intersects the box) is given by the intersection of the portions of the ray within each of the three slabs.^[3]

A three-dimensional AABB can be represented by two triples ${\boldsymbol {l}}=(l_{0},l_{1},l_{2})$ and ${\boldsymbol {h}}=(h_{0},h_{1},h_{2})$ denoting the low and high bounds of the box along each dimension. A point ${\boldsymbol {p}}(t)=(p_{0}(t),p_{1}(t),p_{2}(t))$ along a ray with origin ${\boldsymbol {o}}=(o_{0},o_{1},o_{2})$ and direction ${\boldsymbol {r}}=(r_{0},r_{1},r_{2})$ can be expressed in parametric form as

$${\boldsymbol {p}}(t)={\boldsymbol {o}}+t{\boldsymbol {r}}.$$

Assuming that all intersections exist, i.e. $r_{i}\neq 0\;\forall i$ , solving for $t$ along each coordinate $i$ gives

$$t={\frac {p_{i}-o_{i}}{r_{i}}}$$

and therefore the two intersections of the ray with the two planes orthogonal to the $i$ -th coordinate axis will be given by

$${\begin{aligned}t_{i}^{\text{low}}&={\frac {l_{i}-o_{i}}{r_{i}}}\\t_{i}^{\text{high}}&={\frac {h_{i}-o_{i}}{r_{i}}}.\end{aligned}}$$

The close and far extrema of the segment within the $i$ -th slab will be given by

$${\begin{aligned}t_{i}^{\text{close}}&=\min \left\{t_{i}^{\text{low}},t_{i}^{\text{high}}\right\}\\t_{i}^{\text{far}}&=\max \left\{t_{i}^{\text{low}},t_{i}^{\text{high}}\right\}\end{aligned}}$$
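The per-slab step can be sketched in a few lines of Python. This is a minimal illustration, not taken from the cited sources; the function name `slab_interval` and its scalar arguments are hypothetical, and the sketch assumes $r_{i}\neq 0$.

```python
def slab_interval(o_i, r_i, l_i, h_i):
    """Return (t_close, t_far) for a single slab.

    o_i, r_i: ray origin and direction along axis i (r_i must be nonzero).
    l_i, h_i: low and high bounds of the box along axis i.
    """
    t_low = (l_i - o_i) / r_i
    t_high = (h_i - o_i) / r_i
    # A negative r_i swaps the order in which the two planes are crossed,
    # which is why min/max are needed rather than using t_low, t_high directly.
    return min(t_low, t_high), max(t_low, t_high)
```

With a negative direction component the ray reaches the high plane first, so `slab_interval(5.0, -4.0, 0.0, 4.0)` yields the interval from 0.25 to 1.25.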

and the intersection of all these segments is

$${\begin{aligned}t^{\text{close}}&=\max _{i}\left\{t_{i}^{\text{close}}\right\}\\t^{\text{far}}&=\min _{i}\left\{t_{i}^{\text{far}}\right\}.\end{aligned}}$$
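As a worked illustration with made-up numbers (not taken from the cited sources), consider the box ${\boldsymbol {l}}=(0,0,0)$ , ${\boldsymbol {h}}=(4,4,4)$ and the ray with ${\boldsymbol {o}}=(-1,-1,5)$ and ${\boldsymbol {r}}=(1,2,-4)$ . The per-axis parameters are

$${\begin{aligned}t_{0}^{\text{low}}&={\frac {0-(-1)}{1}}=1,&t_{0}^{\text{high}}&={\frac {4-(-1)}{1}}=5,\\t_{1}^{\text{low}}&={\frac {0-(-1)}{2}}=0.5,&t_{1}^{\text{high}}&={\frac {4-(-1)}{2}}=2.5,\\t_{2}^{\text{low}}&={\frac {0-5}{-4}}=1.25,&t_{2}^{\text{high}}&={\frac {4-5}{-4}}=0.25.\end{aligned}}$$

Along the third axis the direction is negative, so the high plane is crossed first and $t_{2}^{\text{close}}=0.25$ , $t_{2}^{\text{far}}=1.25$ . Combining the three slabs gives

$${\begin{aligned}t^{\text{close}}&=\max\{1,\,0.5,\,0.25\}=1\\t^{\text{far}}&=\min\{5,\,2.5,\,1.25\}=1.25.\end{aligned}}$$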

The resulting segment lies inside the box, and therefore an intersection exists, only if $t^{\text{close}}\leq t^{\text{far}}$ .^[4] The sign of $t^{\text{close}}$ determines whether the intersection happens ahead of or behind the origin of the ray, which matters in applications such as ray casting, where only intersections in front of the camera are of interest. The two intersection points are therefore given by

$${\begin{aligned}{\boldsymbol {p}}^{\text{close}}&={\boldsymbol {o}}+t^{\text{close}}{\boldsymbol {r}}\\{\boldsymbol {p}}^{\text{far}}&={\boldsymbol {o}}+t^{\text{far}}{\boldsymbol {r}}.\end{aligned}}$$
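Putting the steps together, the whole procedure might look like the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the name `ray_box_intersection` and the tuple-based vectors are hypothetical, every component of the direction is assumed to be nonzero, and no check on the sign of $t^{\text{close}}$ is made, so hits behind the origin are reported as well.

```python
def ray_box_intersection(o, r, l, h):
    """Slab method for the ray o + t*r against the box with bounds l and h.

    Assumes every component of r is nonzero.
    Returns (t_close, t_far, p_close, p_far), or None if there is no hit.
    """
    t_close = float("-inf")
    t_far = float("inf")
    for o_i, r_i, l_i, h_i in zip(o, r, l, h):
        t_low = (l_i - o_i) / r_i
        t_high = (h_i - o_i) / r_i
        # Intersect the running segment with the segment inside this slab.
        t_close = max(t_close, min(t_low, t_high))
        t_far = min(t_far, max(t_low, t_high))
    if t_close > t_far:
        return None  # the three per-slab segments do not overlap
    p_close = tuple(o_i + t_close * r_i for o_i, r_i in zip(o, r))
    p_far = tuple(o_i + t_far * r_i for o_i, r_i in zip(o, r))
    return t_close, t_far, p_close, p_far


# A diagonal ray through the box [0, 2]^3 enters at one corner, leaves at the other.
hit = ray_box_intersection((-1.0, -1.0, -1.0), (1.0, 1.0, 1.0),
                           (0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
# Shifting the origin sideways makes the same direction miss the box.
miss = ray_box_intersection((-1.0, 3.0, -1.0), (1.0, 1.0, 1.0),
                            (0.0, 0.0, 0.0), (2.0, 2.0, 2.0))
```

A caller that only cares about intersections in front of the origin could additionally require `t_far >= 0`.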

The equations above are well defined for real-valued variables only if $r_{i}\neq 0\;\forall i$ , i.e. if the ray is not parallel to any of the coordinate planes. The algorithm can nevertheless be applied in an extended real number arithmetic (such as the one implemented by IEEE 754) to handle rays with a zero direction component, as long as the origin of the ray does not lie on one of the planes bounding the corresponding slab. In such arithmetic, the intersections with the planes parallel to the ray are given by $t=+\infty$ or $t=-\infty$ , and the algorithm still works as expected.

If the origin lies on a face plane that is parallel to the ray, then for some $i$ one obtains $t_{i}={\frac {0}{0}}$ , which is undefined (in IEEE 754 it is represented by NaN). However, implementations of the IEEE 754-2008 minNum and maxNum functions^[5] treat NaN as a missing value: when comparing a well-defined value with a NaN they always return the well-defined value,^[6] so the computation remains well defined even in this corner case.

An alternative approach to handling corner cases is to avoid divisions by zero altogether, which can be achieved by replacing the inverse of zero with a large arbitrary constant.^[4]
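The extended-arithmetic variant can be sketched in Python as follows. Python raises `ZeroDivisionError` on float division by zero instead of following IEEE 754, so this sketch emulates the IEEE behaviour with a hypothetical helper `ieee_div`, and emulates the NaN-ignoring minimum and maximum with hypothetical helpers `fmin` and `fmax` modelled on the C99 functions of the same name. The function `slab_hit` is likewise an assumption made for illustration.

```python
import math


def ieee_div(a, b):
    """Divide like IEEE 754: x/0 gives +-inf and 0/0 gives NaN."""
    if b != 0.0:
        return a / b
    if a == 0.0 or math.isnan(a):
        return math.nan
    # The sign of the infinity follows the signs of both operands (b may be -0.0).
    return math.copysign(math.inf, a) * math.copysign(1.0, b)


def fmin(a, b):
    """Minimum that returns the other argument when one argument is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fmax(a, b):
    """Maximum that returns the other argument when one argument is NaN."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)


def slab_hit(o, r, l, h):
    """Return (hit, t_close, t_far); zero direction components are allowed."""
    t_close, t_far = -math.inf, math.inf
    for o_i, r_i, l_i, h_i in zip(o, r, l, h):
        t_low = ieee_div(l_i - o_i, r_i)
        t_high = ieee_div(h_i - o_i, r_i)
        t_close = fmax(t_close, fmin(t_low, t_high))
        t_far = fmin(t_far, fmax(t_low, t_high))
    return t_close <= t_far, t_close, t_far
```

For a ray parallel to the first axis, a slab that contains the origin contributes the interval from $-\infty$ to $+\infty$ and leaves the result unchanged, while a slab that does not contain it contributes an empty interval and forces a miss. In this particular formulation, a ray lying exactly in a face plane (the NaN case) ends up with an infinite bound from that slab and is reported as a miss; other arrangements of the min and max operations can treat the boundary differently.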

References

Sources

IEEE Standards Committee (2008). IEEE standard for floating-point arithmetic: 754-2008. Washington, DC: IEEE Computer Society.
Kay, Timothy L.; Kajiya, James T. (1986). Ray tracing complex scenes. ACM SIGGRAPH computer graphics. Vol. 20. Dallas. pp. 269–278.
Majercik, Alexander; Crassin, Cyril; Shirley, Peter; McGuire, Morgan (2018). "A Ray-Box Intersection Algorithm and Efficient Dynamic Voxel Rendering". Journal of Computer Graphics Techniques (JCGT). 7 (3): 66–81.
Shirley, Peter; Wald, Ingo; Marrs, Adam (2021). "Ray Axis-Aligned Bounding Box Intersection". Ray Tracing Gems II. Berkeley, CA: Apress.

External links

Barnes, Tavian (27 July 2022). "Fast, Branchless Ray/Bounding Box Intersections, Part 3: Boundaries". Archived from the original on 2023-07-22.

[1] Shirley, Wald & Marrs 2021.

[2] Majercik et al. 2018.

[3] Kay & Kajiya 1986, p. 272.

[4] Kay & Kajiya 1986, p. 273.

[5] For example, fmin and fmax in C99.

[6] IEEE Standards Committee 2008.