Detection theory

Detection theory or signal detection theory is a means to measure the ability to differentiate between information-bearing patterns (called stimulus in living organisms, signal in machines) and random patterns that distract from the information (called noise, consisting of background stimuli and random activity of the detection machine and of the nervous system of the operator).

In the field of electronics, signal recovery is the separation of such patterns from a disguising background. [1]

According to the theory, there are a number of determiners of how a detecting system will detect a signal, and where its threshold levels will be. The theory can explain how changing the threshold will affect the ability to discern, often exposing how well adapted the system is to the task, purpose or goal at which it is aimed. When the detecting system is a human being, characteristics such as experience, expectations, physiological state (e.g., fatigue) and other factors can affect the threshold applied. For instance, a sentry in wartime might detect fainter stimuli than the same sentry in peacetime because of a lower criterion; however, they might also be more likely to treat innocuous stimuli as a threat.

Much of the early work in detection theory was done by radar researchers. [2] By 1954, the theory was fully developed on the theoretical side, as described by Peterson, Birdsall and Fox, [3] and the foundation for the psychological theory was laid by Wilson P. Tanner, David M. Green, and John A. Swets, also in 1954. [4] Detection theory was applied to psychophysics by John A. Swets and David M. Green in 1966. [5] Green and Swets criticized the traditional methods of psychophysics for their inability to discriminate between the real sensitivity of subjects and their (potential) response biases. [6]

Detection theory has applications in many fields such as diagnostics of any kind, quality control, telecommunications, and psychology. The concept is similar to the signal-to-noise ratio used in the sciences and confusion matrices used in artificial intelligence. It is also usable in alarm management, where it is important to separate important events from background noise.

Psychology

Signal detection theory (SDT) is used when psychologists want to measure the way we make decisions under conditions of uncertainty, such as how we would perceive distances in foggy conditions or during eyewitness identification. [7] [8] SDT assumes that the decision maker is not a passive receiver of information, but an active decision-maker who makes difficult perceptual judgments under conditions of uncertainty. In foggy circumstances, we are forced to decide how far away an object is based solely upon visual stimuli, which are degraded by the fog. Since the brightness of an object, such as a traffic light, is used by the brain to discriminate its distance, and the fog reduces the brightness of objects, we perceive the object to be much farther away than it actually is (see also decision theory). According to SDT, during eyewitness identifications, witnesses base their decision about whether a suspect is the culprit on their perceived level of familiarity with the suspect.

To apply signal detection theory to a data set where stimuli were either present or absent, and the observer categorized each trial as having the stimulus present or absent, the trials are sorted into one of four categories:

Respond "Absent"Respond "Present"
Stimulus Present Miss Hit
Stimulus AbsentCorrect Rejection False Alarm

Based on the proportions of these types of trials, numerical estimates of sensitivity can be obtained with statistics like the sensitivity index d' and A', [9] and response bias can be estimated with statistics like c and β. [9] [10]
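
As a concrete illustration, the following is a minimal Python sketch of these computations under the standard equal-variance Gaussian model; the trial counts are invented for the example.

```python
# Sketch: sensitivity (d') and bias (c, beta) from trial counts,
# assuming the equal-variance Gaussian model. Counts are illustrative.
import math
from scipy.stats import norm

hits, misses = 75, 25                        # stimulus-present trials
false_alarms, correct_rejections = 20, 80    # stimulus-absent trials

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + correct_rejections)

z = norm.ppf                                 # inverse of the normal CDF
d_prime = z(hit_rate) - z(fa_rate)           # sensitivity
c = -0.5 * (z(hit_rate) + z(fa_rate))        # bias; 0 means unbiased
beta = math.exp(c * d_prime)                 # likelihood ratio at the criterion
print(f"d' = {d_prime:.2f}, c = {c:.2f}, beta = {beta:.2f}")
```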

Signal detection theory can also be applied to memory experiments, where items are presented on a study list for later testing. A test list is created by combining these 'old' items with novel, 'new' items that did not appear on the study list. On each test trial the subject will respond 'yes, this was on the study list' or 'no, this was not on the study list'. Items presented on the study list are called Targets, and new items are called Distractors. Saying 'Yes' to a target constitutes a Hit, while saying 'Yes' to a distractor constitutes a False Alarm.

Respond "No"Respond "Yes"
Target Miss Hit
DistractorCorrect Rejection False Alarm

Applications

Signal detection theory has wide application in both humans and animals, to topics including memory and the stimulus characteristics of schedules of reinforcement.

Sensitivity or discriminability

Conceptually, sensitivity refers to how hard or easy it is to detect that a target stimulus is present against background events. For example, in a recognition memory paradigm, having longer to study to-be-remembered words makes it easier to recognize previously seen or heard words. In contrast, having to remember 30 words rather than 5 makes the discrimination harder. One of the most commonly used statistics for computing sensitivity is the so-called sensitivity index or d'. There are also non-parametric measures, such as the area under the ROC curve. [6]
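
Under the equal-variance Gaussian model the two measures are directly related: the area under the ROC curve equals Φ(d'/√2). A short sketch of that relation, with d' chosen arbitrarily:

```python
# Sketch: area under the ROC curve implied by a given d'
# under the equal-variance Gaussian model.
from scipy.stats import norm

d_prime = 1.5
auc = norm.cdf(d_prime / 2 ** 0.5)
print(f"AUC = {auc:.3f}")    # about 0.856 for d' = 1.5
```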

Bias

Bias is the extent to which one response is more probable than another, averaging across stimulus-present and stimulus-absent cases. That is, a receiver may be more likely overall to respond that a stimulus is present or more likely overall to respond that a stimulus is not present. Bias is independent of sensitivity. Bias can be desirable if false alarms and misses lead to different costs. For example, if the stimulus is a bomber, then a miss (failing to detect the bomber) may be more costly than a false alarm (reporting a bomber when there is not one), making a liberal response bias desirable. In contrast, giving false alarms too often (crying wolf) may make people less likely to respond, a problem that can be reduced by a conservative response bias.
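
To see how bias trades misses against false alarms while sensitivity stays fixed, here is a small sketch under the same equal-variance Gaussian model, sweeping an assumed criterion c at a fixed d':

```python
# Sketch: shifting the criterion c changes hit and false-alarm rates
# together while d' stays fixed (equal-variance Gaussian model).
from scipy.stats import norm

d_prime = 1.0
for c in (-0.5, 0.0, 0.5):                  # liberal, unbiased, conservative
    hit_rate = norm.cdf(d_prime / 2 - c)
    fa_rate = norm.cdf(-d_prime / 2 - c)
    print(f"c = {c:+.1f}: hit rate = {hit_rate:.2f}, "
          f"false-alarm rate = {fa_rate:.2f}")
```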

Compressed sensing

Another field closely related to signal detection theory is compressed sensing (or compressive sensing). The objective of compressed sensing is to recover high-dimensional but low-complexity entities from only a few measurements. Thus, one of the most important applications of compressed sensing is the recovery of high-dimensional signals which are known to be sparse (or nearly sparse) from only a few linear measurements. Provided that the signal is sparse, meaning that it contains only a few non-zero elements, the number of measurements needed for recovery is far smaller than what the Nyquist sampling theorem requires. There are different methods of signal recovery in compressed sensing, including basis pursuit, the expander recovery algorithm, [11] CoSaMP [12] and a fast non-iterative algorithm. [13] In all of the recovery methods mentioned above, choosing an appropriate measurement matrix, using probabilistic or deterministic constructions, is of great importance. In other words, measurement matrices must satisfy certain conditions, such as the RIP (restricted isometry property) or the null-space property, in order to achieve robust sparse recovery.
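
As an illustration of the idea, here is a minimal sketch of sparse recovery with orthogonal matching pursuit, a simple greedy alternative to the basis-pursuit and CoSaMP methods named above; the random Gaussian measurement matrix, the sizes and the sparsity level are assumptions made for the example.

```python
# Sketch: recovering a k-sparse signal from m << n linear measurements
# with orthogonal matching pursuit. A random Gaussian measurement matrix
# satisfies the RIP with high probability; sizes here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 100, 30, 3                        # signal length, measurements, sparsity
A = rng.normal(size=(m, n)) / np.sqrt(m)    # measurement matrix
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)  # k-sparse signal
y = A @ x                                   # the m linear measurements

support, residual = [], y.copy()
for _ in range(k):
    support.append(int(np.argmax(np.abs(A.T @ residual))))    # best-correlated column
    coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)  # refit on the support
    residual = y - A[:, support] @ coef

x_hat = np.zeros(n)
x_hat[support] = coef
print("max recovery error:", np.max(np.abs(x_hat - x)))
```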

Mathematics

P(H1|y) > P(H2|y) / MAP testing

In the case of making a decision between two hypotheses, H1, absent, and H2, present, in the event of a particular observation, y, a classical approach is to choose H1 when p(H1|y) > p(H2|y) and H2 in the reverse case. [14] In the event that the two a posteriori probabilities are equal, one might choose to default to a single choice (either always choose H1 or always choose H2), or might randomly select either H1 or H2. The a priori probabilities of H1 and H2 can guide this choice, e.g. by always choosing the hypothesis with the higher a priori probability.

When taking this approach, usually what one knows are the conditional probabilities, p(y|H1) and p(y|H2), and the a priori probabilities $\pi_1 = P(H_1)$ and $\pi_2 = P(H_2)$. In this case,

$p(H_1|y) = \frac{p(y|H_1)\,\pi_1}{p(y)}$,

where p(y) is the total probability of event y,

$p(y) = p(y|H_1)\,\pi_1 + p(y|H_2)\,\pi_2$.

H2 is chosen in case

$p(y|H_2)\,\pi_2 \ge p(y|H_1)\,\pi_1$

and H1 otherwise.

Often, the ratio $\pi_1 / \pi_2$ is called $\tau_{\mathrm{MAP}}$ and $p(y|H_2)/p(y|H_1)$ is called $L(y)$, the likelihood ratio.

Using this terminology, H2 is chosen in case $L(y) \ge \tau_{\mathrm{MAP}}$. This is called MAP testing, where MAP stands for "maximum a posteriori".

Taking this approach minimizes the expected number of errors one will make.
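
A minimal sketch of such a MAP test, assuming for illustration two Gaussian hypotheses, H1: y ~ N(0, 1) and H2: y ~ N(1, 1), with unequal priors:

```python
# Sketch: MAP testing between two hypotheses. The Gaussian likelihoods
# and the priors are assumptions made for the example.
from scipy.stats import norm

pi1, pi2 = 0.7, 0.3                         # a priori probabilities
tau_map = pi1 / pi2                         # MAP threshold

def decide(y):
    L = norm.pdf(y, loc=1.0) / norm.pdf(y, loc=0.0)  # likelihood ratio L(y)
    return "H2" if L >= tau_map else "H1"

for y in (0.2, 0.9, 2.0):
    print(f"y = {y}: choose {decide(y)}")
```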

Bayes criterion

In some cases, it is far more important to respond appropriately to H1 than it is to respond appropriately to H2. For example, if an alarm goes off, indicating H1 (an incoming bomber is carrying a nuclear weapon), it is much more important to shoot down the bomber if H1 = TRUE, than it is to avoid sending a fighter squadron to inspect a false alarm (i.e., H1 = FALSE, H2 = TRUE) (assuming a large supply of fighter squadrons). The Bayes criterion is an approach suitable for such cases. [14]

Here a utility is associated with each of four situations:

$U_{11}$: responding as though H1 is true when H1 is in fact true;
$U_{12}$: responding as though H1 is true when H2 is in fact true;
$U_{21}$: responding as though H2 is true when H1 is in fact true;
$U_{22}$: responding as though H2 is true when H2 is in fact true.

As is shown below, what is important are the differences, $U_{11} - U_{21}$ and $U_{22} - U_{12}$.

Similarly, there are four probabilities, $P_{11}$, $P_{12}$, etc., for each of the cases (which are dependent on one's decision strategy).

The Bayes criterion approach is to maximize the expected utility:

$U = P_{11} U_{11} + P_{21} U_{21} + P_{12} U_{12} + P_{22} U_{22}$

Effectively, one may maximize the sum,

$U' = P_{11} (U_{11} - U_{21}) + P_{22} (U_{22} - U_{12})$,

and make the following substitutions:

$P_{11} = \pi_1 \int_{R_1} p(y|H_1)\, dy$
$P_{22} = \pi_2 \left( 1 - \int_{R_1} p(y|H_2)\, dy \right)$

where $\pi_1$ and $\pi_2$ are the a priori probabilities, $P(H_1)$ and $P(H_2)$, and $R_1$ is the region of observation events, y, that are responded to as though H1 is true.

$U'$, and thus $U$, are maximized by extending $R_1$ over the region where

$\pi_1 (U_{11} - U_{21})\, p(y|H_1) > \pi_2 (U_{22} - U_{12})\, p(y|H_2)$

This is accomplished by deciding H2 in case

$\pi_2 (U_{22} - U_{12})\, p(y|H_2) \ge \pi_1 (U_{11} - U_{21})\, p(y|H_1)$, i.e. $L(y) \ge \frac{\pi_1 (U_{11} - U_{21})}{\pi_2 (U_{22} - U_{12})} \equiv \tau_B$,

and H1 otherwise, where L(y) is the so-defined likelihood ratio.
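
A minimal sketch of such a test, reusing the Gaussian hypotheses from the MAP example; the priors and the utilities (a catastrophic failure to respond to the bomber versus a cheap wasted sortie) are assumptions made for illustration:

```python
# Sketch: Bayes-criterion test with asymmetric utilities. Priors,
# utilities and the Gaussian likelihoods are illustrative assumptions.
from scipy.stats import norm

pi1, pi2 = 0.5, 0.5
U11, U21 = 0.0, -100.0    # H1 true: intercepting vs. missing the bomber
U12, U22 = -1.0, 0.0      # H2 true: a wasted sortie vs. staying home
tau_b = (pi1 * (U11 - U21)) / (pi2 * (U22 - U12))  # = 100: H2 needs strong evidence

def decide(y):
    L = norm.pdf(y, loc=1.0) / norm.pdf(y, loc=0.0)  # likelihood ratio L(y)
    return "H2" if L >= tau_b else "H1"

print(decide(2.0))        # "H1": even y = 2.0 is not enough to rule out a bomber
```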

Normal distribution models

Das and Geisler [15] extended the results of signal detection theory for normally distributed stimuli, and derived methods of computing the error rate and confusion matrix for ideal observers and non-ideal observers for detecting and categorizing univariate and multivariate normal signals from two or more categories.
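
For the simplest univariate case, two equiprobable normal categories with equal variance, the ideal observer's error rate has a closed form; a sketch under those assumptions, with illustrative parameters:

```python
# Sketch: ideal-observer error rate for two equiprobable normal
# categories with equal variance; means and sigma are illustrative.
from scipy.stats import norm

mu1, mu2, sigma = 0.0, 1.5, 1.0
d_prime = abs(mu2 - mu1) / sigma
error_rate = norm.cdf(-d_prime / 2)   # optimal boundary at the midpoint
print(f"d' = {d_prime:.2f}, error rate = {error_rate:.3f}")
```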

References

  1. Wilmshurst, T. H. (1990). Signal Recovery from Noise in Electronic Instrumentation (2nd ed.). CRC Press. pp. 11 ff. ISBN 978-0-7503-0058-2.
  2. Marcum, J. I. (1947). "A Statistical Theory of Target Detection by Pulsed Radar". The Research Memorandum: 90. Retrieved 2009-06-28.
  3. Peterson, W.; Birdsall, T.; Fox, W. (September 1954). "The theory of signal detectability". Transactions of the IRE Professional Group on Information Theory. 4 (4): 171–212. doi:10.1109/TIT.1954.1057460.
  4. Tanner, Wilson P.; Swets, John A. (1954). "A decision-making theory of visual detection". Psychological Review. 61 (6): 401–409. doi:10.1037/h0058700. PMID 13215690.
  5. Swets, J. A. (ed.) (1964). Signal Detection and Recognition by Human Observers. New York: Wiley.
  6. Green, D. M.; Swets, J. A. (1966). Signal Detection Theory and Psychophysics. New York: Wiley. ISBN 0-471-32420-5.
  7. Clark, Steven E.; Benjamin, Aaron S.; Wixted, John T.; Mickes, Laura; Gronlund, Scott D. (2015). "Eyewitness Identification and the Accuracy of the Criminal Justice System". Policy Insights from the Behavioral and Brain Sciences. 2: 175–186. doi:10.1177/2372732215602267. hdl:11244/49353. S2CID 18529957.
  8. Haw, Ryann Michelle (January 2005). "A theoretical analysis of eyewitness identification: Dual-process theory, signal detection theory and eyewitness confidence". ProQuest ETD Collection for FIU: 1–98.
  9. Stanislaw, Harold; Todorov, Natasha (March 1999). "Calculation of signal detection theory measures". Behavior Research Methods, Instruments, & Computers. 31 (1): 137–149. doi:10.3758/BF03207704. PMID 10495845.
  10. "Signal Detection Theory". elvers.us. Retrieved 2023-07-14.
  11. Jafarpour, Sina; Xu, Weiyu; Hassibi, Babak; Calderbank, Robert (September 2009). "Efficient and Robust Compressed Sensing Using Optimized Expander Graphs" (PDF). IEEE Transactions on Information Theory. 55 (9): 4299–4308. doi:10.1109/tit.2009.2025528. S2CID 15490427.
  12. Needell, D.; Tropp, J. A. (2009). "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples". Applied and Computational Harmonic Analysis. 26 (3): 301–321. arXiv:0803.2392. doi:10.1016/j.acha.2008.07.002. S2CID 1642637.
  13. Lotfi, M.; Vidyasagar, M. "A Fast Noniterative Algorithm for Compressive Sensing Using Binary Measurement Matrices".
  14. Schonhoff, T. A.; Giordano, A. A. (2006). Detection and Estimation Theory and Its Applications. New Jersey: Pearson Education. ISBN 0-13-089499-0.
  15. Das, Abhranil; Geisler, Wilson (2021). "A method to integrate and classify normal distributions". Journal of Vision. 21 (10): 1. arXiv:2012.14331. doi:10.1167/jov.21.10.1. PMC 8419883. PMID 34468706.