Signal-to-noise statistic

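In mathematics, the signal-to-noise statistic distance between two vectors $a$ and $b$ with mean values $\mu_a$ and $\mu_b$ and standard deviations $\sigma_a$ and $\sigma_b$, respectively, is:

$$D_{sn} = \frac{\mu_a - \mu_b}{\sigma_a + \sigma_b}$$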

In the case of Gaussian-distributed data and unbiased class distributions, this statistic can be related to classification accuracy given an ideal linear discrimination, and a decision boundary can be derived. [1]
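
As a small illustration, the statistic for two samples can be sketched as below. This assumes the statistic is the difference of the two means divided by the sum of the two standard deviations, and it assumes the population form of the standard deviation, which the text does not specify. The function name `signal_to_noise` and the sample values are hypothetical.

```python
import statistics


def signal_to_noise(a, b):
    """Return (mean(a) - mean(b)) / (std(a) + std(b)) for two samples."""
    mu_a, mu_b = statistics.fmean(a), statistics.fmean(b)
    # Assumption: population standard deviation (divide by n). Use
    # statistics.stdev instead if the sample (n - 1) form is intended.
    sd_a, sd_b = statistics.pstdev(a), statistics.pstdev(b)
    denom = sd_a + sd_b
    if denom == 0:
        raise ValueError("both vectors are constant; the statistic is undefined")
    return (mu_a - mu_b) / denom


# Illustrative values only: a has mean 5 and deviation 2, b has mean 2 and deviation 1.
a = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
b = [1.0, 1.0, 3.0, 3.0]
print(signal_to_noise(a, b))  # (5 - 2) / (2 + 1)
```

The statistic is signed: swapping the two vectors flips the sign, so the absolute value is what indicates how well separated the two samples are.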

This distance is frequently used to identify vectors that differ significantly. One use is in bioinformatics, to locate genes that are differentially expressed in microarray experiments. [2] [3] [4]
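
A gene-ranking use of this kind might look like the following sketch. It is not the procedure of any cited paper. It assumes the statistic is the difference of class means divided by the sum of class standard deviations (population form), and the gene names, expression values and function name `rank_genes_by_snr` are all hypothetical.

```python
from statistics import fmean, pstdev


def rank_genes_by_snr(expression, labels):
    """Rank genes by the absolute signal-to-noise statistic between two classes.

    expression: dict mapping gene name -> list of values, one per sample.
    labels: list of 0/1 class labels, one per sample.
    Returns (gene names sorted by decreasing |score|, dict of signed scores).
    """
    scores = {}
    for gene, values in expression.items():
        class_a = [v for v, y in zip(values, labels) if y == 0]
        class_b = [v for v, y in zip(values, labels) if y == 1]
        denom = pstdev(class_a) + pstdev(class_b)
        if denom == 0:
            continue  # statistic undefined when neither class varies
        scores[gene] = (fmean(class_a) - fmean(class_b)) / denom
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return ranked, scores


# Hypothetical expression values for three genes over four samples.
expression = {
    "gene_A": [1.0, 3.0, 7.0, 9.0],   # class means 2 and 8
    "gene_B": [5.0, 7.0, 4.0, 8.0],   # class means both 6
    "gene_C": [9.0, 11.0, 1.0, 3.0],  # class means 10 and 2
}
labels = [0, 0, 1, 1]
ranked, scores = rank_genes_by_snr(expression, labels)
print(ranked)
print(scores)
```

Ranking by absolute value treats over- and under-expression alike, while the sign of each score still shows which class has the higher mean.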

See also

Uniform norm

In mathematical analysis, the uniform norm assigns to real- or complex-valued bounded functions $f$ defined on a set $S$ the non-negative number $\|f\|_\infty = \sup\{|f(x)| : x \in S\}$.

Signal-to-noise ratio

Signal-to-noise ratio (SNR) is a measure used in science and engineering that compares the level of a desired signal to the level of background noise. SNR is defined as the ratio of signal power to noise power, often expressed in decibels. A ratio higher than 1:1 indicates more signal than noise.

Notes

  1. Auffarth, B., Lopez, M., Cerquides, J. (2010). Comparison of redundancy and relevance measures for feature selection in tissue classification of CT images. Advances in Data Mining: Applications and Theoretical Aspects, pp. 248–262. Springer.
  2. Golub, T.R. et al. (1999). Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science 286, 531–537.
  3. Slonim, D.K. et al. (2000). Class Prediction and Discovery Using Gene Expression Data. Proceedings of the Fourth Annual International Conference on Computational Molecular Biology, Tokyo, Japan, April 8–11, pp. 263–272.
  4. Pomeroy, S.L. et al. (2002). Gene Expression-Based Classification and Outcome Prediction of Central Nervous System Embryonal Tumors. Nature 415, 436–442.

