| Dirk P. Kroese | |
|---|---|
| Born | 1963 (age 60–61) |
| Scientific career | |
| Fields | Mathematics and statistics |
| Institutions | The University of Queensland |
| Thesis | Stochastic Models in Reliability (1990) |
Dirk Pieter Kroese (born 1963) is a Dutch-Australian mathematician and statistician, and Professor at the University of Queensland. He is known for several contributions to applied probability, kernel density estimation, Monte Carlo methods and rare-event simulation. He is, with Reuven Rubinstein, a pioneer of the Cross-Entropy (CE) method.
Born in Wapenveld (municipality of Heerde), Dirk Kroese received his MSc (the Dutch ingenieur (ir) degree) in 1986 and his PhD (cum laude) in 1990, both from the Department of Applied Mathematics at the University of Twente. His dissertation was entitled Stochastic Models in Reliability. His PhD advisors were Joseph H. A. de Smit and Wilbert C. M. Kallenberg.[1] Part of his PhD research was carried out at Princeton University under the guidance of Erhan Çinlar. He has held teaching and research positions at the University of Texas at Austin (1986), Princeton University (1988–1989), the University of Twente (1991–1998), the University of Melbourne (1997), and the University of Adelaide (1998–2000). Since 2000 he has been working at the University of Queensland, where he became a full professor in 2010.[2]
Kroese's work spans a wide range of topics in applied probability and mathematical statistics, including telecommunication networks, reliability engineering, point processes, kernel density estimation, Monte Carlo methods, rare-event simulation, cross-entropy methods, randomized optimization, and machine learning. He is a Chief Investigator of the Australian Research Council Centre of Excellence in Mathematical and Statistical Frontiers (ACEMS).[3] He has over 120 peer-reviewed publications,[4] including six monographs.[5]
In physics, statistical mechanics is a mathematical framework that applies statistical methods and probability theory to large assemblies of microscopic entities. It does not assume or postulate any natural laws, but explains the macroscopic behavior of nature from the behavior of such ensembles.
Monte Carlo methods, or Monte Carlo experiments, are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle. They are often used in physical and mathematical problems and are most useful when it is difficult or impossible to use other approaches. Monte Carlo methods are mainly used in three problem classes: optimization, numerical integration, and generating draws from a probability distribution.
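To make the idea concrete, here is a minimal Python sketch (assuming NumPy is available; all names are illustrative): it estimates π by drawing uniform points in the unit square and counting the fraction that land inside the quarter circle.

```python
import numpy as np

rng = np.random.default_rng(0)

# The quarter unit circle has area pi/4, so the fraction of uniform points
# in [0,1]^2 that fall inside it is an unbiased estimate of pi/4.
n = 1_000_000
pts = rng.random((n, 2))                 # uniform points in the unit square
inside = (pts ** 2).sum(axis=1) <= 1.0   # True if inside the quarter circle
print(4 * inside.mean())                 # ≈ 3.1416; error shrinks as O(n**-0.5)
```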
The principle of maximum entropy states that the probability distribution which best represents the current state of knowledge about a system is the one with largest entropy, in the context of precisely stated prior data.
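Stated formally (in generic notation, with entropy over a finite set and prior data given as expectation constraints):

$$
p^{*} \;=\; \operatorname*{arg\,max}_{p}\; \Bigl(-\sum_{i} p_i \log p_i\Bigr)
\quad \text{subject to} \quad \sum_{i} p_i = 1, \qquad \sum_{i} p_i\, f_k(i) = c_k .
$$

With no constraints beyond normalization the maximizer is the uniform distribution; with expectation constraints the solution takes the Gibbs form \(p_i \propto \exp\bigl(\sum_k \lambda_k f_k(i)\bigr)\).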
In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.
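A minimal sketch in Python of random-walk Metropolis, the simplest instance of Metropolis–Hastings (NumPy assumed; the target is supplied as an unnormalized log-density, and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def metropolis_hastings(log_target, x0=0.0, steps=10_000, step=1.0):
    """Random-walk Metropolis sampler for a 1-D unnormalized log-density."""
    x, out = x0, np.empty(steps)
    for i in range(steps):
        prop = x + rng.normal(0.0, step)        # symmetric proposal
        if np.log(rng.random()) < log_target(prop) - log_target(x):
            x = prop                            # accept; otherwise keep x
        out[i] = x
    return out

# Example: sample a standard normal from its log-density (up to a constant).
draws = metropolis_hastings(lambda x: -0.5 * x**2)
print(draws.mean(), draws.std())                # ≈ 0.0 and ≈ 1.0
```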
Importance sampling is a Monte Carlo method for evaluating properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. Its introduction in statistics is generally attributed to a paper by Teun Kloek and Herman K. van Dijk in 1978, but its precursors can be found in statistical physics as early as 1949. Importance sampling is also related to umbrella sampling in computational physics. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both.
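A sketch of the idea in Python (NumPy and SciPy assumed; the problem is chosen only for illustration): to estimate the rare probability P(X > 5) for X ~ N(0, 1), sample from a proposal shifted into the rare region and correct each draw with the likelihood ratio.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Target: p = P(X > 5) for X ~ N(0, 1), about 2.9e-7 -- far too rare for
# plain Monte Carlo at this sample size, which would almost surely return 0.
n = 100_000
y = rng.normal(5.0, 1.0, n)                # draws from the proposal N(5, 1)
w = norm.pdf(y, 0, 1) / norm.pdf(y, 5, 1)  # likelihood-ratio weights p/q
print(np.mean((y > 5.0) * w))              # ≈ 2.87e-7
```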
In information theory, the cross-entropy between two probability distributions p and q over the same underlying set of events measures the average number of bits needed to identify an event drawn from the set if a coding scheme used for the set is optimized for an estimated probability distribution q, rather than the true distribution p.
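For discrete distributions p and q this is

$$
H(p, q) = -\sum_{x} p(x) \log q(x),
$$

which decomposes as \(H(p, q) = H(p) + D_{\mathrm{KL}}(p \,\|\, q)\): the entropy of p plus the Kullback–Leibler divergence between p and q.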
In statistics, kernel density estimation (KDE) is the application of kernel smoothing for probability density estimation, i.e., a non-parametric method to estimate the probability density function of a random variable based on kernels as weights. KDE answers a fundamental data smoothing problem where inferences about the population are made, based on a finite data sample. In some fields such as signal processing and econometrics it is also termed the Parzen–Rosenblatt window method, after Emanuel Parzen and Murray Rosenblatt, who are usually credited with independently creating it in its current form. One of the famous applications of kernel density estimation is in estimating the class-conditional marginal densities of data when using a naive Bayes classifier, which can improve its prediction accuracy.
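A brief sketch using SciPy's gaussian_kde (the data and names are illustrative; SciPy's default bandwidth is Scott's rule):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Draws from a two-component normal mixture, then a Gaussian KDE fit.
data = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(2, 1.0, 500)])
kde = gaussian_kde(data)            # bandwidth set by Scott's rule (default)
grid = np.linspace(-5, 6, 200)
density = kde(grid)                 # estimated pdf evaluated on the grid
print(grid[np.argmax(density)])     # location of the higher estimated mode
```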
In statistics, resampling is the creation of new samples based on one observed sample. Common resampling methods include the bootstrap, the jackknife, cross-validation, and permutation tests.
The cross-entropy (CE) method is a Monte Carlo method for importance sampling and optimization. It is applicable to both combinatorial and continuous problems, with either a static or noisy objective.
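The following Python sketch (NumPy assumed; names and parameter choices are illustrative) applies the CE method to a continuous minimization problem. For a Gaussian sampling family, the cross-entropy parameter update reduces to refitting the mean and standard deviation to the elite samples:

```python
import numpy as np

rng = np.random.default_rng(0)

def ce_minimize(f, mu=0.0, sigma=10.0, n=200, elite_frac=0.1, iters=50):
    """Minimize a vectorized function f over the reals with the CE method."""
    n_elite = max(1, int(n * elite_frac))
    for _ in range(iters):
        x = rng.normal(mu, sigma, n)            # sample candidate solutions
        elite = x[np.argsort(f(x))[:n_elite]]   # keep the best performers
        mu, sigma = elite.mean(), elite.std() + 1e-12  # refit the sampler
    return mu

# Example: the minimum of (x - 3)^2 is at x = 3.
print(ce_minimize(lambda x: (x - 3.0) ** 2))    # ≈ 3.0
```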
Stochastic optimization (SO) methods are optimization methods that generate and use random variables. For stochastic problems, the random variables appear in the formulation of the optimization problem itself, through random objective functions or random constraints. SO methods also include methods with random iterates; some use random iterates to solve stochastic problems, combining both senses of the term. SO methods generalize deterministic methods for deterministic problems.
In probability and statistics, the truncated normal distribution is the probability distribution derived from that of a normally distributed random variable by bounding the random variable from either below or above. The truncated normal distribution has wide applications in statistics and econometrics.
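For a normal distribution with mean μ and standard deviation σ truncated to the interval [a, b], the density is

$$
f(x;\mu,\sigma,a,b) \;=\; \frac{\tfrac{1}{\sigma}\,\varphi\!\left(\tfrac{x-\mu}{\sigma}\right)}{\Phi\!\left(\tfrac{b-\mu}{\sigma}\right)-\Phi\!\left(\tfrac{a-\mu}{\sigma}\right)}, \qquad a \le x \le b,
$$

where φ and Φ denote the standard normal density and distribution function.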
Computational statistics, or statistical computing, is the interface between statistics and computer science: the statistical methods that are enabled by computational methods. It is the area of computational science specific to the mathematical science of statistics. The area is developing rapidly, leading to calls that a broader concept of computing should be taught as part of general statistical education.
Reuven Rubinstein (1938–2012) was an Israeli scientist known for his contributions to Monte Carlo simulation, applied probability, stochastic modeling, and stochastic optimization, having authored more than one hundred papers and six books.
In statistics, the antithetic variates method is a variance reduction technique used in Monte Carlo methods. Because the error of a Monte Carlo estimate decreases only as one over the square root of the number of samples, a very large number of sample paths is required to obtain an accurate result. The antithetic variates method reduces the variance of the simulation results by pairing each sample path with a negatively correlated counterpart.
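A small Python illustration (NumPy assumed; the integrand is chosen only for demonstration), estimating E[exp(U)] for U ~ Uniform(0, 1) with antithetic pairs (u, 1 − u):

```python
import numpy as np

rng = np.random.default_rng(0)

# Estimate E[exp(U)], U ~ Uniform(0, 1); the true value is e - 1 ≈ 1.71828.
n = 50_000
u = rng.random(n)
pairs = 0.5 * (np.exp(u) + np.exp(1.0 - u))   # antithetic pair averages
plain = np.exp(rng.random(2 * n))             # same budget, independent draws
print(pairs.mean(), plain.mean())             # both ≈ 1.71828
print(pairs.std() / np.sqrt(n))               # standard error, antithetic
print(plain.std() / np.sqrt(2 * n))           # standard error, plain MC
```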
In probability theory, reflected Brownian motion is a Wiener process in a space with reflecting boundaries. In the physical literature, this process describes diffusion in a confined space and is often called confined Brownian motion. For example, it can describe the motion of hard spheres in water confined between two walls.
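A minimal Euler-type simulation in Python (NumPy assumed; the step size and interval are illustrative), which reflects Brownian motion into [0, L] by folding any overshoot back across the nearest wall:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate Brownian motion reflected into [0, L]; folding one overshoot per
# step is adequate when the step size is small relative to L.
L, dt, steps = 1.0, 1e-4, 100_000
x = np.empty(steps)
x[0] = 0.5
for i in range(1, steps):
    y = x[i - 1] + np.sqrt(dt) * rng.standard_normal()
    if y < 0:
        y = -y               # reflect at the lower wall
    if y > L:
        y = 2 * L - y        # reflect at the upper wall
    x[i] = y
print(x.min(), x.max())      # the path stays within [0, L]
```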
Rare event sampling is an umbrella term for a group of computer simulation methods intended to selectively sample 'special' regions of a system's dynamic space that would rarely be visited by brute-force simulation. A familiar example of a rare event in this context is the nucleation of a raindrop from over-saturated water vapour: although raindrops form every day, relative to the length and time scales defined by the motion of water molecules in the vapour phase, the formation of a liquid droplet is extremely rare.
In probability theory, a Cauchy process is a type of stochastic process. There are symmetric and asymmetric forms of the Cauchy process. The unspecified term "Cauchy process" is often used to refer to the symmetric Cauchy process.
Subset simulation is a method used in reliability engineering to compute small failure probabilities encountered in engineering systems. The basic idea is to express a small failure probability as a product of larger conditional probabilities by introducing intermediate failure events. This conceptually converts the original rare-event problem into a series of frequent-event problems that are easier to solve. In the actual implementation, samples conditional on intermediate failure events are adaptively generated to gradually populate from the frequent to rare event region. These 'conditional samples' provide information for estimating the complementary cumulative distribution function (CCDF) of the quantity of interest, covering the high as well as the low probability regions. They can also be used for investigating the cause and consequence of failure events. The generation of conditional samples is not trivial but can be performed efficiently using Markov chain Monte Carlo (MCMC).
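The following toy sketch in Python (NumPy assumed; the one-dimensional setting and all parameter choices are illustrative, not a production implementation) estimates P(X > 4) for X ~ N(0, 1). Intermediate levels are chosen as empirical quantiles, and conditional samples are generated by a random-walk Metropolis chain restricted to each intermediate failure region:

```python
import numpy as np

rng = np.random.default_rng(0)

def subset_simulation(t=4.0, n=1000, p0=0.1):
    """Toy 1-D subset simulation: estimate P(X > t) for X ~ N(0, 1)."""
    x = rng.standard_normal(n)              # level 0: plain Monte Carlo
    prob = 1.0
    for _ in range(50):                     # safety cap on the number of levels
        b = np.quantile(x, 1 - p0)          # intermediate failure threshold
        if b >= t:                          # final level reached
            return prob * np.mean(x > t)
        prob *= np.mean(x > b)              # conditional level probability ≈ p0
        seeds = x[x > b]                    # chain seeds in the failure region
        steps = int(np.ceil(n / seeds.size))
        out = []
        for cur in seeds:                   # Metropolis chain on N(0,1) | X > b
            for _ in range(steps):
                prop = cur + rng.normal(0.0, 1.0)
                if prop > b and rng.random() < np.exp(0.5 * (cur**2 - prop**2)):
                    cur = prop
                out.append(cur)
        x = np.asarray(out[:n])
    return prob * np.mean(x > t)

print(subset_simulation())                  # true value ≈ 3.17e-5
```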