Sensitivity analysis


Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be divided and allocated to different sources of uncertainty in its inputs. [1] [2] A related practice is uncertainty analysis, which has a greater focus on uncertainty quantification and propagation of uncertainty; ideally, uncertainty and sensitivity analysis should be run in tandem.

The process of recalculating outcomes under alternative assumptions to determine the impact of a variable can be useful for a range of purposes, [3] from testing the robustness of model results to identifying the inputs that contribute most to output uncertainty.

Overview

A mathematical model (for example, a climate model, an economic model, or a finite element model in engineering) can be highly complex, and as a result the relationships between its inputs and outputs may be poorly understood. In such cases, the model can be viewed as a black box, i.e. the output is an "opaque" function of its inputs.


Quite often, some or all of the model inputs are subject to sources of uncertainty, including errors of measurement, absence of information and poor or partial understanding of the driving forces and mechanisms. This uncertainty imposes a limit on our confidence in the response or output of the model. Further, models may have to cope with the natural intrinsic variability of the system (aleatory), such as the occurrence of stochastic events. [7]



Good modeling practice requires that the modeler provide an evaluation of the confidence in the model. This requires, first, a quantification of the uncertainty in any model results (uncertainty analysis); and second, an evaluation of how much each input is contributing to the output uncertainty. Sensitivity analysis addresses the second of these issues (although uncertainty analysis is usually a necessary precursor), performing the role of ordering by importance the strength and relevance of the inputs in determining the variation in the output. [2]


In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance. National and international agencies involved in impact assessment studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the European Commission (see e.g. the guidelines for impact assessment), [8] the White House Office of Management and Budget, the Intergovernmental Panel on Climate Change and US Environmental Protection Agency's modelling guidelines. [9]

Settings and constraints

The choice of method of sensitivity analysis is typically dictated by a number of problem constraints or settings; among the most common are the computational expense of running the model, the dimensionality of the input space, and whether the analysis must rely on "given data" rather than a purpose-designed sample.

Computational expense is a problem in many practical sensitivity analyses. Some methods of reducing computational expense include the use of emulators (for large models) and screening methods (for reducing the dimensionality of the problem). Another approach is to use an event-based sensitivity analysis method for variable selection in time-constrained applications. [11] This is an input variable selection (IVS) method that assembles information about the trace of changes in system inputs and outputs using sensitivity analysis, producing an input/output trigger/event matrix designed to map the relationships between input data, as causes that trigger events, and output data, which describes the actual events. The cause-effect relationship between the input variables (the causes of state change) and the system output parameters (the effects) determines which set of inputs has a genuine impact on a given output. The method has a clear advantage over analytical and computational IVS methods, since it tries to understand and interpret system state changes in the shortest possible time with minimum computational overhead. [11] [12]

Core methodology

Ideal scheme of a possibly sampling-based sensitivity analysis: uncertainty arising from different sources – errors in the data, the parameter estimation procedure, alternative model structures – is propagated through the model for uncertainty analysis, and its relative importance is quantified via sensitivity analysis.

Sampling-based sensitivity analysis by scatterplots: Y (vertical axis) is a function of four factors. The points in the four scatterplots are the same but sorted differently, i.e. by Z1, Z2, Z3 and Z4 in turn. Note that the abscissa differs between plots: (−5, +5) for Z1, (−8, +8) for Z2, (−10, +10) for Z3 and Z4. Z4 is the most important in influencing Y, as it imparts more 'shape' on Y.

There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above. [2] They are also distinguished by the type of sensitivity measure, be it based on (for example) variance decompositions, partial derivatives or elementary effects. In general, however, most procedures adhere to the following outline:

  1. Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult and many methods exist to elicit uncertainty distributions from subjective data. [15]
  2. Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
  3. Run the model a number of times using some design of experiments, [16] dictated by the method of choice and the input uncertainty.
  4. Using the resulting model outputs, calculate the sensitivity measures of interest.

In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis.
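
As a concrete illustration of this outline, the sketch below runs all four steps in Python on an invented three-input toy model (the function, the uniform input ranges and the sample size are arbitrary choices made for the example, not part of any standard method). The sensitivity measure used here is a plain sample correlation, which is deliberately crude: it misses the purely nonlinear effect of the second input and the pure interaction involving the third, which is one motivation for the more refined measures discussed below.

    import numpy as np

    rng = np.random.default_rng(0)

    def model(x):
        # invented toy model: linear in X1, purely nonlinear in X2, pure interaction with X3
        return x[:, 0] + 2.0 * x[:, 1] ** 2 + x[:, 0] * x[:, 2]

    n_inputs, n_samples = 3, 10_000

    # Step 1: quantify input uncertainty (here: independent uniform inputs on [-1, 1])
    # Step 3: run the model over a simple random design of experiments
    X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    Y = model(X)                    # Step 2: the model output to be analysed

    # Step 4: compute a (deliberately crude) sensitivity measure for each input
    for i in range(n_inputs):
        c = np.corrcoef(X[:, i], Y)[0, 1]
        print(f"corr(Y, X{i + 1}) = {c:+.3f}")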

The various types of "core methods" (discussed below) are distinguished by the sensitivity measures which are calculated; these categories can overlap. Alternative ways of obtaining these measures, subject to the constraints of the problem, can be given.

One-at-a-time (OAT/OFAT)

One of the simplest and most common approaches is that of changing one-factor-at-a-time (OFAT or OAT), to see what effect this produces on the output. [17] [18] [19] OAT customarily involves moving one input variable while keeping the others at their baseline (nominal) values, then returning the variable to its nominal value before repeating the procedure for each of the other inputs in the same way.

Sensitivity may then be measured by monitoring changes in the output, e.g. by partial derivatives or linear regression. This appears to be a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed to their central or baseline values. This increases the comparability of the results (all 'effects' are computed with reference to the same central point in space) and minimizes the chance of computer programme crashes, which are more likely when several input factors are changed simultaneously. OAT is frequently preferred by modellers for practical reasons: if the model fails under an OAT analysis, the modeller immediately knows which input factor is responsible for the failure. [13]

Despite its simplicity however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of interactions between input variables. [20]
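
To make this concrete, here is a small hedged sketch (the three-input model, baseline and step size are all invented for the example) that perturbs each input in turn from a common baseline and reports the resulting finite-difference effect; the interaction term in the toy model is exactly the kind of feature that such an analysis cannot reveal.

    import numpy as np

    def model(x):
        # invented model; the x[0] * x[2] term is an interaction that OAT cannot reveal
        return x[0] ** 2 + 2.0 * x[1] + x[0] * x[2]

    baseline = np.array([1.0, 1.0, 1.0])   # central (nominal) values of the inputs
    step = 0.1                             # perturbation applied to one input at a time
    y0 = model(baseline)

    for i in range(len(baseline)):
        x = baseline.copy()
        x[i] += step                       # move only input i, keep the others at baseline
        print(f"OAT effect of X{i + 1}: {(model(x) - y0) / step:+.3f}")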

Local methods

Local methods involve taking the partial derivative of the output Y with respect to an input factor Xi:

    ( ∂Y / ∂Xi ) evaluated at X0

where the subscript X0 indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling [21] [22] and automatic differentiation [23] are methods in this class. Similar to OAT/OFAT, local methods do not attempt to fully explore the input space, since they examine only small perturbations, typically one variable at a time.

Scatter plots

A simple but useful tool is to plot scatter plots of the output variable against individual input variables, after (randomly) sampling the model over its input distributions. The advantage of this approach is that it can also deal with "given data", i.e., a set of arbitrarily-placed data points, and gives a direct visual indication of sensitivity. Quantitative measures can also be drawn, for example by measuring the correlation between Y and Xi, or even by estimating variance-based measures by nonlinear regression. [14]
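
The sketch below (assuming NumPy and matplotlib are available; the two-input model is invented for illustration) samples the inputs randomly, evaluates the model, and draws one scatter plot per input, annotated with the sample correlation as a simple quantitative companion to the visual impression.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)

    X = rng.uniform(-1.0, 1.0, size=(2_000, 2))                      # sampled inputs
    Y = X[:, 0] + 0.2 * X[:, 1] ** 3 + rng.normal(0.0, 0.05, 2_000)  # invented model plus noise

    fig, axes = plt.subplots(1, 2, figsize=(8, 3), sharey=True)
    for i, ax in enumerate(axes):
        c = np.corrcoef(X[:, i], Y)[0, 1]
        ax.scatter(X[:, i], Y, s=4, alpha=0.3)
        ax.set_xlabel(f"X{i + 1}")
        ax.set_title(f"corr(Y, X{i + 1}) = {c:+.2f}")
    axes[0].set_ylabel("Y")
    plt.tight_layout()
    plt.show()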

Regression analysis

Regression analysis, in the context of sensitivity analysis, involves fitting a linear regression to the model response and using standardized regression coefficients as direct measures of sensitivity. The regression is required to be linear with respect to the data (i.e. a hyperplane, hence with no quadratic terms, etc., as regressors) because otherwise it is difficult to interpret the standardised coefficients. This method is therefore most suitable when the model response is in fact linear; linearity can be confirmed, for instance, if the coefficient of determination is large. The advantages of regression analysis are that it is simple and has a low computational cost.
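
A minimal sketch of this idea follows, using an invented near-linear model whose inputs have deliberately different spreads: the raw regression coefficients are not comparable across inputs, whereas the standardised coefficients are, and for an additive linear model the sum of their squares approximates the coefficient of determination.

    import numpy as np

    rng = np.random.default_rng(2)

    # Illustrative near-linear model with inputs of very different spreads
    X = rng.normal(size=(5_000, 3)) * np.array([1.0, 2.0, 4.0])
    Y = 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0.0, 0.1, 5_000)

    # Ordinary least squares fit of Y on [1, X]
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    b = coef[1:]

    # Standardised regression coefficients: SRC_i = b_i * std(X_i) / std(Y)
    src = b * X.std(axis=0) / Y.std()
    print("raw coefficients:         ", np.round(b, 3))
    print("standardised coefficients:", np.round(src, 3))
    # For an additive linear model with independent inputs, sum of squared SRCs ~ R^2
    print("sum of squared SRCs:      ", round(float(np.sum(src ** 2)), 3))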

Variance-based methods

Variance-based methods [24] [25] [26] are a class of probabilistic approaches which quantify the input and output uncertainties as probability distributions, and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input. These can be expressed as conditional expectations, i.e., considering a model Y = f(X) for X = {X1, X2, ..., Xk}, a measure of sensitivity of the ith variable Xi is given as

    Var_{Xi} [ E_{X~i} ( Y | Xi ) ]

where "Var" and "E" denote the variance and expected value operators respectively, and X~i denotes the set of all input variables except Xi. This expression essentially measures the contribution of Xi alone to the uncertainty (variance) in Y (averaged over variations in the other variables), and is known as the first-order sensitivity index or main effect index. Importantly, it does not measure the uncertainty caused by interactions with other variables. A further measure, known as the total effect index, gives the total variance in Y caused by Xi and its interactions with any of the other input variables. Both quantities are typically standardised by dividing by Var(Y).

Variance-based methods allow full exploration of the input space, accounting for interactions and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of Monte Carlo methods, but since this can involve many thousands of model runs, other methods (such as emulators) can be used to reduce computational expense when necessary. Note that full variance decompositions are only meaningful when the input factors are independent of one another. [27]
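
As an illustration, the sketch below estimates first-order and total-effect indices by plain Monte Carlo, using the common pick-freeze estimators (Saltelli for the main effect, Jansen for the total effect) on the Ishigami test function; the sample size is an arbitrary choice for the example.

    import numpy as np

    rng = np.random.default_rng(3)

    def model(x):
        # Ishigami test function, a standard benchmark for sensitivity analysis
        return np.sin(x[:, 0]) + 7.0 * np.sin(x[:, 1]) ** 2 + 0.1 * x[:, 2] ** 4 * np.sin(x[:, 0])

    k, N = 3, 20_000
    A = rng.uniform(-np.pi, np.pi, size=(N, k))   # two independent sample matrices
    B = rng.uniform(-np.pi, np.pi, size=(N, k))
    fA, fB = model(A), model(B)
    var_Y = np.var(np.concatenate([fA, fB]))

    for i in range(k):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                             # column i from B, the rest from A
        fABi = model(ABi)
        S_i = np.mean(fB * (fABi - fA)) / var_Y         # first-order (main effect) index
        ST_i = 0.5 * np.mean((fA - fABi) ** 2) / var_Y  # total-effect index (Jansen)
        print(f"X{i + 1}: first-order = {S_i:.2f}, total effect = {ST_i:.2f}")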

Variogram-based methods

Variogram analysis of response surfaces (VARS)

One of the major shortcomings of the previous sensitivity analysis methods is that none of them considers the spatially ordered structure of the response surface/output of the model Y=f(X) in the parameter space. By utilizing the concepts of directional variograms and covariograms, variogram analysis of response surfaces (VARS) addresses this weakness by recognizing a spatially continuous correlation structure in the values of Y, and hence also in the values of its derivatives. [28] [29]

Basically, the higher the variability, the more heterogeneous the response surface is along a particular direction/parameter, at a specific perturbation scale. Accordingly, in the VARS framework, the values of directional variograms for a given perturbation scale can be considered as a comprehensive illustration of sensitivity information, through linking variogram analysis to both direction and perturbation-scale concepts. As a result, the VARS framework accounts for the fact that sensitivity is a scale-dependent concept, and thus overcomes the scale issue of traditional sensitivity analysis methods. [30] More importantly, VARS is able to provide relatively stable and statistically robust estimates of parameter sensitivity with much lower computational cost than other strategies (about two orders of magnitude more efficient). [31] Notably, it has been shown that there is a theoretical link between the VARS framework and the variance-based and derivative-based approaches.

Screening

Screening is a particular instance of a sampling-based method. The objective here is to identify which input variables contribute significantly to the output uncertainty in high-dimensionality models, rather than to quantify sensitivity exactly (i.e. in terms of variance). Screening tends to have a relatively low computational cost when compared to other approaches, and can be used in a preliminary analysis to weed out uninfluential variables before applying a more informative analysis to the remaining set. One of the most commonly used screening methods is the elementary effect method. [32] [33]
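
The sketch below is a simplified radial-design variant of the elementary effect idea, not the full Morris trajectory design: elementary effects are computed at a number of random base points, and their mean absolute value (overall influence) and standard deviation (an indication of nonlinearity or interactions) serve as screening measures.

    import numpy as np

    rng = np.random.default_rng(4)

    def model(x):
        # invented model: X3 is nearly uninfluential, X4 acts only through an interaction
        return x[0] ** 2 + 0.5 * x[1] + 0.01 * x[2] + 5.0 * x[0] * x[3]

    k, r, delta = 4, 50, 0.1
    EE = np.zeros((r, k))

    for j in range(r):
        base = rng.uniform(0.0, 1.0 - delta, size=k)   # keep base + delta inside [0, 1]
        y0 = model(base)
        for i in range(k):
            x = base.copy()
            x[i] += delta
            EE[j, i] = (model(x) - y0) / delta         # one elementary effect per input

    mu_star = np.abs(EE).mean(axis=0)   # mean absolute effect: overall influence
    sigma = EE.std(axis=0)              # spread: nonlinearity and/or interactions
    for i in range(k):
        print(f"X{i + 1}: mu* = {mu_star[i]:.3f}, sigma = {sigma[i]:.3f}")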

Alternative methods

A number of methods have been developed to overcome some of the constraints discussed above, which would otherwise make the estimation of sensitivity measures infeasible (most often due to computational expense). Generally, these methods focus on efficiently calculating variance-based measures of sensitivity.

Emulators

Emulators (also known as metamodels, surrogate models or response surfaces) are data-modeling/machine learning approaches that involve building a relatively simple mathematical function, known as an emulator, that approximates the input/output behaviour of the model itself. [34] In other words, it is the concept of "modelling a model" (hence the name "metamodel"). The idea is that, although computer models may be a very complex series of equations that can take a long time to solve, they can always be regarded as a function of their inputs Y=f(X). By running the model at a number of points in the input space, it may be possible to fit a much simpler emulator η(X), such that η(X)≈f(X) to within an acceptable margin of error. Then, sensitivity measures can be calculated from the emulator (either with Monte Carlo or analytically), which will have a negligible additional computational cost. Importantly, the number of model runs required to fit the emulator can be orders of magnitude less than the number of runs required to directly estimate the sensitivity measures from the model. [35]

Clearly the crux of an emulator approach is to find an η (emulator) that is a sufficiently close approximation to the model f. This requires the following steps,

  1. Sampling (running) the model at a number of points in its input space. This requires a sample design.
  2. Selecting a type of emulator (mathematical function) to use.
  3. "Training" the emulator using the sample data from the model – this generally involves adjusting the emulator parameters until the emulator mimics the true model as well as possible.

Sampling the model can often be done with low-discrepancy sequences, such as the Sobol sequence (due to the mathematician Ilya M. Sobol) or Latin hypercube sampling, although random designs can also be used, at the loss of some efficiency. The selection of the emulator type and the training are intrinsically linked, since the training method will depend on the class of emulator. Some types of emulators that have been used successfully for sensitivity analysis include Gaussian processes [35] [36] [37], polynomial chaos expansions [38] and smoothing splines. [39]

The use of an emulator introduces a machine learning problem, which can be difficult if the response of the model is highly nonlinear. In all cases it is useful to check the accuracy of the emulator, for example using cross-validation.
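
A hedged sketch of such a workflow follows, assuming scikit-learn is available: a Gaussian process emulator is trained on a small number of runs of an invented "expensive" model, after which sensitivity measures can be evaluated on the emulator at negligible cost.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    rng = np.random.default_rng(5)

    def expensive_model(x):
        # stand-in for a costly simulator; invented for illustration
        return np.sin(3.0 * x[:, 0]) + 0.5 * x[:, 1] ** 2

    # 1. A small training design (random here; a low-discrepancy design would be better)
    X_train = rng.uniform(-1.0, 1.0, size=(60, 2))
    y_train = expensive_model(X_train)

    # 2.-3. Choose an emulator type and train it on the model runs
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6, normalize_y=True)
    gp.fit(X_train, y_train)

    # Sensitivity measures are then evaluated on the cheap emulator, not the model
    X_big = rng.uniform(-1.0, 1.0, size=(50_000, 2))
    y_hat = gp.predict(X_big)
    for i in range(2):
        c = np.corrcoef(X_big[:, i], y_hat)[0, 1]
        print(f"corr(emulator output, X{i + 1}) = {c:+.3f}")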

High-dimensional model representations (HDMR)

A high-dimensional model representation (HDMR) [40] [41] (the term is due to H. Rabitz [42]) is essentially an emulator approach, which involves decomposing the function output into a linear combination of input terms and interactions of increasing dimensionality. The HDMR approach exploits the fact that the model can usually be well approximated by neglecting higher-order interactions (second- or third-order and above). The terms in the truncated series can then each be approximated by, e.g., polynomials or splines, and the response expressed as the sum of the main effects and interactions up to the truncation order. From this perspective, HDMRs can be seen as emulators which neglect high-order interactions; the advantage is that they are able to emulate models with higher dimensionality than full-order emulators.

Fourier amplitude sensitivity test (FAST)

The Fourier amplitude sensitivity test (FAST) uses the Fourier series to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.

Other

Methods based on Monte Carlo filtering. [43] [44] These are also sampling-based and the objective here is to identify regions in the space of the input factors corresponding to particular values (e.g. high or low) of the output.
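
A minimal sketch of Monte Carlo filtering, assuming SciPy is available and using an invented model and an arbitrary definition of the "behavioural" region: the input samples that produce high output are compared with the remaining samples via a two-sample Kolmogorov–Smirnov test, and a large statistic flags an influential input.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(6)

    def model(x):
        # invented model: the output is driven almost entirely by the first input
        return x[:, 0] ** 2 + 0.1 * x[:, 1]

    X = rng.uniform(-1.0, 1.0, size=(10_000, 2))
    Y = model(X)
    behavioural = Y > np.quantile(Y, 0.9)   # region of interest: the top 10% of outputs

    for i in range(2):
        res = ks_2samp(X[behavioural, i], X[~behavioural, i])
        print(f"X{i + 1}: KS statistic = {res.statistic:.3f} (p-value = {res.pvalue:.1e})")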

Other issues

Assumptions vs. inferences

In uncertainty and sensitivity analysis there is a crucial trade off between how scrupulous an analyst is in exploring the input assumptions and how wide the resulting inference may be. The point is well illustrated by the econometrician Edward E. Leamer: [45] [46]

I have proposed a form of organized sensitivity analysis that I call 'global sensitivity analysis' in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful.

Note Leamer's emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when uncertainties in inputs must be suppressed lest outputs become indeterminate. [47]

Pitfalls and difficulties

Common difficulties in sensitivity analysis include the computational expense of the many model runs required and the difficulty of eliciting credible uncertainty distributions for the model inputs.

Applications

Some examples of sensitivity analyses performed in various disciplines follow here.

Environmental

Environmental computer models are increasingly used in a wide variety of studies and applications. For example, global climate models are used for both short-term weather forecasts and long-term climate change. Moreover, computer models are increasingly used for environmental decision-making at a local scale, for example for assessing the impact of a waste water treatment plant on a river flow, or for assessing the behavior and life-length of bio-filters for contaminated waste water. Similarly, sensitivity analyses can be used to evaluate the effect of demographic and environmental variables on the viability of populations, in order to guide wildlife management. [48]

In all of these cases, sensitivity analysis may help to understand the contribution of the various sources of uncertainty to the model output uncertainty and to the system performance in general. Depending on model complexity, different sampling strategies may be advisable, and traditional sensitivity indices may have to be generalized to cover multiple model outputs, [49] heteroskedastic effects and correlated inputs. [12]

Business

In a decision problem, the analyst may want to identify cost drivers as well as other quantities for which better knowledge is needed in order to make an informed decision. On the other hand, some quantities have no influence on the predictions, so that resources can be saved, at no loss in accuracy, by relaxing some of the conditions. See Corporate finance: Quantifying uncertainty. In addition to the general motivations listed above, sensitivity analysis can help in a variety of other circumstances specific to business.

However, there are also some problems associated with sensitivity analysis in the business context; in particular, input variables are often interdependent, which makes examining each one in isolation unrealistic.

Social sciences

Sensitivity analysis is common practice in social sciences. A famous early example is Mroz (1987), who analysed econometric models of female labor market participation. [50]

In modern econometrics the use of sensitivity analysis to anticipate criticism is the subject of one of Peter Kennedy's "ten commandments of applied econometrics": [51]

Thou shall confess in the presence of sensitivity. Corollary: Thou shall anticipate criticism [...] When reporting a sensitivity analysis, researchers should explain fully their specification search so that the readers can judge for themselves how the results may have been affected. This is basically an 'honesty is the best policy' approach, advocated by Leamer (1978). [52]

Sensitivity analysis can also be used in model-based policy assessment studies. [53] Sensitivity analysis can be used to assess the robustness of composite indicators, [54] also known as indices, such as the Environmental Performance Index.

Chemistry

Sensitivity analysis is common in many areas of physics and chemistry. [55]

With the accumulation of knowledge about the kinetic mechanisms under investigation and with advances in the power of modern computing technologies, detailed complex kinetic models are increasingly used as predictive tools and as aids for understanding the underlying phenomena. A kinetic model is usually described by a set of differential equations representing the concentration–time relationship. Sensitivity analysis has proven to be a powerful tool for investigating complex kinetic models. [56] [57] [58]

Kinetic parameters are frequently determined from experimental data via nonlinear estimation. Sensitivity analysis can be used for optimal experimental design, e.g. determining initial conditions, measurement positions, and sampling time, to generate informative data which are critical to estimation accuracy. A great number of parameters in a complex model can be candidates for estimation but not all are estimable. [58] Sensitivity analysis can be used to identify the influential parameters which can be determined from available data while screening out the unimportant ones. Sensitivity analysis can also be used to identify the redundant species and reactions allowing model reduction.
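
As a small illustration of this kind of analysis (assuming SciPy is available; the single first-order reaction and its rate constant are invented for the example), the local sensitivity of a concentration profile to a rate constant can be approximated by a central-difference perturbation of the parameter:

    import numpy as np
    from scipy.integrate import solve_ivp

    def concentrations(k, t_eval):
        # First-order reaction A -> B: dA/dt = -k*A, dB/dt = +k*A, with A(0)=1, B(0)=0
        sol = solve_ivp(lambda t, y: [-k * y[0], k * y[0]],
                        (0.0, 10.0), [1.0, 0.0], t_eval=t_eval,
                        rtol=1e-8, atol=1e-10)
        return sol.y                       # array of shape (2, len(t_eval))

    t = np.linspace(0.0, 10.0, 6)
    k0, dk = 0.3, 1e-3

    # Central-difference estimate of the local sensitivity d[A]/dk along the trajectory
    sens_A = (concentrations(k0 + dk, t)[0] - concentrations(k0 - dk, t)[0]) / (2.0 * dk)
    for ti, si in zip(t, sens_A):
        print(f"t = {ti:4.1f}   d[A]/dk = {si:+.3f}")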

Engineering

Modern engineering design makes extensive use of computer models to test designs before they are manufactured. Sensitivity analysis allows designers to assess the effects and sources of uncertainties, in the interest of building robust models. [59]

In meta-analysis

In a meta-analysis, a sensitivity analysis tests whether the results are sensitive to restrictions on the data included. Common examples are restricting the analysis to large trials only, higher-quality trials only, or more recent trials only. If the results are consistent, this provides stronger evidence of an effect and of generalizability. [60]

Multi-criteria decision making

Sometimes a sensitivity analysis may reveal surprising insights about the subject of interest. For instance, the field of multi-criteria decision making (MCDM) studies (among other topics) the problem of how to select the best alternative among a number of competing alternatives. This is an important task in decision making. In such a setting each alternative is described in terms of a set of evaluative criteria, and these criteria are associated with weights of importance. Intuitively, one may think that the larger the weight for a criterion is, the more critical that criterion should be. However, this may not be the case. It is important to distinguish here the notion of criticality from that of importance. By critical, we mean that a small change (as a percentage) in the weight of a criterion may cause a significant change in the final solution. It is possible for criteria with rather small weights of importance (i.e., ones that are not so important in that respect) to be much more critical in a given situation than ones with larger weights. [61] [62] That is, a sensitivity analysis may shed light on issues not anticipated at the beginning of a study. This, in turn, may dramatically improve the effectiveness of the initial study and assist in the successful implementation of the final solution.
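
A toy illustration of this point, in which all scores and weights are made up: each criterion weight is perturbed in turn and the top-ranked alternative is recomputed; with these particular numbers the ranking flips not only for the largest weight but also for the smallest one, i.e. the least 'important' criterion is nevertheless critical.

    import numpy as np

    # Invented decision matrix: rows = alternatives, columns = criteria scores
    scores = np.array([[0.50, 0.60, 0.90],    # alternative A
                       [0.68, 0.60, 0.40],    # alternative B
                       [0.40, 0.90, 0.30]])   # alternative C
    weights = np.array([0.5, 0.3, 0.2])       # criterion weights (illustrative)
    names = ["A", "B", "C"]

    def best(w):
        w = w / w.sum()                       # re-normalise after the perturbation
        return int(np.argmax(scores @ w))

    print("baseline best:", names[best(weights)])
    for j in range(len(weights)):
        for rel in (-0.20, +0.20):            # perturb one weight by +/- 20%
            w = weights.copy()
            w[j] *= 1.0 + rel
            print(f"weight {j + 1} changed by {rel:+.0%}: best = {names[best(w)]}")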

Time-critical decision making

Producing time-critical, accurate knowledge about the state of a system (effect) under computational and data-acquisition (cause) constraints is a major challenge, especially when the knowledge required is critical to system operation and the safety of operators or the integrity of costly equipment is at stake, e.g., during manufacturing or during environment substrate drilling. Understanding and interpreting a chain of interrelated events, predicted or unpredicted, that may or may not result in a specific state of the system, is the core challenge of this research. Sensitivity analysis may be used to identify which set of input data signals has a significant impact on the set of system state information (i.e. output). Through a cause-effect analysis technique, sensitivity analysis can be used to support the filtering of unsolicited data, reducing the communication and computational load on a standard supervisory control and data acquisition (SCADA) system. [12]

Model calibration and improvement

One application of sensitivity analysis addresses the question of "What's important to model or system development?" One can seek to identify important connections between observations, model inputs, and predictions or forecasts. That is, one can seek to understand which observations (measurements of dependent variables) are most and least important to the model inputs (parameters representing system characteristics or excitation), which model inputs are most and least important to the predictions or forecasts, and which observations are most and least important to the predictions and forecasts. Often the results are surprising, leading to the discovery of problems in the data or in the model development and to fixing those problems, which in turn leads to better models. [5] [6] In biomedical engineering, sensitivity analysis can be used to determine system dynamics in ODE-based kinetic models. Parameters corresponding to stages of differentiation can be varied to determine which parameter is most influential on cell fate. Hence the most limiting step can be identified and the cell state for the most advantageous scale-up and expansion can be determined. [63] Additionally, complex networks in systems biology can be better understood by fitting mass-action kinetic models. Sensitivity analysis on the rate coefficients can then be conducted to determine optimal therapeutic targets within the system of interest. [64]

Sensitivity auditing

It may happen that a sensitivity analysis of a model-based study is meant to underpin an inference and to certify its robustness, in a context where the inference feeds into a policy- or decision-making process. In these cases the framing of the analysis itself, its institutional context, and the motivations of its author may become a matter of great importance, and a pure sensitivity analysis – with its emphasis on parametric uncertainty – may be seen as insufficient. The emphasis on the framing may derive, inter alia, from the relevance of the policy study to different constituencies that are characterized by different norms and values, and hence by a different story about 'what the problem is' and, foremost, about 'who is telling the story'. Most often the framing includes more or less implicit assumptions, which range from the political (e.g. which group needs to be protected) all the way to the technical (e.g. which variable can be treated as a constant).

In order to take these concerns into due consideration, the instruments of sensitivity analysis have been extended to provide an assessment of the entire knowledge- and model-generating process. This approach has been called 'sensitivity auditing'. It takes inspiration from NUSAP, [65] a method used to qualify the worth of quantitative information by generating 'pedigrees' of numbers. Likewise, sensitivity auditing has been developed to provide pedigrees of models and model-based inferences. [66] Sensitivity auditing has been especially designed for an adversarial context, where not only the nature of the evidence, but also the degree of certainty and uncertainty associated with the evidence, will be the subject of partisan interests.

Sensitivity analysis is closely related to uncertainty analysis: while the latter studies the overall uncertainty in the conclusions of the study, sensitivity analysis tries to identify which source of uncertainty weighs more on the study's conclusions.

The problem setting in sensitivity analysis also has strong similarities with the field of design of experiments. In a design of experiments, one studies the effect of some process or intervention (the 'treatment') on some objects (the 'experimental units'). In sensitivity analysis one looks at the effect of varying the inputs of a mathematical model on the output of the model itself. In both disciplines one strives to obtain information from the system with a minimum of physical or numerical experiments.


References

  1. Saltelli, A. (2002). "Sensitivity Analysis for Importance Assessment". Risk Analysis. 22 (3): 1–12. CiteSeerX   10.1.1.194.7359 . doi:10.1111/0272-4332.00040.
  2. 1 2 3 Saltelli, A.; Ratto, M.; Andres, T.; Campolongo, F.; Cariboni, J.; Gatelli, D.; Saisana, M.; Tarantola, S. (2008). Global Sensitivity Analysis: The Primer. John Wiley & Sons.
  3. Pannell, D. J. (1997). "Sensitivity Analysis of Normative Economic Models: Theoretical Framework and Practical Strategies". Agricultural Economics. 16 (2): 139–152. doi:10.1016/S0169-5150(96)01217-0.
  4. Bahremand, A.; De Smedt, F. (2008). "Distributed Hydrological Modeling and Sensitivity Analysis in Torysa Watershed, Slovakia". Water Resources Management. 22 (3): 293–408. doi:10.1007/s11269-007-9168-x.
  5. 1 2 Hill, M.; Kavetski, D.; Clark, M.; Ye, M.; Arabi, M.; Lu, D.; Foglia, L.; Mehl, S. (2015). "Practical use of computationally frugal model analysis methods". Groundwater. 54 (2): 159–170. doi:10.1111/gwat.12330. PMID   25810333.
  6. 1 2 Hill, M.; Tiedeman, C. (2007). Effective Groundwater Model Calibration, with Analysis of Data, Sensitivities, Predictions, and Uncertainty. John Wiley & Sons.
  7. Der Kiureghian, A.; Ditlevsen, O. (2009). "Aleatory or epistemic? Does it matter?". Structural Safety. 31 (2): 105–112. doi:10.1016/j.strusafe.2008.06.020.
  8. http://ec.europa.eu/governance/impact/commission_guidelines/docs/iag_2009_en.pdf
  9. http://www.epa.gov/CREM/library/cred_guidance_0309.pdf
  10. Helton, J. C.; Johnson, J. D.; Salaberry, C. J.; Storlie, C. B. (2006). "Survey of sampling based methods for uncertainty and sensitivity analysis". Reliability Engineering and System Safety. 91 (10–11): 1175–1209. doi:10.1016/j.ress.2005.11.017.
  11. 1 2 Tavakoli, Siamak; Mousavi, Alireza (2013). "Event tracking for real-time unaware sensitivity analysis (EventTracker)". IEEE Transactions on Knowledge and Data Engineering. 25 (2): 348–359. doi:10.1109/tkde.2011.240.
  12. 1 2 3 Tavakoli, Siamak; Mousavi, Alireza; Poslad, Stefan (2013). "Input variable selection in time-critical knowledge integration applications: A review, analysis, and recommendation paper". Advanced Engineering Informatics. 27 (4): 519–536. doi:10.1016/j.aei.2013.06.002.
  13. 1 2 Saltelli, A.; Annoni, P. (2010). "How to avoid a perfunctory sensitivity analysis". Environmental Modeling and Software. 25 (12): 1508–1517. doi:10.1016/j.envsoft.2010.04.012.
  14. 1 2 Paruolo, P.; Saisana, M.; Saltelli, A. (2013). "Ratings and Rankings: Voodoo or Science?". Journal of the Royal Statistical Society, Series A . 176 (3): 609–634. arXiv: 1104.3009 . doi:10.1111/j.1467-985X.2012.01059.x.
  15. O'Hagan, A.; et al. (2006). Uncertain Judgements: Eliciting Experts' Probabilities. Chichester: Wiley. ISBN   9780470033302.
  16. Sacks, J.; Welch, W. J.; Mitchell, T. J.; Wynn, H. P. (1989). "Design and Analysis of Computer Experiments". Statistical Science. 4 (4): 409–435. doi:10.1214/ss/1177012413.
  17. Campbell, J.; et al. (2008). "Photosynthetic Control of Atmospheric Carbonyl Sulfide During the Growing Season". Science . 322 (5904): 1085–1088. Bibcode:2008Sci...322.1085C. doi:10.1126/science.1164015. PMID   19008442.
  18. Bailis, R.; Ezzati, M.; Kammen, D. (2005). "Mortality and Greenhouse Gas Impacts of Biomass and Petroleum Energy Futures in Africa". Science . 308 (5718): 98–103. Bibcode:2005Sci...308...98B. doi:10.1126/science.1106881. PMID   15802601.
  19. Murphy, J.; et al. (2004). "Quantification of modelling uncertainties in a large ensemble of climate change simulations". Nature . 430 (7001): 768–772. Bibcode:2004Natur.430..768M. doi:10.1038/nature02771. PMID   15306806.
  20. Czitrom (1999). "One-Factor-at-a-Time Versus Designed Experiments". American Statistician. 53 (2).
  21. Cacuci, Dan G. Sensitivity and Uncertainty Analysis: Theory. I. Chapman & Hall.
  22. Cacuci, Dan G.; Ionescu-Bujor, Mihaela; Navon, Michael (2005). Sensitivity and Uncertainty Analysis: Applications to Large-Scale Systems. II. Chapman & Hall.
  23. Griewank, A. (2000). Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation. SIAM.
  24. Sobol', I. (1990). "Sensitivity estimates for nonlinear mathematical models". Matematicheskoe Modelirovanie. 2: 112–118 (in Russian). English translation: Sobol', I. (1993). "Sensitivity analysis for non-linear mathematical models". Mathematical Modeling & Computational Experiment. 1: 407–414.
  25. Homma, T.; Saltelli, A. (1996). "Importance measures in global sensitivity analysis of nonlinear models". Reliability Engineering and System Safety. 52: 1–17. doi:10.1016/0951-8320(96)00002-6.
  26. Saltelli, A., K. Chan, and M. Scott (Eds.) (2000). Sensitivity Analysis. Wiley Series in Probability and Statistics. New York: John Wiley and Sons.
  27. Saltelli, A.; Tarantola, S. (2002). "On the relative importance of input factors in mathematical models: safety assessment for nuclear waste disposal". Journal of the American Statistical Association. 97 (459): 702–709. doi:10.1198/016214502388618447.
  28. Razavi, Saman; Gupta, Hoshin V. (1 January 2016). "A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory". Water Resources Research. 52 (1): 423–439. Bibcode:2016WRR....52..423R. doi:10.1002/2015WR017558. ISSN   1944-7973.
  29. Razavi, Saman; Gupta, Hoshin V. (1 January 2016). "A new framework for comprehensive, robust, and efficient global sensitivity analysis: 2. Application". Water Resources Research. 52 (1): 440–455. Bibcode:2016WRR....52..440R. doi:10.1002/2015WR017559. ISSN   1944-7973.
  30. Haghnegahdar, Amin; Razavi, Saman (1 September 2017). "Insights into sensitivity analysis of Earth and environmental systems models: On the impact of parameter perturbation scale". Environmental Modelling & Software. 95: 115–131. doi:10.1016/j.envsoft.2017.03.031.
  31. Gupta, H; Razavi, S (2016). "Challenges and Future Outlook of Sensitivity Analysis". In Petropoulos, George; Srivastava, Prashant (eds.). Sensitivity Analysis in Earth Observation Modelling (1st ed.). pp. 397–415. ISBN   9780128030318.
  32. Morris, M. D. (1991). "Factorial sampling plans for preliminary computational experiments". Technometrics. 33 (2): 161–174. CiteSeerX   10.1.1.584.521 . doi:10.2307/1269043. JSTOR   1269043.
  33. Campolongo, F.; Cariboni, J.; Saltelli, A. (2007). "An effective screening design for sensitivity analysis of large models". Environmental Modelling and Software. 22 (10): 1509–1518. doi:10.1016/j.envsoft.2006.10.004.
  34. 1 2 3 Storlie, C.B.; Swiler, L.P.; Helton, J.C.; Sallaberry, C.J. (2009). "Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models". Reliability Engineering & System Safety. 94 (11): 1735–1763. doi:10.1016/j.ress.2009.05.007.
  35. 1 2 Oakley, J.; O'Hagan, A. (2004). "Probabilistic sensitivity analysis of complex models: a Bayesian approach". J. Royal Stat. Soc. B. 66 (3): 751–769. CiteSeerX   10.1.1.6.9720 . doi:10.1111/j.1467-9868.2004.05304.x.
  36. Gramacy, R. B.; Taddy, M. A. (2010). "Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models" (PDF). Journal of Statistical Software. 33 (6). doi:10.18637/jss.v033.i06.
  37. Becker, W.; Worden, K.; Rowson, J. (2013). "Bayesian sensitivity analysis of bifurcating nonlinear models". Mechanical Systems and Signal Processing. 34 (1–2): 57–75. Bibcode:2013MSSP...34...57B. doi:10.1016/j.ymssp.2012.05.010.
  38. Sudret, B. (2008). "Global sensitivity analysis using polynomial chaos expansions". Reliability Engineering & System Safety. 93 (7): 964–979.
  39. Ratto, M.; Pagano, A. (2010). "Using recursive algorithms for the efficient identification of smoothing spline ANOVA models". AStA Advances in Statistical Analysis. 94 (4): 367–388. doi:10.1007/s10182-010-0148-8.
  40. Li, G.; Hu, J.; Wang, S.-W.; Georgopoulos, P.; Schoendorf, J.; Rabitz, H. (2006). "Random Sampling-High Dimensional Model Representation (RS-HDMR) and orthogonality of its different order component functions". Journal of Physical Chemistry A. 110 (7): 2474–2485. Bibcode:2006JPCA..110.2474L. doi:10.1021/jp054148m. PMID   16480307.
  41. Li, G.; Wang, S.-W.; Rabitz, H. (2002). "Practical approaches to construct RS-HDMR component functions". Journal of Physical Chemistry. 106: 8721–8733.
  42. Rabitz, H (1989). "System analysis at molecular scale". Science. 246 (4927): 221–226. Bibcode:1989Sci...246..221R. doi:10.1126/science.246.4927.221. PMID   17839016.
  43. Hornberger, G.; Spear, R. (1981). "An approach to the preliminary analysis of environmental systems". Journal of Environmental Management. 7: 7–18.
  44. Saltelli, A.; Tarantola, S.; Campolongo, F.; Ratto, M. (2004). Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. John Wiley and Sons.
  45. Leamer, Edward E. (1983). "Let's Take the Con Out of Econometrics". American Economic Review . 73 (1): 31–43. JSTOR   1803924.
  46. Leamer, Edward E. (1985). "Sensitivity Analyses Would Help". American Economic Review . 75 (3): 308–313. JSTOR   1814801.
  47. Ravetz, J.R., 2007, No-Nonsense Guide to Science, New Internationalist Publications Ltd.
  48. Manlik, O.; Lacy, R.C.; Sherwin, W.B. (2018). "Applicability and limitations of sensitivity analyses for wildlife management". Journal of Applied Ecology. 55 (3): 1430–1440. doi:10.1111/1365-2664.13044.
  49. Fassò, Alessandro (2006). "Sensitivity Analysis for Environmental Models and Monitoring Networks" (PDF). Preprint.
  50. Mroz, Thomas A. (1987). "The Sensitivity of an Empirical Model of Married Women's Hours of Work to Economic and Statistical Assumptions". Econometrica . 55 (4): 765–799. doi:10.2307/1911029. JSTOR   1911029.
  51. Kennedy, P. (2007). A Guide to Econometrics (Fifth ed.). Blackwell. ISBN   9780262611831.
  52. Leamer, E. (1978). Specification Searches: Ad Hoc Inferences with Nonexperimental Data. John Wiley & Sons, Ltd, p. vi.
  53. Saltelli, Andrea (2006) "The critique of modelling and sensitivity analysis in the scientific discourse: An overview of good practices" Archived 2011-07-20 at the Wayback Machine , Transatlantic Uncertainty Colloquium (TAUC) Washington, October 10–11
  54. Saisana, M.; Saltelli, A.; Tarantola, S. (2005). "Uncertainty and Sensitivity analysis techniques as tools for the quality assessment of composite indicators". Journal of the Royal Statistical Society, Series A . 168 (2): 307–323. doi:10.1111/j.1467-985x.2005.00350.x.
  55. Saltelli, A.; Ratto, M.; Tarantola, S.; Campolongo, F. (2005). "Sensitivity Analysis for Chemical Models". Chemical Reviews. 105 (7): 2811–2828. doi:10.1021/cr040659d. PMID   16011325.
  56. Rabitz, H.; Kramer, M.; Dacol, D. (1983). "Sensitivity Analysis in Chemical Kinetics". Annual Review of Physical Chemistry. 34: 419–461. Bibcode:1983ARPC...34..419R. doi:10.1146/annurev.pc.34.100183.002223.
  57. Turanyi, T. (1990). "Sensitivity analysis of complex kinetic systems. Tools and applications". Journal of Mathematical Chemistry. 5 (3): 203–248. doi:10.1007/BF01166355.
  58. 1 2 Komorowski, M.; Costa, M. J.; Rand, D. A.; Stumpf, M. P. H. (2011). "Sensitivity, robustness, and identifiability in stochastic chemical kinetics models". Proc Natl Acad Sci U S A . 108 (21): 8645–50. arXiv: 1104.1274 . Bibcode:2011PNAS..108.8645K. doi:10.1073/pnas.1015814108. PMC   3102369 . PMID   21551095.
  59. Becker, W.; Rowson, J.; Oakley, J.E.; Yoxall, A.; Manson, G.; Worden, K. (2011). "Bayesian sensitivity analysis of a model of the aortic valve". Journal of Biomechanics. 44 (8): 1499–506. doi:10.1016/j.jbiomech.2011.03.008. PMID   21481873.
  60. clinicalevidence.bmj.com > Glossary > sensitivity analysis Retrieved on June 21, 2010
  61. Triantaphyllou, E.; A. Sanchez (1997). "A Sensitivity Analysis Approach for Some Deterministic Multi-Criteria Decision-Making Methods". Decision Sciences. 28 (1): 151–194. doi:10.1111/j.1540-5915.1997.tb01306.x . Retrieved 2010-06-28.
  62. Triantaphyllou, E. (2000). Multi-Criteria Decision Making: A Comparative Study. Dordrecht, The Netherlands: Kluwer Academic Publishers (now Springer). p. 320. ISBN   978-0-7923-6607-2.
  63. Selekman JA, Das A, Grundl NJ, Palecek SP. Improving efficiency of human pluripotent stem cell differentiation platforms using an integrated experimental and computational approach. Biotechnol Bioeng. 2013;110(11):3024-37.
  64. Tian D, Solodin NM, Rajbhandari P, Bjorklund K, Alarid ET, Kreeger PK. A kinetic model identifies phosphorylated estrogen receptor-α (ERα) as a critical regulator of ERα dynamics in breast cancer. FASEB J. 2015;29(5):2022-31.
  65. Van der Sluijs, JP; Craye, M; Funtowicz, S; Kloprogge, P; Ravetz, J; Risbey, J (2005). "Combining quantitative and qualitative measures of uncertainty in model based environmental assessment: the NUSAP system". Risk Analysis. 25 (2): 481–492. doi:10.1111/j.1539-6924.2005.00604.x. PMID   15876219.
  66. Saltelli, A.; van der Sluijs, J.; Guimarães Pereira, Â.; Funtowicz, S.O. (2013). "What do I make of your Latinorum? Sensitivity auditing of mathematical modelling". International Journal of Foresight and Innovation Policy. 9 (2/3/4): 213–234.

Further reading