Sensitivity analysis

Sensitivity analysis is the study of how the uncertainty in the output of a mathematical model or system (numerical or otherwise) can be divided and allocated to different sources of uncertainty in its inputs. [1] [2] This involves estimating sensitivity indices that quantify the influence of an input or group of inputs on the output. A related practice is uncertainty analysis, which has a greater focus on uncertainty quantification and propagation of uncertainty; ideally, uncertainty and sensitivity analysis should be run in tandem.

Motivation

A mathematical model (for example in biology, climate change, economics, renewable energy or agronomy) can be highly complex, and as a result, the relationships between its inputs and outputs may be poorly understood. In such cases, the model can be viewed as a black box, i.e. the output is an "opaque" function of its inputs. Quite often, some or all of the model inputs are subject to sources of uncertainty, including errors of measurement, errors in input data, parameter estimation and approximation procedures, absence of information, poor or partial understanding of the driving forces and mechanisms, the choice of the model's underlying hypotheses, and so on. This uncertainty limits our confidence in the reliability of the model's response or output. Further, models may have to cope with the natural intrinsic variability of the system (aleatory uncertainty), such as the occurrence of stochastic events. [3]

In models involving many input variables, sensitivity analysis is an essential ingredient of model building and quality assurance, and can be used to determine the impact of an uncertain variable for a range of purposes. [4]

Mathematical formulation and vocabulary

Figure 1. Schematic representation of uncertainty analysis and sensitivity analysis. In mathematical modeling, uncertainty arises from a variety of sources: errors in input data, parameter estimation and approximation procedures, underlying hypotheses, choice of model, alternative model structures and so on. These uncertainties propagate through the model and have an impact on the output. The uncertainty in the output is described via uncertainty analysis (represented by the pdf of the output), and the relative importance of each source is quantified via sensitivity analysis (represented by pie charts showing the proportion that each source of uncertainty contributes to the total uncertainty of the output).

The object of study for sensitivity analysis is a function $f$ (called the "mathematical model" or "programming code"), viewed as a black box, with the $p$-dimensional input vector $X = (X_1, \ldots, X_p)$ and the output $Y$, presented as follows:

$$Y = f(X).$$

The variability in the input parameters $X_i$, $i = 1, \ldots, p$, has an impact on the output $Y$. While uncertainty analysis aims to describe the distribution of the output $Y$ (providing its statistics, moments, pdf, cdf, ...), sensitivity analysis aims to measure and quantify the impact of each input $X_i$ or group of inputs on the variability of the output $Y$ (by calculating the corresponding sensitivity indices). Figure 1 provides a schematic representation of this statement.

Taking into account uncertainty arising from different sources, whether in the context of uncertainty analysis or sensitivity analysis (for calculating sensitivity indices), requires multiple samples of the uncertain parameters and, consequently, running the model (evaluating the function $f$) multiple times. Depending on the complexity of the model, many challenges may be encountered during model evaluation. Therefore, the choice of method of sensitivity analysis is typically dictated by a number of problem constraints, settings or challenges. Common ones include the computational expense of each model run, a large number of inputs, correlated inputs, nonlinearity, and interactions between inputs.

To address the various constraints and challenges, a number of methods for sensitivity analysis have been proposed in the literature, which we will examine in the next section.

Sensitivity analysis methods

There are a large number of approaches to performing a sensitivity analysis, many of which have been developed to address one or more of the constraints discussed above. They are also distinguished by the type of sensitivity measure, be it based on (for example) variance decompositions, partial derivatives or elementary effects. In general, however, most procedures adhere to the following outline:

  1. Quantify the uncertainty in each input (e.g. ranges, probability distributions). Note that this can be difficult and many methods exist to elicit uncertainty distributions from subjective data. [14]
  2. Identify the model output to be analysed (the target of interest should ideally have a direct relation to the problem tackled by the model).
  3. Run the model a number of times using some design of experiments, [15] dictated by the method of choice and the input uncertainty.
  4. Using the resulting model outputs, calculate the sensitivity measures of interest.
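The outline above can be sketched in a few lines of Python. This is a minimal illustration, not a recommended method: `toy_model`, the uniform input ranges and the use of squared correlation as the sensitivity measure are all assumptions made for the example.

```python
import numpy as np

def toy_model(x):
    """Hypothetical model: Y = 4*X1 + X2**2; the third input has no effect."""
    return 4.0 * x[:, 0] + x[:, 1] ** 2

rng = np.random.default_rng(0)

# Step 1: quantify the input uncertainty (assumed independent uniform ranges).
lower = np.array([0.0, 0.0, 0.0])
upper = np.array([1.0, 2.0, 1.0])

# Step 3: run the model on a design of experiments (here, plain random sampling).
n_runs = 5000
x = rng.uniform(lower, upper, size=(n_runs, 3))

# Step 2: the output of interest is the scalar returned by the model.
y = toy_model(x)

# Uncertainty analysis: summarise the distribution of the output.
print("mean:", y.mean(), "std:", y.std(ddof=1))
print("5%-95% interval:", np.percentile(y, [5, 95]))

# Step 4: a crude sensitivity measure, the squared correlation of each input with Y.
r2 = np.array([np.corrcoef(x[:, i], y)[0, 1] ** 2 for i in range(3)])
print("squared correlations:", r2)
```

Squared correlation only captures linear association, so it serves here as a placeholder for the measures discussed in the following sections.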

In some cases this procedure will be repeated, for example in high-dimensional problems where the user has to screen out unimportant variables before performing a full sensitivity analysis.

The various types of "core methods" (discussed below) are distinguished by the sensitivity measures they calculate. These categories can overlap to some extent. Alternative ways of obtaining these measures, under the constraints of the problem, can be given. In addition, an engineering view of the methods that takes into account the four important sensitivity analysis parameters has also been proposed. [16]

Visual analysis

Figure 2. Sampling-based sensitivity analysis by scatterplots. Y (vertical axis) is a function of four factors. The points in the four scatterplots are always the same though sorted differently, i.e. by Z1, Z2, Z3, Z4 in turn. Note that the abscissa is different for each plot: (−5, +5) for Z1, (−8, +8) for Z2, (−10, +10) for Z3 and Z4. Z4 is most important in influencing Y as it imparts more 'shape' on Y.

The first intuitive approach (especially useful in less complex cases) is to analyze the relationship between each input and the output using scatter plots, and to observe the behavior of these pairs. The diagrams give an initial idea of the correlation and of which inputs have an impact on the output. Figure 2 shows an example where two inputs, Z3 and Z4, are highly correlated with the output.

One-at-a-time (OAT)

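One of the simplest and most common approaches is that of changing one-factor-at-a-time (OAT), to see what effect this produces on the output. [17] [18] [19] OAT customarily involves moving one input variable while keeping the others at their baseline (nominal) values, then returning the variable to its nominal value, and repeating for each of the other inputs in the same way.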

Sensitivity may then be measured by monitoring changes in the output, e.g. by partial derivatives or linear regression. This appears to be a logical approach, as any change observed in the output will unambiguously be due to the single variable changed. Furthermore, by changing one variable at a time, one can keep all other variables fixed to their central or baseline values. This increases the comparability of the results (all 'effects' are computed with reference to the same central point in space) and minimizes the chances of computer program crashes, which are more likely when several input factors are changed simultaneously. OAT is frequently preferred by modelers for practical reasons: in case of model failure under OAT analysis, the modeler immediately knows which input factor is responsible for the failure.

Despite its simplicity, however, this approach does not fully explore the input space, since it does not take into account the simultaneous variation of input variables. This means that the OAT approach cannot detect the presence of interactions between input variables and is unsuitable for nonlinear models. [20]
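A small sketch makes the interaction blind spot concrete. The function `oat_effects` and the model `interaction_model` are hypothetical names introduced for this example: the model contains a product term of its first two inputs, yet an OAT sweep around a zero baseline reports no effect for either of them.

```python
import numpy as np

def oat_effects(model, baseline, delta):
    """Change in output when each input is moved by delta[i] from the baseline, one at a time."""
    baseline = np.asarray(baseline, dtype=float)
    y0 = model(baseline)
    effects = np.empty(baseline.size)
    for i in range(baseline.size):
        x = baseline.copy()   # all other inputs stay at their baseline values
        x[i] += delta[i]      # move a single input
        effects[i] = model(x) - y0
    return effects

def interaction_model(x):
    """Hypothetical model with a pure interaction between inputs 0 and 1."""
    return x[0] * x[1] + x[2]

effects = oat_effects(interaction_model, baseline=[0.0, 0.0, 0.0], delta=[1.0, 1.0, 1.0])
print(effects)  # inputs 0 and 1 appear inert; only input 2 shows an effect
```

Moving inputs 0 and 1 together would change the output by 1, which no single-input perturbation from this baseline can reveal.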

The proportion of input space which remains unexplored with an OAT approach grows superexponentially with the number of inputs. For example, a 3-variable parameter space which is explored one-at-a-time is equivalent to taking points along the x, y, and z axes of a cube centered at the origin. The convex hull bounding all these points is an octahedron which has a volume only 1/6th of the total parameter space. More generally, the convex hull of the axes of a hyperrectangle forms a hyperoctahedron which has a volume fraction of $1/n!$, where $n$ is the number of inputs. With 5 inputs, the explored space already drops to less than 1% of the total parameter space. And even this is an overestimate, since the off-axis volume is not actually being sampled at all. Compare this to random sampling of the space, where the convex hull approaches the entire volume as more points are added. [21] While the sparsity of OAT is theoretically not a concern for linear models, true linearity is rare in nature.
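The figures quoted above follow from the volume of a hyperoctahedron inscribed in a hypercube, which is a fraction $1/n!$ of the cube for $n$ inputs (giving 1/6 for three inputs). The helper name `oat_hull_fraction` below is introduced only for this illustration.

```python
from math import factorial

def oat_hull_fraction(n_inputs):
    """Volume of the hyperoctahedron spanned by the axis points, relative to the enclosing hypercube."""
    return 1.0 / factorial(n_inputs)

for n in (2, 3, 5, 10):
    print(n, oat_hull_fraction(n))
```

For 5 inputs the fraction is 1/120, which is below 1%, in line with the statement in the text.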

Morris

Named after the statistician Max D. Morris, this method is suitable for screening systems with many parameters. It is also known as the method of elementary effects because it combines repeated steps along the various parametric axes. [22]
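As a rough sketch of the idea, the code below computes elementary effects from repeated one-at-a-time steps taken at random base points in the unit hypercube. It is a simplification: each step starts from the same base point (a "radial" layout) rather than following Morris's original trajectory design, and the function names, step size and toy model are assumptions made for the example.

```python
import numpy as np

def elementary_effects(model, n_inputs, n_repeats=50, delta=0.25, seed=0):
    """Mean absolute elementary effect (mu_star) and its spread (sigma) per input."""
    rng = np.random.default_rng(seed)
    ee = np.empty((n_repeats, n_inputs))
    for r in range(n_repeats):
        # Random base point, chosen so that base + delta stays inside [0, 1].
        base = rng.uniform(0.0, 1.0 - delta, size=n_inputs)
        y_base = model(base)
        for i in range(n_inputs):
            x = base.copy()
            x[i] += delta                           # step along one axis only
            ee[r, i] = (model(x) - y_base) / delta  # elementary effect of input i
    mu_star = np.abs(ee).mean(axis=0)
    sigma = ee.std(axis=0, ddof=1)
    return mu_star, sigma

def toy_model(x):
    """Hypothetical model: linear in x[0], interaction between x[1] and x[2], x[3] inert."""
    return 2.0 * x[0] + x[1] * x[2]

mu_star, sigma = elementary_effects(toy_model, n_inputs=4)
print("mu_star:", mu_star)
print("sigma:  ", sigma)
```

A large `mu_star` flags an influential input, while a large `sigma` suggests nonlinearity or interactions; an input with both close to zero is a candidate for screening out.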

Derivative-based local methods

Local derivative-based methods involve taking the partial derivative of the output $Y$ with respect to an input factor $X_i$:

$$\left| \frac{\partial Y}{\partial X_i} \right|_{x^0},$$

where the subscript $x^0$ indicates that the derivative is taken at some fixed point in the space of the input (hence the 'local' in the name of the class). Adjoint modelling [23] [24] and automatic differentiation [25] allow all partial derivatives to be computed at a cost of at most 4-6 times that of evaluating the original function. Similar to OAT, local methods do not attempt to fully explore the input space, since they examine small perturbations, typically one variable at a time. It is also possible to select similar samples from derivative-based sensitivities using neural networks and to perform uncertainty quantification.

One advantage of the local methods is that it is possible to make a matrix to represent all the sensitivities in a system, thus providing an overview that cannot be achieved with global methods if there is a large number of input and output variables. [26]
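The sensitivity matrix mentioned above can be approximated with central finite differences when neither adjoint code nor automatic differentiation is available. The sketch below is one such approximation; `local_sensitivities`, the step-size rule and the two-output toy model are assumptions made for illustration.

```python
import numpy as np

def local_sensitivities(model, x0, rel_step=1e-6):
    """Matrix of partial derivatives d(output_k)/d(input_i) at x0, by central differences."""
    x0 = np.asarray(x0, dtype=float)
    n_outputs = np.atleast_1d(model(x0)).size
    jac = np.empty((n_outputs, x0.size))
    for i in range(x0.size):
        h = rel_step * max(abs(x0[i]), 1.0)   # step scaled to the input magnitude
        x_plus, x_minus = x0.copy(), x0.copy()
        x_plus[i] += h
        x_minus[i] -= h
        jac[:, i] = (np.atleast_1d(model(x_plus)) - np.atleast_1d(model(x_minus))) / (2.0 * h)
    return jac

def toy_model(x):
    """Hypothetical model with two outputs and two inputs."""
    return np.array([x[0] ** 2 + 3.0 * x[1], np.sin(x[0]) * x[1]])

jac = local_sensitivities(toy_model, [1.0, 2.0])
print(jac)
```

Each row corresponds to an output and each column to an input. This approach costs two model runs per input, so it does not share the cost advantage of adjoint or automatic differentiation.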

Regression analysis

Regression analysis, in the context of sensitivity analysis, involves fitting a linear regression to the model response and using standardized regression coefficients as direct measures of sensitivity. The regression is required to be linear with respect to the data (i.e. a hyperplane, hence with no quadratic terms, etc., as regressors) because otherwise it is difficult to interpret the standardised coefficients. This method is therefore most suitable when the model response is in fact linear; linearity can be confirmed, for instance, if the coefficient of determination is large. The advantages of regression analysis are that it is simple and has a low computational cost.
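A minimal sketch of this procedure, assuming independently sampled inputs and an ordinary least-squares fit: both inputs and output are standardised, so the fitted coefficients are the standardized regression coefficients, and the coefficient of determination indicates how far the linear interpretation can be trusted. The function name and the toy data are hypothetical.

```python
import numpy as np

def standardized_regression_coefficients(x, y):
    """Return (SRCs, R^2) from a linear least-squares fit on standardised data."""
    xs = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
    ys = (y - y.mean()) / y.std(ddof=1)
    # No intercept is needed because both sides are centred.
    beta, *_ = np.linalg.lstsq(xs, ys, rcond=None)
    r2 = 1.0 - np.sum((ys - xs @ beta) ** 2) / np.sum(ys ** 2)
    return beta, r2

rng = np.random.default_rng(0)
x = rng.normal(0.0, [1.0, 2.0, 1.0], size=(2000, 3))              # assumed input distributions
y = 3.0 * x[:, 0] - 1.0 * x[:, 1] + 0.1 * rng.normal(size=2000)   # third input is inert

src, r2 = standardized_regression_coefficients(x, y)
print("SRC:", src)
print("R^2:", r2)
```

For independent inputs and a near-linear response, the squared coefficients sum approximately to the coefficient of determination, so each squared coefficient can be read as a share of the explained output variance.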

Variance-based methods

Variance-based methods [27] are a class of probabilistic approaches which quantify the input and output uncertainties as random variables, represented via their probability distributions, and decompose the output variance into parts attributable to input variables and combinations of variables. The sensitivity of the output to an input variable is therefore measured by the amount of variance in the output caused by that input.

This amount is quantified using Sobol indices, which represent the proportion of variance explained by an input or group of inputs. For an input $X_i$, the first-order Sobol index is defined as follows:

$$S_i = \frac{V(E[Y \mid X_i])}{V(Y)},$$

where $V$ and $E$ denote the variance and expected value operators respectively. This expression essentially measures the contribution of $X_i$ alone to the uncertainty (variance) in $Y$ (averaged over variations in other variables), and is known as the first-order sensitivity index or main effect index.

Importantly, the first-order sensitivity index of $X_i$ does not measure the uncertainty caused by interactions $X_i$ has with other variables. A further measure, known as the total effect index, gives the total variance in $Y$ caused by $X_i$ and its interactions with any of the other input variables. The total effect index is given as follows:

$$S_i^T = 1 - \frac{V(E[Y \mid X_{\sim i}])}{V(Y)},$$

where $X_{\sim i} = (X_1, \ldots, X_{i-1}, X_{i+1}, \ldots, X_p)$ denotes the set of all input variables except $X_i$.

Variance-based methods allow full exploration of the input space, accounting for interactions, and nonlinear responses. For these reasons they are widely used when it is feasible to calculate them. Typically this calculation involves the use of Monte Carlo methods, but since this can involve many thousands of model runs, other methods (such as metamodels) can be used to reduce computational expense when necessary.
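The Monte Carlo calculation can be sketched with a "pick-freeze" design: two independent sample matrices are drawn, and for each input a third matrix is built that takes one column from the second matrix and the rest from the first. The estimators below are the commonly used Saltelli (first-order) and Jansen (total-effect) forms; the function name, the uniform inputs on [0, 1] and the toy model are assumptions made for this example.

```python
import numpy as np

def sobol_indices(model, n_inputs, n_samples=20000, seed=0):
    """Monte Carlo estimates of first-order and total-effect Sobol indices."""
    rng = np.random.default_rng(seed)
    a = rng.uniform(size=(n_samples, n_inputs))
    b = rng.uniform(size=(n_samples, n_inputs))
    y_a, y_b = model(a), model(b)
    # Centring the outputs does not change the expectations but reduces estimator noise.
    mean_y = np.mean(np.concatenate([y_a, y_b]))
    var_y = np.var(np.concatenate([y_a, y_b]), ddof=1)
    y_a, y_b = y_a - mean_y, y_b - mean_y
    first = np.empty(n_inputs)
    total = np.empty(n_inputs)
    for i in range(n_inputs):
        ab = a.copy()
        ab[:, i] = b[:, i]              # all columns from A except column i, taken from B
        y_ab = model(ab) - mean_y
        first[i] = np.mean(y_b * (y_ab - y_a)) / var_y       # estimates V(E[Y|X_i]) / V(Y)
        total[i] = 0.5 * np.mean((y_a - y_ab) ** 2) / var_y  # estimates E[V(Y|X_~i)] / V(Y)
    return first, total

def toy_model(x):
    """Hypothetical model: additive in x0, interaction between x1 and x2."""
    return x[:, 0] + x[:, 1] * x[:, 2]

first, total = sobol_indices(toy_model, n_inputs=3)
print("first-order:", first)   # analytical values: 12/19, 3/19, 3/19
print("total:      ", total)   # analytical values: 12/19, 4/19, 4/19
```

The design needs `n_samples * (n_inputs + 2)` model runs, which illustrates why metamodels are attractive when a single run is expensive.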

Moment-independent methods

Moment-independent methods extend variance-based techniques by considering the probability density or cumulative distribution function of the model output $Y$. Thus, they do not refer to any particular moment of $Y$, whence the name.

The moment-independent sensitivity measures of $X_i$, here denoted by $\xi_i$, can be defined through an equation similar to variance-based indices, replacing the conditional expectation with a distance, as $\xi_i = E[d(P_Y, P_{Y \mid X_i})]$, where $d(\cdot, \cdot)$ is a statistical distance (metric or divergence) between probability measures, and $P_Y$ and $P_{Y \mid X_i}$ are the marginal and conditional probability measures of $Y$. [28]

If $d(\cdot, \cdot) \geq 0$ is a distance, the moment-independent global sensitivity measure satisfies zero-independence: $\xi_i = 0$ if and only if $Y$ is independent of $X_i$. This is a relevant statistical property also known as Rényi's postulate D. [29]

The class of moment-independent sensitivity measures includes indicators such as the $\delta$-importance measure, [30] the new correlation coefficient of Chatterjee, [31] the Wasserstein correlation of Wiesel [32] and the kernel-based sensitivity measures of Barr and Rabitz. [33]

Another measure for global sensitivity analysis, in the category of moment-independent approaches, is the PAWN index. [34] It relies on cumulative distribution functions (CDFs) to characterize the maximum distance between the unconditional output distribution (obtained by varying all input parameters) and the conditional output distribution (obtained by varying all input parameters except the $i$-th input, which is held fixed). The difference between the unconditional and conditional output distributions is usually calculated using the Kolmogorov–Smirnov (KS) statistic. The PAWN index for a given input parameter is then obtained by calculating a summary statistic over all KS values.
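The sketch below approximates a PAWN-style index from a single set of model runs. Instead of fixing an input at exact values, it conditions on the input falling inside quantile slices, which is an approximation chosen to keep the example short; the function names, the number of slices and the use of the median as summary statistic are assumptions.

```python
import numpy as np

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(sample_a), np.sort(sample_b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

def pawn_indices(x, y, n_slices=10, statistic=np.median):
    """Summary of KS distances between unconditional and slice-conditional output samples."""
    n_inputs = x.shape[1]
    indices = np.empty(n_inputs)
    for i in range(n_inputs):
        edges = np.quantile(x[:, i], np.linspace(0.0, 1.0, n_slices + 1))
        # Slice membership of each run, clipped so the extreme values fall in the outer slices.
        slice_id = np.clip(np.searchsorted(edges, x[:, i], side="right") - 1, 0, n_slices - 1)
        ks_values = [ks_distance(y, y[slice_id == k]) for k in range(n_slices)]
        indices[i] = statistic(ks_values)
    return indices

rng = np.random.default_rng(0)
x = rng.uniform(size=(2000, 3))
y = x[:, 0] + 0.1 * x[:, 1]          # hypothetical model; the third input is inert

pawn = pawn_indices(x, y)
print(pawn)
```

Even an inert input produces a small positive value here because of sampling noise in the empirical CDFs, so results are best compared against such a baseline.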

Variogram analysis of response surfaces (VARS)

One of the major shortcomings of the previous sensitivity analysis methods is that none of them considers the spatially ordered structure of the response surface/output of the model $Y = f(X)$ in the parameter space. By utilizing the concepts of directional variograms and covariograms, variogram analysis of response surfaces (VARS) addresses this weakness through recognizing a spatially continuous correlation structure to the values of $Y$, and hence also to the values of $\partial Y / \partial x_i$. [35] [36]

Basically, the higher the variability, the more heterogeneous the response surface is along a particular direction/parameter at a specific perturbation scale. Accordingly, in the VARS framework, the values of directional variograms for a given perturbation scale can be considered as a comprehensive illustration of sensitivity information, through linking variogram analysis to both direction and perturbation scale concepts. As a result, the VARS framework accounts for the fact that sensitivity is a scale-dependent concept, and thus overcomes the scale issue of traditional sensitivity analysis methods. [37] More importantly, VARS is able to provide relatively stable and statistically robust estimates of parameter sensitivity with much lower computational cost than other strategies (about two orders of magnitude more efficient). [38] Notably, it has been shown that there is a theoretical link between the VARS framework and the variance-based and derivative-based approaches.

Fourier amplitude sensitivity test (FAST)

The Fourier amplitude sensitivity test (FAST) uses the Fourier series to represent a multivariate function (the model) in the frequency domain, using a single frequency variable. Therefore, the integrals required to calculate sensitivity indices become univariate, resulting in computational savings.

Shapley effects

Shapley effects rely on Shapley values and represent the average marginal contribution of a given factor across all possible combinations of factors. These values are related to Sobol indices, as each falls between the first-order Sobol effect and the total-order effect. [39]
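For a handful of inputs, Shapley effects can be computed exactly by averaging marginal contributions over all orderings of the inputs. The sketch below assumes the "value" of a subset is the fraction of output variance it explains; the numbers in `explained` (0.7 and 0.2 for the two inputs alone, 1.0 jointly) are invented for illustration.

```python
from itertools import permutations

def shapley_effects(n_inputs, value):
    """Exact Shapley effects; value(frozenset) is the variance fraction explained by that subset."""
    effects = [0.0] * n_inputs
    orderings = list(permutations(range(n_inputs)))
    for ordering in orderings:
        seen = frozenset()
        for i in ordering:
            # Marginal contribution of input i given the inputs that precede it.
            effects[i] += value(seen | {i}) - value(seen)
            seen = seen | {i}
    return [e / len(orderings) for e in effects]

# Hypothetical two-input case: first-order shares 0.7 and 0.2, interaction share 0.1.
explained = {
    frozenset(): 0.0,
    frozenset({0}): 0.7,
    frozenset({1}): 0.2,
    frozenset({0, 1}): 1.0,
}

effects = shapley_effects(2, explained.__getitem__)
print(effects)
```

Worked by hand, the effects are 0.75 and 0.25: the interaction share is split equally, so each effect lies between the first-order index (0.7, 0.2) and the total index (0.8, 0.3), and the effects sum to one. The number of orderings grows factorially, so exact enumeration is only practical for small numbers of inputs.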

Chaos polynomials

The principle is to project the function of interest onto a basis of orthogonal polynomials. The Sobol indices are then expressed analytically in terms of the coefficients of this decomposition. [40]
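As an illustration, the sketch below fits a polynomial chaos expansion by least squares, assuming independent inputs uniform on [-1, 1] (for which Legendre polynomials are the orthogonal basis), and reads the Sobol indices off the squared coefficients. The function names, the truncation degree and the toy model are assumptions made for the example.

```python
import numpy as np
from itertools import product
from numpy.polynomial.legendre import legval

def legendre_orthonormal(x, degree):
    """Legendre polynomial of the given degree, scaled to unit variance under U(-1, 1)."""
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return np.sqrt(2 * degree + 1) * legval(x, coeffs)

def pce_sobol(x, y, max_degree=3):
    """First-order and total Sobol indices from a least-squares Legendre expansion."""
    n_inputs = x.shape[1]
    # Multi-indices (one degree per input) with total degree up to max_degree.
    multi_indices = [a for a in product(range(max_degree + 1), repeat=n_inputs)
                     if sum(a) <= max_degree]
    design = np.column_stack([
        np.prod([legendre_orthonormal(x[:, j], a[j]) for j in range(n_inputs)], axis=0)
        for a in multi_indices
    ])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # With an orthonormal basis, each non-constant term contributes coef**2 to the variance.
    var_terms = {a: c ** 2 for a, c in zip(multi_indices, coef) if any(a)}
    total_var = sum(var_terms.values())
    first = np.array([
        sum(v for a, v in var_terms.items() if a[i] > 0 and sum(k > 0 for k in a) == 1)
        for i in range(n_inputs)
    ]) / total_var
    total = np.array([
        sum(v for a, v in var_terms.items() if a[i] > 0) for i in range(n_inputs)
    ]) / total_var
    return first, total

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 2))
y = x[:, 0] + 0.5 * x[:, 1] ** 2      # hypothetical model, exactly representable at degree 2

first, total = pce_sobol(x, y)
print("first-order:", first)   # analytical values: 15/16 and 1/16
print("total:      ", total)
```

Because this toy model is itself a low-degree polynomial, the expansion reproduces it exactly; for a general model the truncation degree and sample size control the approximation error.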

Complementary research approaches for time-consuming simulations

A number of methods have been developed to overcome some of the constraints discussed above, which would otherwise make the estimation of sensitivity measures infeasible (most often due to computational expense). Generally, these methods focus on efficiently calculating variance-based measures of sensitivity, by creating a metamodel of the costly function to be evaluated and/or by "wisely" sampling the factor space.

Metamodels

Metamodels (also known as emulators, surrogate models or response surfaces) are data-modeling/machine learning approaches that involve building a relatively simple mathematical function, known as a metamodel, that approximates the input/output behavior of the model itself. [41] In other words, it is the concept of "modeling a model" (hence the name "metamodel"). The idea is that, although computer models may be a very complex series of equations that can take a long time to solve, they can always be regarded as a function of their inputs, $Y = f(X)$. By running the model at a number of points in the input space, it may be possible to fit a much simpler metamodel $\hat{f}(X)$, such that $\hat{f}(X) \approx f(X)$ to within an acceptable margin of error. [42] Then, sensitivity measures can be calculated from the metamodel (either with Monte Carlo or analytically), which will have a negligible additional computational cost. Importantly, the number of model runs required to fit the metamodel can be orders of magnitude less than the number of runs required to directly estimate the sensitivity measures from the model. [43]

Clearly, the crux of a metamodel approach is to find an $\hat{f}(X)$ (metamodel) that is a sufficiently close approximation to the model $f(X)$. This requires the following steps:

  1. Sampling (running) the model at a number of points in its input space. This requires a sample design.
  2. Selecting a type of emulator (mathematical function) to use.
  3. "Training" the metamodel using the sample data from the model – this generally involves adjusting the metamodel parameters until the metamodel mimics the true model as well as possible.

Sampling the model can often be done with low-discrepancy sequences, such as the Sobol sequence (due to mathematician Ilya M. Sobol), or with Latin hypercube sampling, although random designs can also be used, at the loss of some efficiency. The selection of the metamodel type and the training are intrinsically linked, since the training method will depend on the class of metamodel. Types of metamodels that have been used successfully for sensitivity analysis include Gaussian processes (kriging) and polynomial chaos expansions.

The use of an emulator introduces a machine learning problem, which can be difficult if the response of the model is highly nonlinear. In all cases, it is useful to check the accuracy of the emulator, for example using cross-validation.
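The three steps and the accuracy check can be sketched as follows, using a full quadratic polynomial fitted by least squares as the emulator and k-fold cross-validation as the check. Every name here is hypothetical, `expensive_model` is a cheap stand-in for a costly simulator, and a plain random design is used instead of a low-discrepancy one for brevity.

```python
import numpy as np

def quadratic_features(x):
    """Constant, linear and all second-order terms of the inputs."""
    n, p = x.shape
    cols = [np.ones(n)] + [x[:, i] for i in range(p)]
    cols += [x[:, i] * x[:, j] for i in range(p) for j in range(i, p)]
    return np.column_stack(cols)

def fit_surrogate(x, y):
    """Train the emulator: least-squares fit of the quadratic polynomial."""
    coef, *_ = np.linalg.lstsq(quadratic_features(x), y, rcond=None)
    return lambda x_new: quadratic_features(x_new) @ coef

def cross_validated_r2(x, y, n_folds=5, seed=0):
    """Predictive R^2: each point is predicted by a surrogate that never saw it."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    residual = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        surrogate = fit_surrogate(x[train_idx], y[train_idx])
        residual[test_idx] = y[test_idx] - surrogate(x[test_idx])
    return 1.0 - np.sum(residual ** 2) / np.sum((y - y.mean()) ** 2)

def expensive_model(x):
    """Stand-in for a costly simulator."""
    return np.exp(0.5 * x[:, 0]) + x[:, 1] * x[:, 2]

rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=(60, 3))   # step 1: sample the input space
y_train = expensive_model(x_train)
q2 = cross_validated_r2(x_train, y_train)        # accuracy check before trusting the emulator
surrogate = fit_surrogate(x_train, y_train)      # steps 2 and 3: choose and train the emulator

# The surrogate is cheap, so large samples for sensitivity estimation cost almost nothing.
y_cheap = surrogate(rng.uniform(-1.0, 1.0, size=(100000, 3)))
print("cross-validated R^2:", q2)
```

If the cross-validated score is poor, sensitivity measures computed from the surrogate describe the surrogate rather than the model, so a richer emulator class or more training runs would be needed.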

High-dimensional model representations (HDMR)

A high-dimensional model representation (HDMR) [49] [50] (the term is due to H. Rabitz [51] ) is essentially an emulator approach, which involves decomposing the function output into a linear combination of input terms and interactions of increasing dimensionality. The HDMR approach exploits the fact that the model can usually be well-approximated by neglecting higher-order interactions (second or third-order and above). The terms in the truncated series can then each be approximated by e.g. polynomials or splines, and the response expressed as the sum of the main effects and interactions up to the truncation order. From this perspective, HDMRs can be seen as emulators which neglect high-order interactions; the advantage is that they are able to emulate models with higher dimensionality than full-order emulators.

Monte Carlo filtering

Sensitivity analysis via Monte Carlo filtering [52] is also a sampling-based approach, whose objective is to identify regions in the space of the input factors corresponding to particular values (e.g., high or low) of the output.
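A minimal sketch of the idea: model runs are split into two groups according to whether the output meets a criterion, and for each input the distributions of the two groups are compared, here with a two-sample Kolmogorov–Smirnov distance. The function names, the threshold and the toy model are assumptions made for the example.

```python
import numpy as np

def ks_distance(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(sample_a), np.sort(sample_b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

def monte_carlo_filtering(x, y, threshold):
    """Distance, per input, between runs with y above the threshold and the remaining runs."""
    selected = y > threshold
    return np.array([
        ks_distance(x[selected, i], x[~selected, i]) for i in range(x.shape[1])
    ])

rng = np.random.default_rng(0)
x = rng.uniform(size=(2000, 3))
y = x[:, 0] + 0.1 * x[:, 1]                 # hypothetical model; the third input is inert

# Filter on "high output": the top 20% of runs.
distances = monte_carlo_filtering(x, y, threshold=np.quantile(y, 0.8))
print(distances)
```

A large distance means that the input takes systematically different values in the runs that produce the output region of interest, which points to the part of the input space driving that behaviour.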

Related concepts

Sensitivity analysis is closely related to uncertainty analysis; while the latter studies the overall uncertainty in the conclusions of the study, sensitivity analysis tries to identify which sources of uncertainty weigh most on the study's conclusions.

The problem setting in sensitivity analysis also has strong similarities with the field of design of experiments. [53] In a design of experiments, one studies the effect of some process or intervention (the 'treatment') on some objects (the 'experimental units'). In sensitivity analysis one looks at the effect of varying the inputs of a mathematical model on the output of the model itself. In both disciplines one strives to obtain information from the system with a minimum of physical or numerical experiments.

Sensitivity auditing

It may happen that a sensitivity analysis of a model-based study is meant to underpin an inference, and to certify its robustness, in a context where the inference feeds into a policy or decision-making process. In these cases the framing of the analysis itself, its institutional context, and the motivations of its author may become a matter of great importance, and a pure sensitivity analysis, with its emphasis on parametric uncertainty, may be seen as insufficient. The emphasis on the framing may derive inter alia from the relevance of the policy study to different constituencies that are characterized by different norms and values, and hence by a different story about 'what the problem is' and foremost about 'who is telling the story'. Most often the framing includes more or less implicit assumptions, which can range from the political (e.g. which group needs to be protected) all the way to the technical (e.g. which variable can be treated as a constant).

In order to take these concerns into due consideration, the instruments of sensitivity analysis (SA) have been extended to provide an assessment of the entire knowledge and model generating process. This approach has been called 'sensitivity auditing'. It takes inspiration from NUSAP, [54] a method used to qualify the worth of quantitative information with the generation of 'pedigrees' of numbers. Sensitivity auditing has been especially designed for an adversarial context, where not only the nature of the evidence, but also the degree of certainty and uncertainty associated with the evidence, will be the subject of partisan interests. [55] Sensitivity auditing is recommended in the European Commission guidelines for impact assessment, [56] as well as in the report Science Advice for Policy by European Academies. [57]

Pitfalls and difficulties

A common difficulty in sensitivity analysis is the trade-off between how thoroughly the input assumptions are explored and how wide the resulting inference becomes. The econometrician Edward E. Leamer put it as follows:

"I have proposed a form of organized sensitivity analysis that I call 'global sensitivity analysis' in which a neighborhood of alternative assumptions is selected and the corresponding interval of inferences is identified. Conclusions are judged to be sturdy only if the neighborhood of assumptions is wide enough to be credible and the corresponding interval of inferences is narrow enough to be useful."

Note Leamer's emphasis is on the need for 'credibility' in the selection of assumptions. The easiest way to invalidate a model is to demonstrate that it is fragile with respect to the uncertainty in the assumptions or to show that its assumptions have not been taken 'wide enough'. The same concept is expressed by Jerome R. Ravetz, for whom bad modeling is when uncertainties in inputs must be suppressed lest outputs become indeterminate. [60]

SA in international context

The importance of understanding and managing uncertainty in model results has inspired many scientists from different research centers all over the world to take a close interest in this subject. National and international agencies involved in impact assessment studies have included sections devoted to sensitivity analysis in their guidelines. Examples are the European Commission (see e.g. the guidelines for impact assessment), [56] the White House Office of Management and Budget, the Intergovernmental Panel on Climate Change and US Environmental Protection Agency's modeling guidelines. [61]

See also

Related Research Articles

<span class="mw-page-title-main">Supervised learning</span> Paradigm in machine learning

Supervised learning (SL) is a paradigm in machine learning where input objects and a desired output value train a model. The training data is processed, building a function that maps new data to expected output values. An optimal scenario will allow for the algorithm to correctly determine output values for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. This statistical quality of an algorithm is measured through the so-called generalization error.

<span class="mw-page-title-main">Least squares</span> Approximation method in statistics

The method of least squares is a parameter estimation method in regression analysis based on minimizing the sum of the squares of the residuals made in the results of each individual equation.

In statistics, propagation of uncertainty is the effect of variables' uncertainties on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations which propagate due to the combination of variables in the function.

<span class="mw-page-title-main">Regression analysis</span> Set of statistical processes for estimating the relationships among variables

In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable and one or more error-free independent variables. The most common form of regression analysis is linear regression, in which one finds the line that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line that minimizes the sum of squared differences between the true data and that line. For specific mathematical reasons, this allows the researcher to estimate the conditional expectation of the dependent variable when the independent variables take on a given set of values. Less common forms of regression use slightly different procedures to estimate alternative location parameters or estimate the conditional expectation across a broader collection of non-linear models.

In metrology, measurement uncertainty is the expression of the statistical dispersion of the values attributed to a quantity measured on an interval or ratio scale.

In statistics, a generalized additive model (GAM) is a generalized linear model in which the linear response variable depends linearly on unknown smooth functions of some predictor variables, and interest focuses on inference about these smooth functions.

Uncertainty quantification (UQ) is the science of quantitative characterization and estimation of uncertainties in both computational and real world applications. It tries to determine how likely certain outcomes are if some aspects of the system are not exactly known. An example would be to predict the acceleration of a human body in a head-on crash with another car: even if the speed was exactly known, small differences in the manufacturing of individual cars, how tightly every bolt has been tightened, etc., will lead to different results that can only be predicted in a statistical sense.

High-dimensional model representation is a finite expansion for a given multivariable function. The expansion was first described by Ilya M. Sobol as

<span class="mw-page-title-main">Probabilistic design</span> Discipline within engineering design

Probabilistic design is a discipline within engineering design. It deals primarily with the consideration and minimization of the effects of random variability upon the performance of an engineering system during the design phase. Typically, these effects studied and optimized are related to quality and reliability. It differs from the classical approach to design by assuming a small probability of failure instead of using the safety factor. Probabilistic design is used in a variety of different applications to assess the likelihood of failure. Disciplines which extensively use probabilistic design principles include product design, quality control, systems engineering, machine design, civil engineering and manufacturing.

Polynomial chaos (PC), also called polynomial chaos expansion (PCE) and Wiener chaos expansion, is a method for representing a random variable in terms of a polynomial function of other random variables. The polynomials are chosen to be orthogonal with respect to the joint probability distribution of these random variables. Note that despite its name, PCE has no immediate connections to chaos theory. The word "chaos" here should be understood as "random".

Published in 1991 by Max Morris the elementary effects (EE) method is one of the most used screening methods in sensitivity analysis.

<span class="mw-page-title-main">Probability box</span> Characterization of uncertain numbers consisting of both aleatoric and epistemic uncertainties

A probability box is a characterization of uncertain numbers consisting of both aleatoric and epistemic uncertainties that is often used in risk analysis or quantitative uncertainty modeling where numerical calculations must be performed. Probability bounds analysis is used to make arithmetic and logical calculations with p-boxes.

In applied statistics, the Morris method for global sensitivity analysis is a so-called one-factor-at-a-time method, meaning that in each run only one input parameter is given a new value. It facilitates a global sensitivity analysis by making a number of local changes at different points of the possible range of input values.

Probability bounds analysis (PBA) is a collection of methods of uncertainty propagation for making qualitative and quantitative calculations in the face of uncertainties of various kinds. It is used to project partial information about random variables and other quantities through mathematical expressions. For instance, it computes sure bounds on the distribution of a sum, product, or more complex function, given only sure bounds on the distributions of the inputs. Such bounds are called probability boxes, and constrain cumulative probability distributions.

Variance-based sensitivity analysis is a form of global sensitivity analysis. Working within a probabilistic framework, it decomposes the variance of the output of the model or system into fractions which can be attributed to inputs or sets of inputs. For example, given a model with two inputs and one output, one might find that 70% of the output variance is caused by the variance in the first input, 20% by the variance in the second, and 10% due to interactions between the two. These percentages are directly interpreted as measures of sensitivity. Variance-based measures of sensitivity are attractive because they measure sensitivity across the whole input space, they can deal with nonlinear responses, and they can measure the effect of interactions in non-additive systems.
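
The 70% / 20% / 10% split in the example can be reproduced with a small Monte Carlo experiment. The sketch below is illustrative only: the test model, the choice of independent standard normal inputs, the sample size and the function names are assumptions made for this example, and the estimators used are the commonly cited "pick-freeze" forms (a Saltelli-type estimator for first-order indices and a Jansen-type estimator for total-order indices).

```python
import math
import random


def model(x1, x2):
    # Hypothetical test model. With independent standard normal inputs,
    # Var(Y) = 0.7 + 0.2 + 0.1 = 1, split between x1, x2 and their interaction.
    return (math.sqrt(0.7) * x1
            + math.sqrt(0.2) * x2
            + math.sqrt(0.1) * x1 * x2)


def sobol_indices(f, n, seed=0):
    """Estimate first-order and total-order indices for a two-input model."""
    rng = random.Random(seed)
    # Two independent sample matrices A and B, each with n rows.
    a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    b = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
    fa = [f(*row) for row in a]
    fb = [f(*row) for row in b]

    # Output variance estimated from all 2n plain model runs.
    all_y = fa + fb
    mean = sum(all_y) / len(all_y)
    var = sum((y - mean) ** 2 for y in all_y) / (len(all_y) - 1)

    first, total = [], []
    for i in range(2):
        # Evaluate the model on A with column i replaced by column i of B.
        fab = []
        for row_a, row_b in zip(a, b):
            row = list(row_a)
            row[i] = row_b[i]
            fab.append(f(*row))
        v_i = sum(yb * (yab - ya) for ya, yb, yab in zip(fa, fb, fab)) / n
        t_i = sum((ya - yab) ** 2 for ya, yab in zip(fa, fab)) / (2 * n)
        first.append(v_i / var)
        total.append(t_i / var)
    return first, total


if __name__ == "__main__":
    first, total = sobol_indices(model, n=40000)
    print("first-order:", first)
    print("total-order:", total)
    print("interaction share:", 1.0 - sum(first))
```

For this model the first-order indices should come out near 0.7 and 0.2, and the remainder, about 0.1, is the share attributable to the interaction. The total-order indices (near 0.8 and 0.3) count the interaction once for each input involved, so they sum to more than one.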

In mathematics, error analysis is the study of the kind and quantity of error, or uncertainty, that may be present in the solution to a problem. This issue is particularly prominent in applied areas such as numerical analysis and statistics.

optiSLang is a software platform for CAE-based sensitivity analysis, multi-disciplinary optimization (MDO) and robustness evaluation. It was originally developed by Dynardo GmbH and provides a framework for numerical Robust Design Optimization (RDO) and stochastic analysis by identifying the variables that contribute most to a predefined optimization goal. This also includes the evaluation of robustness, i.e. the sensitivity to scatter in design variables or random fluctuations of parameters. In 2019, Dynardo GmbH was acquired by Ansys.

Gradient-enhanced kriging (GEK) is a surrogate modeling technique used in engineering. A surrogate model is a prediction of the output of an expensive computer code. This prediction is based on a small number of evaluations of the expensive computer code.

Line sampling is a method used in reliability engineering to compute small failure probabilities encountered in engineering systems. The method is particularly suitable for high-dimensional reliability problems in which the performance function exhibits moderate non-linearity with respect to the uncertain parameters. It is suitable for analyzing black box systems and, unlike the importance sampling method of variance reduction, does not require detailed knowledge of the system.

Probabilistic numerics is an active field of study at the intersection of applied mathematics, statistics, and machine learning centering on the concept of uncertainty in computation. In probabilistic numerics, tasks in numerical analysis such as integration, linear algebra, optimization, simulation and the solution of differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.

References

  1. Saltelli, A.; Ratto, M.; Andres, T.; Campolongo, F.; Cariboni, J.; Gatelli, D.; Saisana, M.; Tarantola, S. (2008). Global sensitivity analysis: the primer. John Wiley & Sons. doi:10.1002/9780470725184. ISBN   978-0-470-05997-5.
  2. Saltelli, A.; Tarantola, S.; Campolongo, F.; Ratto, M. (2004). Sensitivity analysis in practice: a guide to assessing scientific models. Vol. 1. doi:10.1002/0470870958. ISBN   978-0-470-87093-8.
  3. Der Kiureghian, A.; Ditlevsen, O. (2009). "Aleatory or epistemic? Does it matter?". Structural Safety. 31 (2): 105–112. doi:10.1016/j.strusafe.2008.06.020.
  4. Pannell, D. J. (1997). "Sensitivity Analysis of Normative Economic Models: Theoretical Framework and Practical Strategies". Agricultural Economics. 16 (2): 139–152. doi:10.1111/j.1574-0862.1997.tb00449.x.
  5. Bahremand, A.; De Smedt, F. (2008). "Distributed Hydrological Modeling and Sensitivity Analysis in Torysa Watershed, Slovakia". Water Resources Management. 22 (3): 393–408. Bibcode:2008WatRM..22..393B. doi:10.1007/s11269-007-9168-x. S2CID   9710579.
  6. Hill, M.; Kavetski, D.; Clark, M.; Ye, M.; Arabi, M.; Lu, D.; Foglia, L.; Mehl, S. (2015). "Practical use of computationally frugal model analysis methods". Groundwater. 54 (2): 159–170. doi: 10.1111/gwat.12330 . OSTI   1286771. PMID   25810333.
  7. Hill, M.; Tiedeman, C. (2007). Effective Groundwater Model Calibration, with Analysis of Data, Sensitivities, Predictions, and Uncertainty. John Wiley & Sons.
  8. Helton, J. C.; Johnson, J. D.; Sallaberry, C. J.; Storlie, C. B. (2006). "Survey of sampling based methods for uncertainty and sensitivity analysis". Reliability Engineering and System Safety. 91 (10–11): 1175–1209. doi:10.1016/j.ress.2005.11.017.
  9. Tsvetkova, O.; Ouarda, T.B.M.J. (2019). "Quasi-Monte Carlo technique in global sensitivity analysis of wind resource assessment with a study on UAE" (PDF). Journal of Renewable and Sustainable Energy. 11 (5): 053303. doi:10.1063/1.5120035. S2CID   208835771.
  10. Chastaing, G.; Gamboa, F.; Prieur, C. (2012). "Generalized Hoeffding-Sobol decomposition for dependent variables - application to sensitivity analysis". Electronic Journal of Statistics . 6: 2420–2448. arXiv: 1112.1788 . doi:10.1214/12-EJS749. ISSN   1935-7524.
  11. Gamboa, F.; Janon, A.; Klein, T.; Lagnoux, A. (2014). "Sensitivity analysis for multidimensional and functional outputs". Electronic Journal of Statistics . 8: 575–603. arXiv: 1311.1797 . doi:10.1214/14-EJS895. ISSN   1935-7524.
  12. Marrel, A.; Iooss, B.; Da Veiga, S.; Ribatet, M. (2012). "Global sensitivity analysis of stochastic computer models with joint metamodels". Statistics and Computing . 22 (3): 833–847. doi:10.1007/s11222-011-9274-8. ISSN   0960-3174.
  13. Marrel, A.; Iooss, B.; Van Dorpe, F.; Volkova, E. (2008). "An efficient methodology for modeling complex computer codes with Gaussian processes". Computational Statistics & Data Analysis . 52 (10): 4731–4744. arXiv: 0802.1099 . doi:10.1016/j.csda.2008.03.026.
  14. O'Hagan, A.; et al. (2006). Uncertain Judgements: Eliciting Experts' Probabilities. Chichester: Wiley. ISBN   9780470033302.
  15. Sacks, J.; Welch, W. J.; Mitchell, T. J.; Wynn, H. P. (1989). "Design and Analysis of Computer Experiments". Statistical Science. 4 (4): 409–435. doi: 10.1214/ss/1177012413 .
  16. Da Veiga S, Gamboa F, Iooss B, Prieur C (2021). Basics and Trends in Sensitivity Analysis. SIAM. doi:10.1137/1.9781611976694. ISBN   978-1-61197-668-7.
  17. Campbell, J.; et al. (2008). "Photosynthetic Control of Atmospheric Carbonyl Sulfide During the Growing Season". Science . 322 (5904): 1085–1088. Bibcode:2008Sci...322.1085C. doi:10.1126/science.1164015. PMID   19008442. S2CID   206515456.
  18. Bailis, R.; Ezzati, M.; Kammen, D. (2005). "Mortality and Greenhouse Gas Impacts of Biomass and Petroleum Energy Futures in Africa". Science . 308 (5718): 98–103. Bibcode:2005Sci...308...98B. doi:10.1126/science.1106881. PMID   15802601. S2CID   14404609.
  19. Murphy, J.; et al. (2004). "Quantification of modelling uncertainties in a large ensemble of climate change simulations". Nature . 430 (7001): 768–772. Bibcode:2004Natur.430..768M. doi:10.1038/nature02771. PMID   15306806. S2CID   980153.
  20. Czitrom, Veronica (1999). "One-Factor-at-a-Time Versus Designed Experiments". American Statistician. 53 (2): 126–131. doi:10.2307/2685731. JSTOR   2685731.
  21. Gatzouras, D; Giannopoulos, A (2009). "Threshold for the volume spanned by random points with independent coordinates". Israel Journal of Mathematics . 169 (1): 125–153. doi: 10.1007/s11856-009-0007-z .
  22. Morris MD (1991). "Factorial Sampling Plans for Preliminary Computational Experiments". Technometrics. 33 (2). Taylor & Francis: 161–174. doi:10.2307/1269043. JSTOR   1269043.
  23. Cacuci, Dan G. Sensitivity and Uncertainty Analysis: Theory. Vol. I. Chapman & Hall.
  24. Cacuci, Dan G.; Ionescu-Bujor, Mihaela; Navon, Michael (2005). Sensitivity and Uncertainty Analysis: Applications to Large-Scale Systems. Vol. II. Chapman & Hall.
  25. Griewank, A. (2000). Evaluating Derivatives, Principles and Techniques of Algorithmic Differentiation. SIAM.
  26. Kabir HD, Khosravi A, Nahavandi D, Nahavandi S. Uncertainty Quantification Neural Network from Similarity and Sensitivity. In 2020 International Joint Conference on Neural Networks (IJCNN) 2020 Jul 19 (pp. 1-8). IEEE.
  27. Sobol', I (1990). "Sensitivity estimates for nonlinear mathematical models". Matematicheskoe Modelirovanie (in Russian). 2: 112–118.; translated in English in Sobol', I (1993). "Sensitivity analysis for non-linear mathematical models". Mathematical Modeling & Computational Experiment. 1: 407–414.
  28. Borgonovo E, Tarantola S, Plischke E, Morris MD (2014). "Transformations and invariance in the sensitivity analysis of computer experiments". Journal of the Royal Statistical Society. Series B (Statistical Methodology). 76 (5): 925–947. doi:10.1111/rssb.12052. ISSN   1369-7412.
  29. Rényi, A (1 September 1959). "On measures of dependence". Acta Mathematica Academiae Scientiarum Hungarica. 10 (3): 441–451. doi:10.1007/BF02024507. ISSN   1588-2632.
  30. Borgonovo E (June 2007). "A new uncertainty importance measure". Reliability Engineering & System Safety. 92 (6): 771–784. doi:10.1016/J.RESS.2006.04.015. ISSN   0951-8320.
  31. Chatterjee S (2 October 2021). "A New Coefficient of Correlation". Journal of the American Statistical Association. 116 (536): 2009–2022. arXiv: 1909.10140 . doi:10.1080/01621459.2020.1758115. ISSN   0162-1459.
  32. Wiesel JC (November 2022). "Measuring association with Wasserstein distances". Bernoulli. 28 (4): 2816–2832. arXiv: 2102.00356 . doi:10.3150/21-BEJ1438. ISSN   1350-7265.
  33. Barr J, Rabitz H (31 March 2022). "A Generalized Kernel Method for Global Sensitivity Analysis". SIAM/ASA Journal on Uncertainty Quantification. 10 (1). Society for Industrial and Applied Mathematics: 27–54. doi:10.1137/20M1354829.
  34. Pianosi F, Wagener T (2015). "A simple and efficient method for global sensitivity analysis based on cumulative distribution functions". Environmental Modelling & Software. 67: 1–11. Bibcode:2015EnvMS..67....1P. doi: 10.1016/j.envsoft.2015.01.004 .
  35. Razavi, Saman; Gupta, Hoshin V. (January 2016). "A new framework for comprehensive, robust, and efficient global sensitivity analysis: 1. Theory". Water Resources Research. 52 (1): 423–439. Bibcode:2016WRR....52..423R. doi: 10.1002/2015WR017558 . ISSN   1944-7973.
  36. Razavi, Saman; Gupta, Hoshin V. (January 2016). "A new framework for comprehensive, robust, and efficient global sensitivity analysis: 2. Application". Water Resources Research. 52 (1): 440–455. Bibcode:2016WRR....52..440R. doi: 10.1002/2015WR017559 . ISSN   1944-7973.
  37. Haghnegahdar, Amin; Razavi, Saman (September 2017). "Insights into sensitivity analysis of Earth and environmental systems models: On the impact of parameter perturbation scale". Environmental Modelling & Software. 95: 115–131. Bibcode:2017EnvMS..95..115H. doi:10.1016/j.envsoft.2017.03.031.
  38. Gupta, H; Razavi, S (2016). "Challenges and Future Outlook of Sensitivity Analysis". In Petropoulos, George; Srivastava, Prashant (eds.). Sensitivity Analysis in Earth Observation Modelling (1st ed.). Elsevier. pp. 397–415. ISBN   9780128030318.
  39. Owen AB (1 January 2014). "Sobol' Indices and Shapley Value". SIAM/ASA Journal on Uncertainty Quantification. 2 (1). Society for Industrial and Applied Mathematics: 245–251. doi:10.1137/130936233.
  40. Sudret, B. (2008). "Global sensitivity analysis using polynomial chaos expansions". Reliability Engineering & System Safety. 93 (7): 964–979. doi:10.1016/j.ress.2007.04.002.
  41. Storlie, C.B.; Swiler, L.P.; Helton, J.C.; Sallaberry, C.J. (2009). "Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models". Reliability Engineering & System Safety. 94 (11): 1735–1763. doi:10.1016/j.ress.2009.05.007.
  42. Wang, Shangying; Fan, Kai; Luo, Nan; Cao, Yangxiaolu; Wu, Feilun; Zhang, Carolyn; Heller, Katherine A.; You, Lingchong (2019-09-25). "Massive computational acceleration by using neural networks to emulate mechanism-based biological models". Nature Communications. 10 (1): 4354. Bibcode:2019NatCo..10.4354W. doi:10.1038/s41467-019-12342-y. ISSN   2041-1723. PMC   6761138 . PMID   31554788.
  43. Oakley, J.; O'Hagan, A. (2004). "Probabilistic sensitivity analysis of complex models: a Bayesian approach". J. R. Stat. Soc. B. 66 (3): 751–769. CiteSeerX   10.1.1.6.9720 . doi:10.1111/j.1467-9868.2004.05304.x. S2CID   6130150.
  44. Gramacy, R. B.; Taddy, M. A. (2010). "Categorical Inputs, Sensitivity Analysis, Optimization and Importance Tempering with tgp Version 2, an R Package for Treed Gaussian Process Models" (PDF). Journal of Statistical Software. 33 (6). doi: 10.18637/jss.v033.i06 .
  45. Becker, W.; Worden, K.; Rowson, J. (2013). "Bayesian sensitivity analysis of bifurcating nonlinear models". Mechanical Systems and Signal Processing. 34 (1–2): 57–75. Bibcode:2013MSSP...34...57B. doi:10.1016/j.ymssp.2012.05.010.
  46. Sudret, B. (2008). "Global sensitivity analysis using polynomial chaos expansions". Reliability Engineering & System Safety. 93 (7): 964–979. doi:10.1016/j.ress.2007.04.002.
  47. Ratto, M.; Pagano, A. (2010). "Using recursive algorithms for the efficient identification of smoothing spline ANOVA models". AStA Advances in Statistical Analysis. 94 (4): 367–388. doi:10.1007/s10182-010-0148-8. S2CID   7678955.
  48. Cardenas, IC (2019). "On the use of Bayesian networks as a meta-modeling approach to analyse uncertainties in slope stability analysis". Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards. 13 (1): 53–65. Bibcode:2019GAMRE..13...53C. doi:10.1080/17499518.2018.1498524. S2CID   216590427.
  49. Li, G.; Hu, J.; Wang, S.-W.; Georgopoulos, P.; Schoendorf, J.; Rabitz, H. (2006). "Random Sampling-High Dimensional Model Representation (RS-HDMR) and orthogonality of its different order component functions". Journal of Physical Chemistry A. 110 (7): 2474–2485. Bibcode:2006JPCA..110.2474L. doi:10.1021/jp054148m. PMID   16480307.
  50. Li, G. (2002). "Practical approaches to construct RS-HDMR component functions". Journal of Physical Chemistry. 106 (37): 8721–8733. Bibcode:2002JPCA..106.8721L. doi:10.1021/jp014567t.
  51. Rabitz, H (1989). "System analysis at molecular scale". Science. 246 (4927): 221–226. Bibcode:1989Sci...246..221R. doi:10.1126/science.246.4927.221. PMID   17839016. S2CID   23088466.
  52. Hornberger, G.; Spear, R. (1981). "An approach to the preliminary analysis of environmental systems". Journal of Environmental Management. 7: 7–18.
  53. Box GEP, Hunter WG, Hunter, J. Stuart. Statistics for experimenters [Internet]. New York: Wiley & Sons
  54. Van der Sluijs, JP; Craye, M; Funtowicz, S; Kloprogge, P; Ravetz, J; Risbey, J (2005). "Combining quantitative and qualitative measures of uncertainty in model based environmental assessment: the NUSAP system". Risk Analysis. 25 (2): 481–492. Bibcode:2005RiskA..25..481V. doi:10.1111/j.1539-6924.2005.00604.x. hdl: 1874/386039 . PMID   15876219. S2CID   15988654.
  55. Lo Piano, S; Robinson, M (2019). "Nutrition and public health economic evaluations under the lenses of post normal science". Futures. 112: 102436. doi:10.1016/j.futures.2019.06.008. S2CID   198636712.
  56. European Commission. 2021. “Better Regulation Toolbox.” November 25.
  57. Science Advice for Policy by European Academies, Making sense of science for policy under conditions of complexity and uncertainty, Berlin, 2019.
  58. Leamer, Edward E. (1983). "Let's Take the Con Out of Econometrics". American Economic Review . 73 (1): 31–43. JSTOR   1803924.
  59. Leamer, Edward E. (1985). "Sensitivity Analyses Would Help". American Economic Review . 75 (3): 308–313. JSTOR   1814801.
  60. Ravetz, J.R., 2007, No-Nonsense Guide to Science, New Internationalist Publications Ltd.
  61. "Archived copy" (PDF). Archived from the original (PDF) on 2011-04-26. Retrieved 2009-10-16.

Further reading