Central composite design

In statistics, a central composite design is an experimental design, useful in response surface methodology, for building a second order (quadratic) model for the response variable without needing to use a complete three-level factorial experiment.

After the designed experiment is performed, linear regression is used, sometimes iteratively, to obtain results. Coded variables are often used when constructing this design.

Implementation

The design consists of three distinct sets of experimental runs:

  1. A factorial (perhaps fractional) design in the factors studied, each having two levels;
  2. A set of center points, experimental runs whose values of each factor are the medians of the values used in the factorial portion. This point is often replicated in order to improve the precision of the experiment;
  3. A set of axial points, experimental runs identical to the center points except for one factor, which will take on values both below and above the median of the two factorial levels, and typically both outside their range. All factors are varied in this way.
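As a concrete sketch of these three sets of runs, the following Python fragment generates each portion in coded units (ccd_points is a name of our choosing, not a standard library routine):

```python
import itertools

import numpy as np

def ccd_points(k, alpha, n_center=1):
    """Return the factorial, center, and axial portions of a CCD in coded units."""
    # 1. Two-level factorial portion: every combination of -1 and +1.
    F = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    # 2. Center points: every factor at the coded median 0, replicated n_center times.
    C = np.zeros((n_center, k))
    # 3. Axial points: each factor in turn at +alpha and -alpha, all others at 0.
    E = np.zeros((2 * k, k))
    for i in range(k):
        E[2 * i, i] = alpha
        E[2 * i + 1, i] = -alpha
    return F, C, E
```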

Design matrix

The design matrix for a central composite design experiment involving k factors is derived from a matrix, d, containing the following three different parts corresponding to the three types of experimental runs:

  1. The matrix F obtained from the factorial experiment. The factor levels are scaled so that its entries are coded as +1 and −1.
  2. The matrix C from the center points, denoted in coded variables as (0,0,0,...,0), where there are k zeros.
  3. A matrix E from the axial points, with 2k rows. Each factor is sequentially placed at ±α and all other factors are at zero. The value of α is determined by the designer; while arbitrary, some values may give the design desirable properties. This part would look like:

\[
E =
\begin{bmatrix}
 \alpha  & 0       & 0      & \cdots & 0 \\
 -\alpha & 0       & 0      & \cdots & 0 \\
 0       & \alpha  & 0      & \cdots & 0 \\
 0       & -\alpha & 0      & \cdots & 0 \\
 \vdots  & \vdots  & \vdots &        & \vdots \\
 0       & 0       & 0      & \cdots & \alpha \\
 0       & 0       & 0      & \cdots & -\alpha
\end{bmatrix}
\]

Then d is the vertical concatenation:

\[
d =
\begin{bmatrix}
 F \\
 C \\
 E
\end{bmatrix}
\]

The design matrix X used in linear regression is the horizontal concatenation of a column of 1s (intercept), d, and all elementwise products of a pair of columns of d:

\[
X =
\begin{bmatrix}
 1 & d & d^{(1)} \times d^{(1)} & d^{(1)} \times d^{(2)} & \cdots & d^{(1)} \times d^{(k)} & d^{(2)} \times d^{(2)} & \cdots & d^{(k)} \times d^{(k)}
\end{bmatrix}
\]

where d^{(i)} represents the ith column of d and × denotes the elementwise product of two columns.
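A minimal sketch of this construction, reusing the ccd_points helper above (the ordering of the product columns is one common convention, not the only one):

```python
import numpy as np

def design_matrix(d):
    """Concatenate an intercept column, d, and all pairwise column products of d."""
    n, k = d.shape
    cols = [np.ones((n, 1)), d]
    for i in range(k):
        for j in range(i, k):
            # d(i) x d(j): squared terms when i == j, interactions when i < j.
            cols.append((d[:, i] * d[:, j])[:, None])
    return np.hstack(cols)

# Stack the three parts vertically to form d, then expand d into X.
F, C, E = ccd_points(k=2, alpha=np.sqrt(2))
d = np.vstack([F, C, E])
X = design_matrix(d)  # 9 runs by 6 model terms for k = 2 with one center point
```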

Choosing α

There are many different methods to select a useful value of α. Let F be the number of points due to the factorial design and T = 2k + n the number of additional points, where n is the number of center points in the design. Common values are as follows (Myers, 1971):

  1. Orthogonal design: \( \alpha = \sqrt[4]{QF/4} \), where \( Q = \left( \sqrt{F+T} - \sqrt{F} \right)^2 \);
  2. Rotatable design: \( \alpha = F^{1/4} \) (the design implemented by MATLAB's ccdesign function).
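Both rules are straightforward to compute. A sketch assuming a full two-level factorial portion, so that F = 2^k (the helper names are ours):

```python
import math

def alpha_orthogonal(k, n_center):
    """Axial distance for an orthogonal CCD with a full 2**k factorial portion."""
    F = 2 ** k              # factorial points
    T = 2 * k + n_center    # axial points plus center points
    Q = (math.sqrt(F + T) - math.sqrt(F)) ** 2
    return (Q * F / 4) ** 0.25

def alpha_rotatable(k):
    """Axial distance alpha = F**(1/4) for a rotatable CCD."""
    return (2 ** k) ** 0.25

print(alpha_rotatable(2))  # 1.4142..., i.e. alpha = sqrt(2) for k = 2
```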

Application of central composite designs for optimization

Statistical approaches such as response surface methodology can be employed to maximize the production of a target substance by optimizing the operational factors. Unlike conventional one-factor-at-a-time methods, these statistical techniques can also quantify the interactions among process variables. For instance, one study employed a central composite design to investigate the effect of critical parameters of organosolv pretreatment of rice straw, namely temperature, time, and ethanol concentration. The residual solid, lignin recovery, and hydrogen yield were selected as the response variables.[1]
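To sketch how such an analysis proceeds numerically, the fragment below reuses d and design_matrix from the earlier sketches and fits the second-order model by ordinary least squares; the response values here are invented for illustration, not data from the cited study:

```python
import numpy as np

# Made-up responses at the 9 design points of the k = 2 example above;
# a real study would use measured values such as hydrogen yield.
y = np.array([8.2, 9.1, 9.4, 9.9, 10.3, 9.0, 10.1, 9.5, 10.0])

# Least-squares fit of the quadratic model y ~ X @ beta.
beta, *_ = np.linalg.lstsq(design_matrix(d), y, rcond=None)
print(beta)  # intercept, linear terms, then the pairwise-product terms
```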

References

  1. Asadi, Nooshin; Zilouei, Hamid (March 2017). "Optimization of organosolv pretreatment of rice straw for enhanced biohydrogen production using Enterobacter aerogenes". Bioresource Technology. 227: 335–344. doi:10.1016/j.biortech.2016.12.073. PMID 28042989.

Myers, Raymond H. (1971). Response Surface Methodology. Boston: Allyn and Bacon, Inc.