Hierarchical generalized linear model

In statistics, hierarchical generalized linear models extend generalized linear models by relaxing the assumption that error components are independent. [1] This allows models to be built in situations where more than one error term is necessary, and also allows for dependencies between error terms. [2] The error components can be correlated and need not follow a normal distribution. When there are different clusters, that is, groups of observations, the observations in the same cluster are correlated. In fact, they are positively correlated, because observations in the same cluster share some common features. In this situation, using generalized linear models and ignoring the correlations may lead to understated standard errors and invalid inference. [3]

Overview and model

Model

In a hierarchical model, observations are grouped into clusters, and the distribution of an observation is determined not only by common structure among all clusters but also by the specific structure of the cluster where this observation belongs. So a random effect component, different for different clusters, is introduced into the model. Let $y$ be the response, $u$ be the random effect, $g$ be the link function, $\eta = g(\mu)$, and $v = v(u)$ be some strictly monotone function of $u$. In a hierarchical generalized linear model, assumptions on $y \mid u$ and $u$ need to be made: [2]

$y \mid u \sim f(\theta, \phi)$ and $u \sim f_u(\alpha)$

The linear predictor is of the form:

$\eta = g(\mu) = X\beta + v$

where $g$ is the link function, $\mu = E(y \mid u)$, $\eta = g(\mu)$, and $v$ is a monotone function of $u$. In this hierarchical generalized linear model, the fixed effect is described by $\beta$, which is the same for all observations. The random component $v$ is unobserved and varies randomly among clusters. So $v$ takes the same value for observations in the same cluster and different values for observations in different clusters. [3]
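To make this structure concrete, here is a minimal simulation sketch in Python (numpy only) of the Poisson case with a gamma random effect and log link, so that $v = \log u$; the cluster sizes, parameter values, and all variable names are illustrative assumptions, not part of any reference implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

n_clusters, per_cluster = 50, 20
n = n_clusters * per_cluster
cluster = np.repeat(np.arange(n_clusters), per_cluster)

X = rng.normal(size=(n, 2))        # fixed-effect design matrix
beta = np.array([0.5, -0.3])       # fixed effects, shared by all observations

# One random effect per cluster; v = log(u) enters the linear predictor,
# so observations in the same cluster share the same value of v.
u = rng.gamma(shape=2.0, scale=0.5, size=n_clusters)   # E[u] = 1
v = np.log(u)

eta = X @ beta + v[cluster]        # linear predictor: eta = X beta + v
mu = np.exp(eta)                   # inverse of the log link
y = rng.poisson(mu)                # y | u ~ Poisson(mu)
```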

Identifiability

Identifiability is a concept in statistics. In order to perform parameter inference, it is necessary to make sure that the identifiability property holds. [4] In the model stated above, the location of $v$ is not identifiable, since

$X\beta + v = (X\beta + a) + (v - a)$

for any constant $a$. [2] In order to make the model identifiable, we need to impose constraints on the parameters. The constraint is usually imposed on the random effects, such as $E(v) = 0$. [2]
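A small numeric check of this identity, with assumed toy values:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))          # small design matrix
beta = np.array([0.5, -0.3])
v = rng.normal(size=6)               # unconstrained random effects
a = 1.7                              # arbitrary location shift

eta1 = X @ beta + v
eta2 = (X @ beta + a) + (v - a)      # shift the fixed part up, the random part down
print(np.allclose(eta1, eta2))       # True: both parameterizations give the same predictor
```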

Models with different distributions and link functions

By assuming different distributions of $y \mid u$ and $u$, and using different functions $g$ and $v$, we will be able to obtain different models. Moreover, the generalized linear mixed model (GLMM) is a special case of the hierarchical generalized linear model: in hierarchical generalized linear models, the distribution of the random effect does not necessarily have to be normal. If the distribution of $u$ is normal and the link function $v$ is the identity function, then the hierarchical generalized linear model is the same as the GLMM. [2]

The distribution of $u$ can also be chosen to be conjugate to the distribution of $y \mid u$, since nice properties hold and this eases computation and interpretation. [2] For example, if the distribution of $y \mid u$ is Poisson with a certain mean, the distribution of $u$ is gamma, and the canonical log link is used, then we call the model a Poisson conjugate hierarchical generalized linear model. If $y \mid u$ follows a binomial distribution with a certain mean, $u$ has the conjugate beta distribution, and the canonical logit link is used, then we call the model a beta conjugate model. Moreover, the mixed linear model is the normal conjugate hierarchical generalized linear model. [2]
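As a concrete check on the Poisson conjugate model, the sketch below (assumed toy parameters, numpy only) draws $u$ from a gamma distribution with mean one and then $y \mid u$ from a Poisson distribution; marginally $y$ is then negative binomial, so its variance $\mu + \mu^2/k$ exceeds the Poisson variance $\mu$:

```python
import numpy as np

rng = np.random.default_rng(2)
k, mu = 3.0, 2.0                                      # gamma shape k and Poisson mean scale mu

u = rng.gamma(shape=k, scale=1.0 / k, size=200_000)   # u ~ Gamma with E[u] = 1
y = rng.poisson(mu * u)                               # y | u ~ Poisson(mu * u)

# Marginally y is negative binomial: E[y] = mu, Var[y] = mu + mu**2 / k.
print(y.mean(), y.var())                              # ~2.0 and ~2.0 + 4/3 = 3.33
```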

A summary of commonly used models is as follows: [5]

Commonly used models

| Model name | Distribution of $y \mid u$ | Link function $g(\mu)$ | Distribution of $u$ | Link function $v(u)$ |
|---|---|---|---|---|
| Normal conjugate | Normal | Identity | Normal | Identity |
| Binomial conjugate | Binomial | Logit | Beta | Logit |
| Poisson conjugate | Poisson | Log | Gamma | Log |
| Gamma conjugate | Gamma | Reciprocal | Inverse-gamma | Reciprocal |
| Binomial GLMM | Binomial | Logit | Normal | Identity |
| Poisson GLMM | Poisson | Log | Normal | Identity |
| Gamma GLMM | Gamma | Log | Normal | Identity |

Fitting hierarchical generalized linear models

Hierarchical generalized linear models are used when observations come from different clusters. There are two types of estimators: fixed effect estimators and random effect estimators, corresponding to the parameters $\beta$ and $v$ in the linear predictor $\eta = X\beta + v$, respectively. There are different ways to obtain parameter estimates for a hierarchical generalized linear model. If only the fixed effect estimators are of interest, the population-averaged model can be used. If inference is focused on individuals, random effects will have to be predicted. [3] There are different techniques to fit a hierarchical generalized linear model.
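The hglm package cited below [5] fits these models in R via Lee and Nelder's h-likelihood method. As a rough Python stand-in, the following sketch fits the Poisson GLMM special case (normal random effects, identity $v$) with statsmodels' variational Bayes mixed GLM; this illustrates fitting clustered count data but is not the h-likelihood algorithm, and the simulated data and names are assumptions:

```python
import numpy as np
from statsmodels.genmod.bayes_mixed_glm import PoissonBayesMixedGLM

rng = np.random.default_rng(3)
n_clusters, per_cluster = 40, 25
n = n_clusters * per_cluster
cluster = np.repeat(np.arange(n_clusters), per_cluster)

X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one covariate
beta = np.array([0.2, 0.5])
v = rng.normal(scale=0.4, size=n_clusters)              # normal random effects (GLMM case)
y = rng.poisson(np.exp(X @ beta + v[cluster]))

# One indicator column per cluster; all columns share a single variance parameter.
Z = (cluster[:, None] == np.arange(n_clusters)[None, :]).astype(float)
ident = np.zeros(n_clusters, dtype=int)

model = PoissonBayesMixedGLM(y, X, Z, ident)
result = model.fit_vb()         # variational Bayes fit
print(result.summary())        # fixed effects and random-effect SD estimates
```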

Examples and applications

Hierarchical generalized linear models have been used to solve different real-life problems.

Engineering

For example, this method was used to analyze semiconductor manufacturing, because interrelated processes form a complex hierarchy. [6] Semiconductor fabrication is a complex process requiring many interrelated subprocesses. [7] The hierarchical generalized linear model, which calls for clustered data, is able to deal with this complicated process. Engineers can use this model to identify and analyze important subprocesses and, at the same time, evaluate the influences of these subprocesses on the final performance. [6]

Business

Market research problems can also be analyzed by using hierarchical generalized linear models. Researchers applied the model to consumers within countries in order to handle the nested data structures that arise in international marketing research. [8]

References

  1. McCullagh, P.; Nelder, J. A. (1989). Generalized Linear Models (2nd ed.). Chapman and Hall/CRC. ISBN 0-412-31760-5.
  2. Lee, Y.; Nelder, J. A. (1996). "Hierarchical Generalized Linear Models". Journal of the Royal Statistical Society, Series B. 58 (4): 619–678. JSTOR 2346105.
  3. Agresti, Alan (2002). Categorical Data Analysis. Hoboken, New Jersey: John Wiley & Sons, Inc. ISBN 0-471-36093-7.
  4. Allman, Elizabeth S.; Matias, Catherine; Rhodes, John A. (2009). "Identifiability of Parameters in Latent Structure Models with Many Observed Variables". The Annals of Statistics. 37 (6A): 3099–3132. arXiv:0809.5032. doi:10.1214/09-AOS689.
  5. Rönnegård, Lars; Shen, Xia; Alam, Moudud (December 2010). "hglm: A Package for Fitting Hierarchical Generalized Linear Models". The R Journal. 2 (2).
  6. Kumar, Naveen; Mastrangelo, Christina; Montgomery, Doug (2011). "Hierarchical Modeling Using Generalized Linear Models". Quality and Reliability Engineering International.
  7. Shin, Chung Kwan; Park, Sang Chan (2000). "A machine learning approach to yield management in semiconductor manufacturing". International Journal of Production Research. 38 (17): 4261–4271. doi:10.1080/00207540050205073.
  8. Tasoluk, Burcu; Dröge, Cornelia; Calantone, Roger J. (2011). "Interpreting interrelations across multiple levels in HGLM models: An application in international marketing research". International Marketing Review. 28 (1): 34–56. doi:10.1108/02651331111107099.