Multilevel model

Multilevel models (also known as hierarchical linear models, linear mixed-effects models, mixed models, nested data models, random coefficient models, random-effects models, random parameter models, or split-plot designs) are statistical models of parameters that vary at more than one level. [1] An example could be a model of student performance that contains measures for individual students as well as measures for the classrooms within which the students are grouped. These models can be seen as generalizations of linear models (in particular, linear regression), although they can also extend to non-linear models. These models became much more popular after sufficient computing power and software became available. [1]

Multilevel models are particularly appropriate for research designs where data for participants are organized at more than one level (i.e., nested data). [2] The units of analysis are usually individuals (at a lower level) who are nested within contextual/aggregate units (at a higher level). [3] While the lowest level of data in multilevel models is usually an individual, repeated measurements of individuals may also be examined. [2] [4] As such, multilevel models provide an alternative type of analysis for univariate or multivariate analysis of repeated measures. Individual differences in growth curves may be examined. [2] Furthermore, multilevel models can be used as an alternative to ANCOVA, where scores on the dependent variable are adjusted for covariates (e.g. individual differences) before testing treatment differences. [5] Multilevel models are able to analyze these experiments without the assumption of homogeneity of regression slopes that is required by ANCOVA. [2]

Multilevel models can be used on data with many levels, although 2-level models are the most common and the rest of this article deals only with these. The dependent variable must be examined at the lowest level of analysis. [1]

Level 1 regression equation

When there is a single Level 1 independent variable, the Level 1 model is

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + e_{ij},$$

where $Y_{ij}$ is the score on the dependent variable for individual observation $i$ within group $j$, $X_{ij}$ is the Level 1 predictor, $\beta_{0j}$ is the intercept of the dependent variable in group $j$ (Level 2), $\beta_{1j}$ is the slope for the relationship in group $j$ between the Level 1 predictor and the dependent variable, and $e_{ij}$ is the random error of prediction for the Level 1 equation.

At Level 1, both the intercepts and slopes in the groups can be either fixed (meaning that all groups have the same values, although in the real world this would be a rare occurrence), non-randomly varying (meaning that the intercepts and/or slopes are predictable from an independent variable at Level 2), or randomly varying (meaning that the intercepts and/or slopes are different in the different groups, and that each have their own overall mean and variance). [2] [4]

When there are multiple level 1 independent variables, the model can be expanded by substituting vectors and matrices in the equation.
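As a sketch of this expansion, with $Q$ Level 1 predictors the model can be written as

$$Y_{ij} = \beta_{0j} + \sum_{q=1}^{Q} \beta_{qj} X_{qij} + e_{ij}, \qquad \text{or, in matrix form,} \qquad \mathbf{Y}_{j} = \mathbf{X}_{j}\boldsymbol{\beta}_{j} + \mathbf{e}_{j},$$

where $\mathbf{X}_{j}$ collects the predictor values for the observations in group $j$ (including a column of ones for the intercept) and $\boldsymbol{\beta}_{j}$ is the vector of group-specific coefficients.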

When the relationship between the response and a predictor cannot be described by a linear relationship, one may instead specify a nonlinear functional relationship between the response and the predictor, extending the model to a nonlinear mixed-effects model. For example, when the response $y_{ij}$ is the cumulative infection trajectory of the $i$-th country and $t_{ij}$ represents the $j$-th time point, then the ordered pairs $(t_{ij}, y_{ij})$ for each country may show a shape similar to a logistic function. [6] [7]
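As a sketch of one such functional form (the three-parameter logistic curve is an illustrative assumption, chosen because the cited applications involve growth curves of this general shape), the mean trajectory could be modeled as

$$f(t; \theta_1, \theta_2, \theta_3) = \frac{\theta_1}{1 + \exp\{-(t - \theta_2)/\theta_3\}},$$

where $\theta_1$ is the upper asymptote (e.g., the final epidemic size), $\theta_2$ is the time of steepest growth, and $\theta_3$ is a scale parameter governing the growth rate.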

Level 2 regression equation

At Level 2, the dependent variables are the intercepts and the slopes for the independent variables at Level 1 in the groups of Level 2:

$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j} + u_{0j},$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_{j} + u_{1j},$$

where $\gamma_{00}$ is the overall intercept (the grand mean of the scores on the dependent variable across all groups when all predictors equal 0), $W_{j}$ is the Level 2 predictor, $\gamma_{01}$ is the overall regression coefficient (slope) between the dependent variable and the Level 2 predictor, $u_{0j}$ is the random error component for the deviation of the intercept of group $j$ from the overall intercept, $\gamma_{10}$ is the overall regression coefficient (slope) between the dependent variable and the Level 1 predictor, and $u_{1j}$ is the error component for the deviation of the slope of group $j$ from the overall slope.
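Substituting the Level 2 equations into the Level 1 equation gives, as a sketch of the usual single-equation ("combined" or "mixed") form of the two-level model,

$$Y_{ij} = \gamma_{00} + \gamma_{10} X_{ij} + \gamma_{01} W_{j} + \gamma_{11} W_{j} X_{ij} + u_{0j} + u_{1j} X_{ij} + e_{ij},$$

in which the $\gamma$ terms form the fixed part of the model, the product $W_{j} X_{ij}$ carries the cross-level interaction, and $u_{0j} + u_{1j} X_{ij} + e_{ij}$ forms the random part.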

Types of models

Before conducting a multilevel model analysis, a researcher must decide on several aspects. First, the researcher must decide which predictors, if any, are to be included in the analysis. Second, the researcher must decide whether parameter values (i.e., the elements that will be estimated) will be fixed or random. [2] [5] [4] Fixed parameters are composed of a constant over all the groups, whereas a random parameter has a different value for each of the groups. [4] Additionally, the researcher must decide whether to employ maximum likelihood estimation or restricted maximum likelihood estimation. [2]

Random intercepts model

A random intercepts model is a model in which intercepts are allowed to vary, and therefore, the scores on the dependent variable for each individual observation are predicted by the intercept that varies across groups. [5] [8] [4] This model assumes that slopes are fixed (the same across different contexts). In addition, this model provides information about intraclass correlations, which are helpful in determining whether multilevel models are required in the first place. [2]
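As a sketch of how that intraclass correlation is computed in the random intercepts model: if $\tau_{00}$ denotes the variance of the random intercepts $u_{0j}$ and $\sigma^{2}$ denotes the Level 1 residual variance, the intraclass correlation is

$$\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}},$$

the proportion of the total outcome variance that lies between groups; values near zero suggest that an ordinary single-level regression may suffice.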

Random slopes model

A random slopes model is a model in which slopes are allowed to vary according to a correlation matrix, and therefore, the slopes are different across a grouping variable such as time or individuals. This model assumes that intercepts are fixed (the same across different contexts). [5]

Random intercepts and slopes model

A model that includes both random intercepts and random slopes is likely the most realistic type of model, although it is also the most complex. In this model, both intercepts and slopes are allowed to vary across groups, meaning that they are different in different contexts. [5]
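The random-effects structures described above can be fit with standard mixed-model software. Below is a minimal sketch using Python's statsmodels, assuming a hypothetical long-format dataset students.csv with columns score (outcome), hours (a Level 1 predictor), and school (the grouping variable):

```python
# Minimal sketch of random-intercepts vs. random-intercepts-and-slopes models
# with statsmodels MixedLM. File name and column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # columns: score, hours, school (hypothetical)

# Random intercepts model: intercept varies by school, slope for "hours" is fixed.
m_intercepts = smf.mixedlm("score ~ hours", df, groups=df["school"]).fit(reml=True)

# Random intercepts and slopes model: both intercept and "hours" slope vary by school.
m_both = smf.mixedlm("score ~ hours", df, groups=df["school"],
                     re_formula="~hours").fit(reml=True)

print(m_intercepts.summary())
print(m_both.summary())
```

A slopes-only structure (random slope with a common intercept) can be requested with re_formula="~0 + hours".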

Developing a multilevel model

In order to conduct a multilevel model analysis, one would start with fixed coefficients (slopes and intercepts). One aspect at a time would then be allowed to vary, and the resulting model would be compared with the previous model in order to assess model fit. [1] There are three different questions that a researcher would ask in assessing a model. First, is it a good model? Second, is a more complex model better? Third, what contribution do individual predictors make to the model?

In order to assess models, different model fit statistics would be examined. [2] One such statistic is the chi-square likelihood-ratio test, which assesses the difference between models. The likelihood-ratio test can be employed for model building in general, for examining what happens when effects in a model are allowed to vary, and when testing a dummy-coded categorical variable as a single effect. [2] However, the test can only be used when models are nested (meaning that a more complex model includes all of the effects of a simpler model). When testing non-nested models, comparisons between models can be made using the Akaike information criterion (AIC) or the Bayesian information criterion (BIC), among others. [1] [2] [5] See further Model selection.
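As a sketch of such a comparison (continuing the hypothetical statsmodels fits from the earlier example), a chi-square likelihood-ratio test between the nested random-intercepts and random-intercepts-and-slopes models could be computed as follows; models compared by likelihood ratio should be refit with full maximum likelihood rather than REML:

```python
# Minimal sketch of a likelihood-ratio test between nested mixed models.
# m_intercepts_ml is nested within m_both_ml (it fixes the slope variance at zero).
from scipy import stats
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # hypothetical columns: score, hours, school

m_intercepts_ml = smf.mixedlm("score ~ hours", df, groups=df["school"]).fit(reml=False)
m_both_ml = smf.mixedlm("score ~ hours", df, groups=df["school"],
                        re_formula="~hours").fit(reml=False)

lr_stat = 2 * (m_both_ml.llf - m_intercepts_ml.llf)   # chi-square statistic
df_diff = 2   # added parameters: slope variance and intercept-slope covariance
p_value = stats.chi2.sf(lr_stat, df_diff)   # conservative when a variance is tested on its boundary
print(f"LR = {lr_stat:.2f}, df = {df_diff}, p = {p_value:.4f}")
```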

Assumptions

Multilevel models have the same assumptions as other major general linear models (e.g., ANOVA, regression), but some of the assumptions are modified for the hierarchical nature of the design (i.e., nested data).

Linearity

The assumption of linearity states that there is a rectilinear (straight-line, as opposed to non-linear or U-shaped) relationship between variables. [9] However, the model can be extended to nonlinear relationships. [10] Particularly, when the mean part of the level 1 regression equation is replaced with a non-linear parametric function, then such a model framework is widely called the nonlinear mixed-effects model. [7]

Normality

The assumption of normality states that the error terms at every level of the model are normally distributed. [9] However, most statistical software allows one to specify different distributions for the variance terms, such as Poisson, binomial, or logistic distributions. The multilevel modelling approach can be used for all forms of generalized linear models.

Homoscedasticity

The assumption of homoscedasticity, also known as homogeneity of variance, assumes equality of population variances. [9] However, a different variance-correlation matrix can be specified to account for unequal variances, and the heterogeneity of variance can itself be modeled.

Independence of observations (No Autocorrelation of Model's Residuals)

Independence is an assumption of general linear models, which states that cases are random samples from the population and that scores on the dependent variable are independent of each other. [9] One of the main purposes of multilevel models is to deal with cases where the assumption of independence is violated; multilevel models do, however, assume that (1) the Level 1 and Level 2 residuals are uncorrelated and (2) the errors (as measured by the residuals) at the highest level are uncorrelated. [11]

Orthogonality of regressors to random effects

The regressors must not correlate with the random effects (the $u_{0j}$ and $u_{1j}$ terms above). This assumption is testable but often ignored, rendering the estimator inconsistent. [12] If this assumption is violated, the random effect must be modeled explicitly in the fixed part of the model, either by using dummy variables or by including cluster means of all regressors. [12] [13] [14] [15] This assumption is probably the most important assumption the estimator makes, but one that is misunderstood by most applied researchers using these types of models. [12]

Statistical tests

The type of statistical tests that are employed in multilevel models depends on whether one is examining fixed effects or variance components. When examining fixed effects, the estimate of a fixed effect is compared with its standard error, which results in a Z-test. [5] A t-test can also be computed. When computing a t-test, it is important to keep in mind the degrees of freedom, which will depend on the level of the predictor (e.g., level 1 predictor or level 2 predictor). [5] For a level 1 predictor, the degrees of freedom are based on the number of level 1 predictors, the number of groups and the number of individual observations. For a level 2 predictor, the degrees of freedom are based on the number of level 2 predictors and the number of groups. [5]
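As a sketch of the Z-test for a fixed effect (the numbers below are purely illustrative), the estimate is divided by its standard error and referred to a standard normal distribution:

```python
# Minimal sketch of a Z-test for a fixed-effect coefficient (illustrative numbers).
from scipy import stats

gamma_hat = 0.42   # hypothetical estimated fixed-effect coefficient
se_gamma = 0.15    # hypothetical standard error of that estimate

z = gamma_hat / se_gamma
p_value = 2 * stats.norm.sf(abs(z))   # two-sided p-value
print(f"Z = {z:.2f}, p = {p_value:.4f}")
```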


Statistical power

Statistical power for multilevel models differs depending on whether it is level 1 or level 2 effects that are being examined. Power for level 1 effects is dependent upon the number of individual observations, whereas the power for level 2 effects is dependent upon the number of groups. [16] To conduct research with sufficient power, large sample sizes are required in multilevel models. However, the number of individual observations in groups is not as important as the number of groups in a study. In order to detect cross-level interactions, given that the group sizes are not too small, recommendations have been made that at least 20 groups are needed, [16] although many fewer can be used if one is only interested in inference on the fixed effects and the random effects are control, or "nuisance", variables. [4] The issue of statistical power in multilevel models is complicated by the fact that power varies as a function of effect size and intraclass correlations, it differs for fixed effects versus random effects, and it changes depending on the number of groups and the number of individual observations per group. [16]

Applications

Level

The concept of level is the keystone of this approach. In an educational research example, the levels for a 2-level model might be

  1. pupil
  2. class

However, if one were studying multiple schools and multiple school districts, a 4-level model could include

  1. pupil
  2. class
  3. school
  4. district

The researcher must establish for each variable the level at which it was measured. In this example "test score" might be measured at pupil level, "teacher experience" at class level, "school funding" at school level, and "urban" at district level.
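As a sketch of how such nested data are typically laid out (column names are hypothetical), each row holds one pupil, and variables measured at higher levels are simply repeated within their groups:

```python
# Minimal sketch of long-format data for a 4-level design (hypothetical values).
import pandas as pd

df = pd.DataFrame({
    "pupil":              [1, 2, 3, 4],
    "class_id":           ["c1", "c1", "c2", "c2"],
    "school_id":          ["s1", "s1", "s1", "s2"],
    "district_id":        ["d1", "d1", "d1", "d2"],
    "test_score":         [72, 65, 81, 77],        # measured at pupil level
    "teacher_experience": [10, 10, 3, 3],          # measured at class level
    "school_funding":     [1.2, 1.2, 1.2, 0.9],    # measured at school level
    "urban":              [1, 1, 1, 0],            # measured at district level
})
print(df)
```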

Example

As a simple example, consider a basic linear regression model that predicts income as a function of age, class, gender and race. It might then be observed that income levels also vary depending on the city and state of residence. A simple way to incorporate this into the regression model would be to add an additional independent categorical variable to account for the location (i.e. a set of additional binary predictors and associated regression coefficients, one per location). This would have the effect of shifting the mean income up or down—but it would still assume, for example, that the effect of race and gender on income is the same everywhere. In reality, this is unlikely to be the case—different local laws, different retirement policies, differences in level of racial prejudice, etc. are likely to cause all of the predictors to have different sorts of effects in different locales.

In other words, a simple linear regression model might, for example, predict that a given randomly sampled person in Seattle would have an average yearly income $10,000 higher than a similar person in Mobile, Alabama. However, it would also predict, for example, that a white person might have an average income $7,000 above a black person, and a 65-year-old might have an income $3,000 below a 45-year-old, in both cases regardless of location. A multilevel model, however, would allow for different regression coefficients for each predictor in each location. Essentially, it would assume that people in a given location have correlated incomes generated by a single set of regression coefficients, whereas people in another location have incomes generated by a different set of coefficients. Meanwhile, the coefficients themselves are assumed to be correlated and generated from a single set of hyperparameters. Additional levels are possible: For example, people might be grouped by cities, and the city-level regression coefficients grouped by state, and the state-level coefficients generated from a single hyper-hyperparameter.
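A minimal sketch of this idea in Python's statsmodels, assuming a hypothetical dataset incomes.csv with columns income, age, gender, race, and city; here only the intercept and the age slope vary by city, while a fully multilevel treatment of every coefficient and of additional levels (such as state) would require a richer specification or a Bayesian hierarchical model:

```python
# Minimal sketch: income regression with a random intercept and random age slope per city.
import pandas as pd
import statsmodels.formula.api as smf

people = pd.read_csv("incomes.csv")  # hypothetical columns: income, age, gender, race, city

model = smf.mixedlm("income ~ age + C(gender) + C(race)",
                    people,
                    groups=people["city"],
                    re_formula="~age")   # intercept and age slope vary across cities
result = model.fit()
print(result.summary())
```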

Multilevel models are a subclass of hierarchical Bayesian models, which are general models with multiple levels of random variables and arbitrary relationships among the different variables. Multilevel analysis has been extended to include multilevel structural equation modeling, multilevel latent class modeling, and other more general models.

Uses

Multilevel models have been used in education research or geographical research, to estimate separately the variance between pupils within the same school, and the variance between schools. In psychological applications, the multiple levels are items in an instrument, individuals, and families. In sociological applications, multilevel models are used to examine individuals embedded within regions or countries. In organizational psychology research, data from individuals must often be nested within teams or other functional units. They are often used in ecological research as well under the more general term mixed models. [4]

Different covariates may be relevant at different levels. Multilevel models can also be used for longitudinal studies, as with growth studies, to separate changes within one individual from differences between individuals.

Cross-level interactions may also be of substantive interest; for example, when a slope is allowed to vary randomly, a level-2 predictor may be included in the slope formula for the level-1 covariate. For example, one may estimate the interaction of race and neighborhood to obtain an estimate of the interaction between an individual's characteristics and the social context.

Applications to longitudinal (repeated measures) data

One application of multilevel modeling is the analysis of repeated measures data: the repeated measurements are nested within individuals, and the model is most often used to describe change over time (growth curve modeling), although it can also be applied to repeated measures data in which time is not a factor.

Alternative ways of analyzing hierarchical data

There are several alternative ways of analyzing hierarchical data, although most of them have some problems. First, traditional statistical techniques can be used. One could disaggregate higher-order variables to the individual level and conduct the analysis at that individual level (for example, assign class-level variables to each individual). The problem with this approach is that it would violate the assumption of independence and thus could bias the results. This is known as the atomistic fallacy. [17] Another way to analyze the data using traditional statistical approaches is to aggregate individual-level variables to higher-order variables and then conduct an analysis at this higher level. The problem with this approach is that it discards all within-group information (because it takes the average of the individual-level variables). As much as 80–90% of the variance could be wasted, and the relationship between aggregated variables is inflated and thus distorted. [18] This is known as the ecological fallacy, and statistically, this type of analysis results in decreased power in addition to the loss of information. [2]

Another way to analyze hierarchical data would be through a random-coefficients model. This model assumes that each group has a different regression model, with its own intercept and slope. [5] Because groups are sampled, the model assumes that the intercepts and slopes are also randomly sampled from a population of group intercepts and slopes. This allows for an analysis in which one can assume that slopes are fixed but intercepts are allowed to vary. [5] However, this presents a problem: the individual components are independent, but the group components are independent between groups yet dependent within groups. This also allows for an analysis in which the slopes are random; however, the correlations of the error terms (disturbances) then depend on the values of the individual-level variables. [5] Thus, the problem with using a random-coefficients model to analyze hierarchical data is that it is still not possible to incorporate higher-order variables.

Error terms

Multilevel models have two error terms, which are also known as disturbances. The individual components are all independent, but there are also group components, which are independent between groups but correlated within groups. However, variance components can differ, as some groups are more homogeneous than others. [18]

Bayesian nonlinear mixed-effects model

Bayesian research cycle using Bayesian nonlinear mixed-effects model: (a) standard research cycle and (b) Bayesian-specific workflow.

Multilevel modeling is frequently used in diverse applications, and it can be formulated within the Bayesian framework. In particular, Bayesian nonlinear mixed-effects models have recently received significant attention. A basic version of the Bayesian nonlinear mixed-effects model is represented as the following three-stage hierarchy:

Stage 1: Individual-Level Model

$$y_{ij} = f(t_{ij}; \theta_{1i}, \ldots, \theta_{Ki}) + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0, \sigma^2), \qquad i = 1, \ldots, N, \; j = 1, \ldots, M_i.$$

Stage 2: Population Model

$$\theta_{li} = \alpha_l + \sum_{b=1}^{P} \beta_{lb} x_{ib} + \eta_{li}, \qquad \eta_{li} \sim N(0, \omega_l^2), \qquad i = 1, \ldots, N, \; l = 1, \ldots, K.$$

Stage 3: Prior

$$\sigma^2 \sim \pi(\sigma^2), \qquad \alpha_l \sim \pi(\alpha_l), \qquad (\beta_{l1}, \ldots, \beta_{lP}) \sim \pi(\beta_{l1}, \ldots, \beta_{lP}), \qquad \omega_l^2 \sim \pi(\omega_l^2), \qquad l = 1, \ldots, K.$$

Here, $y_{ij}$ denotes the continuous response of the $i$-th subject at the time point $t_{ij}$, and $x_{ib}$ is the $b$-th covariate of the $i$-th subject. Parameters involved in the model are written in Greek letters. $f(t; \theta_{1}, \ldots, \theta_{K})$ is a known function parameterized by the $K$-dimensional vector $(\theta_{1}, \ldots, \theta_{K})$. Typically, $f$ is a nonlinear function that describes the temporal trajectory of individuals. In the model, $\epsilon_{ij}$ and $\eta_{li}$ describe within-individual variability and between-individual variability, respectively. If Stage 3: Prior is not considered, then the model reduces to a frequentist nonlinear mixed-effects model.
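A minimal sketch of such a three-stage model, written with PyMC and a logistic individual-level trajectory; the data, priors, and array shapes are illustrative assumptions rather than the specification used in the cited works:

```python
# Minimal sketch of a Bayesian nonlinear mixed-effects model (logistic trajectories).
import numpy as np
import pymc as pm

# Hypothetical balanced data: N subjects observed at M common time points.
N, M = 20, 15
t = np.tile(np.linspace(0.0, 30.0, M), (N, 1))                            # t[i, j]: time points
rng = np.random.default_rng(0)
y = rng.normal(loc=50.0 / (1.0 + np.exp(-(t - 15.0) / 3.0)), scale=2.0)   # y[i, j]: responses

with pm.Model() as model:
    # Stage 2 (population model): subject-level parameters vary around population means.
    mu_theta = pm.Normal("mu_theta", mu=np.array([50.0, 15.0, 3.0]), sigma=10.0, shape=3)
    omega = pm.HalfNormal("omega", sigma=5.0, shape=3)                    # between-individual SDs
    theta = pm.Normal("theta", mu=mu_theta, sigma=omega, shape=(N, 3))

    # Stage 1 (individual-level model): logistic mean trajectory plus Gaussian noise.
    f = theta[:, 0:1] / (1.0 + pm.math.exp(-(t - theta[:, 1:2]) / theta[:, 2:3]))
    sigma = pm.HalfNormal("sigma", sigma=5.0)                             # within-individual SD
    pm.Normal("y_obs", mu=f, sigma=sigma, observed=y)

    # Stage 3 (prior) is expressed by the distributions placed on mu_theta, omega, and sigma.
    idata = pm.sample(1000, tune=1000, target_accept=0.9)
```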

A central task in the application of Bayesian nonlinear mixed-effects models is to evaluate the posterior density

$$\pi(\{\theta_{li}\}, \sigma^2, \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\} \mid y) \;\propto\; \pi(y \mid \{\theta_{li}\}, \sigma^2)\, \pi(\{\theta_{li}\} \mid \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\})\, \pi(\sigma^2, \{\alpha_l\}, \{\beta_{lb}\}, \{\omega_l^2\}),$$

whose three factors correspond to the individual-level model, the population model, and the prior, respectively.

The figure above summarizes the Bayesian research cycle using the Bayesian nonlinear mixed-effects model. [19] A research cycle using the Bayesian nonlinear mixed-effects model comprises two steps: (a) a standard research cycle and (b) a Bayesian-specific workflow. The standard research cycle involves reviewing the literature, defining a problem, and specifying the research question and hypothesis. The Bayesian-specific workflow comprises three sub-steps: (b)–(i) formalizing prior distributions based on background knowledge and prior elicitation; (b)–(ii) determining the likelihood function based on a nonlinear function $f$; and (b)–(iii) making a posterior inference. The resulting posterior inference can be used to start a new research cycle.


References

  1. Raudenbush, Stephen W.; Bryk, Anthony S. (2002). Hierarchical Linear Models: Applications and Data Analysis Methods (2nd ed.). Thousand Oaks, CA: Sage Publications. ISBN 978-0-7619-1904-9.
  2. Tabachnick, Barbara G.; Fidell, Linda S. (2007). Using Multivariate Statistics (5th ed.). Boston: Pearson/Allyn & Bacon. ISBN 978-0-205-45938-4.
  3. Luke, Douglas A. (2004). Multilevel Modeling. Thousand Oaks, CA: Sage. ISBN 978-0-7619-2879-9.
  4. Gomes, Dylan G. E. (20 January 2022). "Should I use fixed effects or random effects when I have fewer than five levels of a grouping factor in a mixed-effects model?". PeerJ. 10: e12794. doi:10.7717/peerj.12794. PMC 8784019. PMID 35116198.
  5. Cohen, Jacob (2003). Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences (3rd ed.). Mahwah, NJ: Erlbaum. ISBN 978-0-8058-2223-6.
  6. Lee, Se Yoon; Lei, Bowen; Mallick, Bani (2020). "Estimation of COVID-19 spread curves integrating global data and borrowing information". PLOS ONE. 15 (7): e0236860. arXiv:2005.00662. Bibcode:2020PLoSO..1536860L. doi:10.1371/journal.pone.0236860. PMC 7390340. PMID 32726361.
  7. Lee, Se Yoon; Mallick, Bani (2021). "Bayesian Hierarchical Modeling: Application Towards Production Results in the Eagle Ford Shale of South Texas". Sankhya B. 84: 1–43. doi:10.1007/s13571-020-00245-8. S2CID 234027590.
  8. Garson, G. David, ed. (10 April 2012). Hierarchical Linear Modeling: Guide and Applications. Thousand Oaks, CA: Sage Publications. ISBN 978-1-4129-9885-7.
  9. Green, Samuel B.; Salkind, Neil J. (2004). Using SPSS for Windows and Macintosh: Analyzing and Understanding Data (4th ed.). Upper Saddle River, NJ: Pearson Education. ISBN 978-0-13-146597-8.
  10. Goldstein, Harvey (1991). "Nonlinear Multilevel Models, with an Application to Discrete Response Data". Biometrika. 78 (1): 45–51. doi:10.1093/biomet/78.1.45. JSTOR 2336894.
  11. ATS Statistical Consulting Group. "Introduction to Multilevel Modeling Using HLM 6" (PDF). Archived from the original (PDF) on 31 December 2010.
  12. Antonakis, John; Bastardoz, Nicolas; Rönkkö, Mikko (2021). "On Ignoring the Random Effects Assumption in Multilevel Models: Review, Critique, and Recommendations" (PDF). Organizational Research Methods. 24 (2): 443–483. doi:10.1177/1094428119877457. ISSN 1094-4281. S2CID 210355362.
  13. McNeish, Daniel; Kelley, Ken (2019). "Fixed effects models versus mixed effects models for clustered data: Reviewing the approaches, disentangling the differences, and making recommendations". Psychological Methods. 24 (1): 20–35. doi:10.1037/met0000182. ISSN 1939-1463. PMID 29863377. S2CID 44145669.
  14. Bliese, Paul D.; Schepker, Donald J.; Essman, Spenser M.; Ployhart, Robert E. (2020). "Bridging Methodological Divides Between Macro- and Microresearch: Endogeneity and Methods for Panel Data". Journal of Management. 46 (1): 70–99. doi:10.1177/0149206319868016. ISSN 0149-2063. S2CID 202288849.
  15. Wooldridge, Jeffrey M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). Cambridge, MA: MIT Press. ISBN 978-0-262-29679-3.
  16. Kreft, Ita; de Leeuw, Jan (1998). Introducing Multilevel Modeling. London: Sage Publications. ISBN 978-0-7619-5141-4.
  17. Hox, Joop (2002). Multilevel Analysis: Techniques and Applications. Mahwah, NJ: Erlbaum. ISBN 978-0-8058-3219-8.
  18. Bryk, Anthony S.; Raudenbush, Stephen W. (1988). "Heterogeneity of variance in experimental studies: A challenge to conventional interpretations". Psychological Bulletin. 104 (3): 396–404. doi:10.1037/0033-2909.104.3.396.
  19. Lee, Se Yoon (2022). "Bayesian Nonlinear Models for Repeated Measurement Data: An Overview, Implementation, and Applications". Mathematics. 10 (6): 898. arXiv:2201.12430. doi:10.3390/math10060898.

Further reading