One application of multilevel modeling (MLM) is the analysis of repeated measures data. Multilevel modeling for repeated measures data is most often discussed in the context of modeling change over time (i.e. growth curve modeling for longitudinal designs); however, it may also be used for repeated measures data in which time is not a factor. [1]
In multilevel modeling, an overall change function (e.g. linear, quadratic, cubic) is fitted to the whole sample and, just as in multilevel modeling for clustered data, the slope and intercept may be allowed to vary. For example, in a study of income growth with age, income might be assumed to increase linearly over time for every individual. However, the exact intercept and slope could be allowed to vary across individuals (i.e. defined as random coefficients).
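As a sketch of this idea, a two-level linear growth model with a random intercept and a random slope is commonly written as below. The notation is an assumption of this example rather than something defined in the article: $Y_{ti}$ is the outcome for individual $i$ at occasion $t$, $T_{ti}$ is the coded time variable, $\beta_{00}$ and $\beta_{10}$ are the average intercept and slope, and $u_{0i}$, $u_{1i}$ are individual deviations from those averages.

$$
\begin{aligned}
\text{Level 1 (occasions):}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,T_{ti} + e_{ti}, \qquad e_{ti} \sim N(0, \sigma^2) \\
\text{Level 2 (individuals):}\quad & \pi_{0i} = \beta_{00} + u_{0i} \\
& \pi_{1i} = \beta_{10} + u_{1i}, \qquad
\begin{pmatrix} u_{0i} \\ u_{1i} \end{pmatrix} \sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right)
\end{aligned}
$$

Under this assumed notation, $\tau_{00}$ and $\tau_{11}$ describe how much intercepts and slopes vary across individuals, and $\tau_{01}$ is their covariance.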
Multilevel modeling with repeated measures employs the same statistical techniques as MLM with clustered data. In multilevel modeling for repeated measures data, the measurement occasions are nested within cases (e.g. individual or subject). Thus, level-1 units consist of the repeated measures for each subject, and the level-2 unit is the individual or subject. In addition to estimating overall parameters, MLM allows regression equations at the level of the individual. Thus, as a growth curve modeling technique, it allows the estimation of inter-individual differences in intra-individual change over time by modeling the variances and covariances. [2] In other words, it allows the testing of individual differences in patterns of responses over time (i.e. growth curves). This characteristic makes multilevel modeling preferable, for certain research questions, to other repeated measures techniques such as repeated measures analysis of variance (RM-ANOVA).
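The notion of a regression equation per individual can be illustrated with a deliberately simplified sketch. The code below fits a separate ordinary least-squares line to each subject and then averages the coefficients. This is not a multilevel estimator: a real MLM estimates the average coefficients and the variance components jointly (for example by maximum likelihood). The scores, the helper `fit_line`, and all variable names are illustrative assumptions.

```python
# Illustrative scores for four subjects measured at time points 0, 1, 2.
times = [0, 1, 2]
scores = {1: [12, 8, 4], 2: [11, 7, 6], 3: [15, 12, 10], 4: [11, 10, 9]}


def fit_line(t, y):
    """Return (intercept, slope) of the least-squares line through (t, y)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


# Level-1 idea: one regression equation per subject.
per_subject = {subject: fit_line(times, y) for subject, y in scores.items()}

# Level-2 idea (crudely): summarize how the coefficients vary across subjects.
intercepts = [b0 for b0, _ in per_subject.values()]
slopes = [b1 for _, b1 in per_subject.values()]
mean_intercept = sum(intercepts) / len(intercepts)
mean_slope = sum(slopes) / len(slopes)

for subject, (b0, b1) in per_subject.items():
    print(f"subject {subject}: intercept={b0:.2f}, slope={b1:.2f}")
print(f"average intercept={mean_intercept:.2f}, average slope={mean_slope:.2f}")
```

The spread of the per-subject slopes around their average is the kind of inter-individual difference in change that a multilevel model captures through its variance and covariance parameters.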
The assumptions of MLM that hold for clustered data also apply to repeated measures:
One assumption of using MLM for growth curve modeling is that all subjects show the same form of relationship over time (e.g. linear, quadratic). Another is that the observed changes are related to the passage of time. [4]
Mathematically, multilevel analysis with repeated measures is very similar to the analysis of data in which subjects are clustered in groups. However, one point to note is that time-related predictors must be explicitly entered into the model to evaluate trend analyses and to obtain an overall test of the repeated measure. Furthermore, interpretation of these analyses is dependent on the scale of the time variable (i.e. how it is coded).
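The dependence on time coding can be seen in a small sketch. It fits the same straight line to one subject's scores twice, once with time coded 0, 1, 2 and once with time centered at -1, 0, 1. The data and the helper `fit_line` are illustrative assumptions, and plain least squares stands in for a full multilevel fit.

```python
def fit_line(t, y):
    """Return (intercept, slope) of the least-squares line through (t, y)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


y = [12, 8, 4]  # illustrative scores at three occasions

# Time coded from zero: the intercept is the predicted score at the first occasion.
b0_raw, b1_raw = fit_line([0, 1, 2], y)

# Time centered: the intercept is the predicted score at the middle occasion.
b0_ctr, b1_ctr = fit_line([-1, 0, 1], y)

print(b0_raw, b1_raw)
print(b0_ctr, b1_ctr)
```

The slope is the same under both codings, but the intercept refers to a different occasion, so any statement about the intercept (and, in a multilevel model, its variance and its covariance with the slope) has to be read in light of how time was coded.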
Repeated measures analysis of variance (RM-ANOVA) has been traditionally used for analysis of repeated measures designs. However, violation of the assumptions of RM-ANOVA can be problematic. Multilevel modeling (MLM) is commonly used for repeated measures designs because it presents an alternative approach to analyzing this type of data with three main advantages over RM-ANOVA: [5]
An alternative method of growth curve analysis is latent growth curve modeling using structural equation modeling (SEM). This approach will provide the same estimates as the multilevel modeling approach, provided that the model is specified identically in SEM. However, there are circumstances in which one of MLM or SEM is preferable: [4] [6]
The distinction between multilevel modeling and latent growth curve analysis has become less defined. Some statistical programs incorporate multilevel features within their structural equation modeling software, and some multilevel modeling software is beginning to add latent growth curve features.
Multilevel modeling with repeated measures data is computationally complex. Computer software capable of performing these analyses may require data to be represented in “long form” as opposed to “wide form” prior to analysis. In long form, each subject’s data is represented in several rows, one for every “time” point (observation of the dependent variable). In wide form, by contrast, there is one row per subject, and the repeated measures are represented in separate columns. Note also that, in long form, time-invariant variables are repeated across rows for each subject. See below for an example of wide form data transposed into long form:
Wide form:
Subject | Group | Time0 | Time1 | Time2 |
---|---|---|---|---|
1 | 1 | 12 | 8 | 4 |
2 | 1 | 11 | 7 | 6 |
3 | 2 | 15 | 12 | 10 |
4 | 2 | 11 | 10 | 9 |
Long form:
Subject | Group | Time | DepVar |
---|---|---|---|
1 | 1 | 0 | 12 |
1 | 1 | 1 | 8 |
1 | 1 | 2 | 4 |
... | ... | ... | ... |
4 | 2 | 0 | 11 |
4 | 2 | 1 | 10 |
4 | 2 | 2 | 9 |
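The reshaping shown in the two tables above can be sketched in a few lines of plain Python. The function `wide_to_long` and its parameter names are hypothetical, written for this illustration. In practice, statistical packages and data-frame libraries provide their own reshaping utilities.

```python
# Wide form: one row per subject, one column per measurement occasion.
wide = [
    {"Subject": 1, "Group": 1, "Time0": 12, "Time1": 8, "Time2": 4},
    {"Subject": 2, "Group": 1, "Time0": 11, "Time1": 7, "Time2": 6},
    {"Subject": 3, "Group": 2, "Time0": 15, "Time1": 12, "Time2": 10},
    {"Subject": 4, "Group": 2, "Time0": 11, "Time1": 10, "Time2": 9},
]


def wide_to_long(rows, id_cols=("Subject", "Group"), prefix="Time",
                 value_name="DepVar"):
    """Turn each 'Time<k>' column into its own row with Time = k."""
    long_rows = []
    for row in rows:
        # Collect the repeated-measure columns and order them by occasion.
        time_cols = sorted(
            (col for col in row if col.startswith(prefix)),
            key=lambda col: int(col[len(prefix):]),
        )
        for col in time_cols:
            # Time-invariant variables are copied onto every row.
            record = {c: row[c] for c in id_cols}
            record["Time"] = int(col[len(prefix):])
            record[value_name] = row[col]
            long_rows.append(record)
    return long_rows


long_form = wide_to_long(wide)
for record in long_form[:3]:
    print(record)
```

Each subject contributes three rows to `long_form`, and the time-invariant `Group` value is repeated on every one of them, as in the long form table.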