Partial least squares path modeling

Partial least squares path modeling, also called partial least squares structural equation modeling (PLS-PM, PLS-SEM), [1] [2] [3] is a method for structural equation modeling that allows estimation of complex cause-effect relationships in path models with latent variables.

Overview

PLS-PM [4] [5] is a component-based estimation approach that differs from covariance-based structural equation modeling. Unlike covariance-based approaches, PLS-PM does not fit a common factor model to the data; rather, it fits a composite model. [6] [7] In doing so, it maximizes the amount of variance explained (though what this means from a statistical point of view is unclear, and PLS-PM users do not agree on how this goal might be achieved).

In addition, with an adjustment, PLS-PM can consistently estimate certain parameters of common factor models through an approach called consistent PLS-PM (PLSc-PM). [8] A further related development is factor-based PLS-PM (PLSF), a variation of which employs PLSc-PM as a basis for estimating the factors in common factor models; this method significantly increases the number of common factor model parameters that can be estimated, effectively bridging the gap between classic PLS-PM and covariance-based structural equation modeling. [9]
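The adjustment behind consistent PLS-PM can be illustrated with the classical correction for attenuation, in which a correlation between two composite scores is divided by the square root of the product of their reliabilities. The sketch below assumes the reliabilities are already available (the function name and arguments are hypothetical, and estimating the reliabilities themselves is not shown):

```python
import math


def disattenuate(r_composites, reliability_a, reliability_b):
    """Correct a correlation between two composite scores for attenuation.

    r_composites  : correlation between the two composite scores
    reliability_* : reliability of each composite, assumed to lie in (0, 1]
    """
    for rel in (reliability_a, reliability_b):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliabilities must lie in (0, 1]")
    return r_composites / math.sqrt(reliability_a * reliability_b)


# Hypothetical inputs: composites correlate at 0.4, both reliabilities are 0.8.
corrected = disattenuate(0.4, 0.8, 0.8)
print(round(corrected, 3))
```

Because reliabilities are at most 1, the corrected correlation is never smaller in magnitude than the composite correlation.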

The PLS-PM structural equation model is composed of two sub-models: the measurement models and the structural model. The measurement models represent the relationships between the observed data and the latent variables. The structural model represents the relationships between the latent variables.
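In symbols, the two sub-models can be sketched as follows. The notation is an assumption introduced here for illustration, not taken from the cited sources: x_{jk} is the k-th manifest variable of latent variable j, w_{jk} its weight, \beta_{ji} a path coefficient, and \zeta_j a residual.

```latex
% Measurement side (composite view): the score of latent variable j
% is a weighted sum of its K_j manifest variables
\hat{\eta}_j = \sum_{k=1}^{K_j} w_{jk}\, x_{jk}

% Structural model: latent variable j is a linear function of the
% latent variables i that point to it in the path model
\eta_j = \sum_{i \to j} \beta_{ji}\, \eta_i + \zeta_j
```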

An iterative algorithm solves the structural equation model by estimating the latent variables using the measurement and structural models in alternating steps, hence the word "partial" in the procedure's name. The measurement model estimates each latent variable as a weighted sum of its manifest variables. The structural model estimates each latent variable by simple or multiple linear regression on the other latent variables, using the scores produced by the measurement model. The algorithm repeats until convergence is achieved.
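The alternating steps can be sketched for the smallest possible model: two latent variables joined by one path. The sketch makes assumptions the text does not fix. Weights are updated as covariances between indicators and the inner proxy (often called Mode A), and the inner step uses the neighbouring latent variable's score signed by the correlation, which is what a regression-based inner step reduces to after standardization when there are only two latent variables. All function names and the simulated data are hypothetical.

```python
import math
import random


def standardize(values):
    """Rescale a list to mean 0 and population variance 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def dot_mean(a, b):
    """Mean cross-product; equals the correlation for standardized inputs."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


def weighted_scores(block, weights):
    """Measurement step: standardized weighted sum of a block's indicators."""
    n = len(block[0])
    raw = [sum(w * col[i] for w, col in zip(weights, block)) for i in range(n)]
    return standardize(raw)


def pls_two_blocks(block_x, block_y, tol=1e-8, max_iter=300):
    """Estimate one path X -> Y; each block is a list of indicator columns."""
    blocks = [[standardize(c) for c in b] for b in (block_x, block_y)]
    weights = [[1.0] * len(b) for b in blocks]  # start from equal weights
    for _ in range(max_iter):
        scores = [weighted_scores(b, w) for b, w in zip(blocks, weights)]
        # Structural step: proxy for each latent variable is its neighbour.
        sign = 1.0 if dot_mean(scores[0], scores[1]) >= 0 else -1.0
        proxies = [[sign * v for v in scores[1]],
                   [sign * v for v in scores[0]]]
        # Weight update: covariance of each indicator with its block's proxy.
        new_weights = [[dot_mean(col, p) for col in b]
                       for b, p in zip(blocks, proxies)]
        change = max(abs(a - b) for old, new in zip(weights, new_weights)
                     for a, b in zip(old, new))
        weights = new_weights
        if change < tol:  # convergence: weights have stopped moving
            break
    scores = [weighted_scores(b, w) for b, w in zip(blocks, weights)]
    # With standardized scores, the simple-regression slope is the correlation.
    path = dot_mean(scores[0], scores[1])
    return weights, scores, path


# Simulated data (hypothetical values): true latent correlation of 0.6.
random.seed(0)
n = 500
xi = [random.gauss(0, 1) for _ in range(n)]
eta = [0.6 * v + 0.8 * random.gauss(0, 1) for v in xi]
block_x = [[v + 0.5 * random.gauss(0, 1) for v in xi] for _ in range(3)]
block_y = [[v + 0.5 * random.gauss(0, 1) for v in eta] for _ in range(3)]
weights, scores, path = pls_two_blocks(block_x, block_y)
print(round(path, 3))
```

Pure Python is used to keep the sketch dependency-free; models with more latent variables need a multiple regression in the structural step, which is not shown here.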

PLS-PM is viewed critically by several methodological researchers. [10] [11] A major point of contention has been the claim that PLS-PM can always be used with very small sample sizes. [12] One study suggests that this claim is generally unjustified and proposes two methods for minimum sample size estimation in PLS-PM. [13] [14] Another point of contention is the ad hoc way in which PLS-PM has been developed and the lack of analytic proofs to support its main feature: the sampling distribution of PLS-PM weights. However, PLS-PM is still considered preferable to covariance-based structural equation modeling when it is unknown whether the data are common factor-based or composite-based. [15]
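One of the two minimum sample size methods, the inverse square root method, is commonly summarized as requiring a sample size larger than (z / |β|min)², where |β|min is the smallest path coefficient expected to be significant and z is the sum of the standard normal quantiles for the chosen significance level and power. The sketch below assumes that form, with a one-tailed 5% level and 80% power as defaults (z of about 2.486); the function name is hypothetical and the cited paper should be checked for the exact derivation.

```python
import math
from statistics import NormalDist


def min_sample_size_inverse_sqrt(beta_min, alpha=0.05, power=0.80):
    """Smallest integer n with n >= (z / |beta_min|)**2.

    beta_min : smallest path coefficient expected to be significant
    alpha    : one-tailed significance level (assumed default)
    power    : desired statistical power (assumed default)
    """
    if beta_min == 0:
        raise ValueError("beta_min must be non-zero")
    normal = NormalDist()
    z = normal.inv_cdf(1 - alpha) + normal.inv_cdf(power)
    return math.ceil((z / abs(beta_min)) ** 2)


# Hypothetical input: the weakest path of interest is 0.2.
print(min_sample_size_inverse_sqrt(0.2))
```

The required sample size grows quickly as the smallest path coefficient shrinks, which is consistent with the criticism of using PLS-PM with very small samples.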

References

  1. Hair, J.F.; Hult, G.T.M.; Ringle, C.M.; Sarstedt, M. (2017). A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM) (2 ed.). Thousand Oaks, CA: Sage. ISBN 9781483377445.
  2. Vinzi, V.E.; Trinchera, L.; Amato, S. (2010). Handbook of Partial Least Squares. Springer Berlin Heidelberg.
  3. Hair, J.F.; Sarstedt, M.; Ringle, C.M.; Gudergan, S.P. (2018). Advanced Issues in Partial Least Squares Structural Equation Modeling (PLS-SEM). Thousand Oaks, CA: Sage. ISBN 9781483377391.
  4. Wold, H. O. A. (1982). "Soft Modeling: The Basic Design and Some Extensions". In Jöreskog, K. G.; Wold, H. O. A. (eds.). Systems Under Indirect Observations: Part II. Amsterdam: North-Holland. pp. 1–54. ISBN 0-444-86301-X.
  5. Lohmöller, J.-B. (1989). Latent Variable Path Modeling with Partial Least Squares. Heidelberg: Physica. ISBN 3-7908-0437-1.
  6. Henseler, Jörg; Dijkstra, Theo K.; Sarstedt, Marko; Ringle, Christian M.; Diamantopoulos, Adamantios; Straub, Detmar W.; Ketchen, David J.; Hair, Joseph F.; Hult, G. Tomas M. (2014-04-10). "Common Beliefs and Reality About PLS". Organizational Research Methods. 17 (2): 182–209. doi:10.1177/1094428114526928. hdl:10362/117915.
  7. Rigdon, E. E.; Sarstedt, M.; Ringle, C. M. (2017). "On Comparing Results from CB-SEM and PLS-SEM: Five Perspectives and Five Recommendations". Marketing ZFP. 39 (3): 4–16. doi:10.15358/0344-1369-2017-3-4.
  8. Dijkstra, Theo K.; Henseler, Jörg (2015-01-01). "Consistent and asymptotically normal PLS-PM estimators for linear structural equations". Computational Statistics & Data Analysis. 81: 10–23. doi:10.1016/j.csda.2014.07.008.
  9. Kock, N. (2019). "From composites to factors: Bridging the gap between PLS and covariance-based structural equation modeling". Information Systems Journal. 29 (3): 674–706.
  10. Rönkkö, M.; McIntosh, C.N.; Antonakis, J.; Edwards, J.R. (2016). "Partial least squares path modeling: Time for some serious second thoughts". Journal of Operations Management. 47–48: 9–27. doi:10.1016/j.jom.2016.05.002.
  11. Goodhue, D. L.; Lewis, W.; Thompson, R. (2012). "Does PLS have advantages for small sample size or non-normal data?". MIS Quarterly: 981–1001.
  12. Kock, N.; Hadaya, P. (2018). "Minimum sample size estimation in PLS-SEM: The inverse square root and gamma-exponential methods". Information Systems Journal. 28 (1): 227–261.
  13. Kock, N.; Hadaya, P. (2018). "Minimum sample size estimation in PLS-SEM: The inverse square root and gamma-exponential methods". Information Systems Journal. 28 (1): 227–261.
  14. Sarstedt, Marko; Cheah, Jun-Hwa (2019-06-27). "Partial least squares structural equation modeling using SmartPLS: a software review" (PDF). Journal of Marketing Analytics. 7 (3): 196–202. doi:10.1057/s41270-019-00058-3. ISSN 2050-3318. S2CID 198334897.
  15. Sarstedt, M.; Hair, J.F.; Ringle, C.M.; Thiele, K.O.; Gudergan, S.P. (2016). "Estimation issues with PLS and CBSEM: Where the bias lies!". Journal of Business Research. 69 (10): 3998–4010. doi:10.1016/j.jbusres.2016.06.007. hdl:11420/1817.