22–25 Jul 2025
Atlantic/Canary timezone

A Machine Learning-Based Workflow for Model Evaluation and Revision in SEM

24 Jul 2025, 11:00
15m

Description

Despite the popularity of structural equation modeling (SEM), investigating the fit of SEM models remains challenging, especially when the global model fit evaluation implies non-negligible misfit and researchers need to further investigate the type and severity of the misspecification in their model. Overwhelmed by poorly fitting models, researchers sometimes strain the interpretation of their global model test (e.g., the χ² test or model fit indices such as the CFI and the RMSEA, in combination with cutoff values) and attest acceptable model fit even though they would be well advised to reject or revise their model. To counteract this questionable research practice, we developed a method that guides researchers through a more thorough process of model fit evaluation and, if necessary, revision.
Building on a proof-of-concept study in which we showed that a pre-trained machine learning (ML) model can detect misfit in multifactorial measurement models with high accuracy, we developed an automated ML-based workflow for SEM evaluation and revision. This workflow involves several ML models that we trained on up to 173 model and data features extracted from more than 1 million simulated data sets and multifactorial models fitted by means of confirmatory factor analysis. In the first step of the workflow, the researcher's model is classified as either (a) correctly specified or as misspecified by neglecting (b) a factor, (c) factor correlations, (d) cross-loadings, or (e) residual correlations. For classes a–c, our recommendations are, in summary: (a) accept the model; (b) reject the model and revise the underlying theory or operationalization; (c) free the factor correlations, if willing to lift orthogonality constraints, or revise the model by including method factor(s). For classes d and e, the workflow proceeds to a second step that determines the number of cross-loadings or residual correlations. Depending on the severity of the misspecification, we recommend the following. In case of a mild misspecification, researchers might freely estimate the parameter(s) concerned, scrutinize their operationalization to understand the misspecification, and cross-validate the revised model on new data. In case of a moderate misspecification, researchers might revise their operationalization. In case of a severe misspecification, researchers might reject the model and revise the underlying theory.
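The two-step routing described above can be sketched in code. This is only an illustrative sketch of the decision logic as summarized in this abstract, not the authors' implementation: the trained ML models are not reproduced, all names (`MisfitClass`, `Severity`, `recommend`) are hypothetical, and the severity level is assumed to be supplied by the caller, since the abstract does not state how the number of cross-loadings or residual correlations maps onto mild, moderate, or severe.

```python
from enum import Enum
from typing import Optional


class MisfitClass(Enum):
    """Step-one classes (a)-(e); the names are hypothetical labels."""
    CORRECT = "a"
    NEGLECTED_FACTOR = "b"
    NEGLECTED_FACTOR_CORRELATIONS = "c"
    NEGLECTED_CROSS_LOADINGS = "d"
    NEGLECTED_RESIDUAL_CORRELATIONS = "e"


class Severity(Enum):
    """Step-two severity levels (assumed to be given by the caller)."""
    MILD = "mild"
    MODERATE = "moderate"
    SEVERE = "severe"


# Recommendations for classes a-c, paraphrased from the abstract.
STEP_ONE_ADVICE = {
    MisfitClass.CORRECT: "accept the model",
    MisfitClass.NEGLECTED_FACTOR: (
        "reject the model and revise the underlying theory "
        "or operationalization"
    ),
    MisfitClass.NEGLECTED_FACTOR_CORRELATIONS: (
        "free the factor correlations or revise the model by "
        "including method factor(s)"
    ),
}

# Recommendations for classes d-e, which depend on severity.
SEVERITY_ADVICE = {
    Severity.MILD: (
        "freely estimate the parameter(s) concerned, scrutinize the "
        "operationalization, and cross-validate on new data"
    ),
    Severity.MODERATE: "revise the operationalization",
    Severity.SEVERE: "reject the model and revise the underlying theory",
}


def recommend(misfit_class: MisfitClass,
              severity: Optional[Severity] = None) -> str:
    """Return the summarized recommendation for a classified model."""
    # Classes a-c are resolved after the first step.
    if misfit_class in STEP_ONE_ADVICE:
        return STEP_ONE_ADVICE[misfit_class]
    # Classes d-e need the outcome of the second step.
    if severity is None:
        raise ValueError("classes d and e require a severity level")
    return SEVERITY_ADVICE[severity]


print(recommend(MisfitClass.CORRECT))
print(recommend(MisfitClass.NEGLECTED_CROSS_LOADINGS, Severity.MODERATE))
```

Keeping the recommendations in lookup tables rather than nested conditionals makes the split between the two steps explicit: only classes d and e ever consult the severity table.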
While this ML-based workflow for SEM evaluation and revision is not without limitations (e.g., it cannot identify a mix of misspecifications, and so far it applies only to multifactorial measurement models), it provides applied researchers with unprecedented guidance in the complex, often iterative process of measurement and theory development, thereby hopefully encouraging them to face up to model misfit instead of neglecting it.
Keywords: Structural Equation Modeling (SEM), Latent Measurement Models, Model Misspecifications, Model Fit Evaluation, Model Revision, Machine Learning

Primary authors

David Goretzko
Melanie Viola Partsch (Utrecht University)
