Most statistical inference methods were established under the assumption that the fitted model is known in advance. In practice, however, researchers often obtain their final model through some data-driven selection process. This selection process makes the final fitted model random and alters the sampling distribution of the estimator. Naively applying classical inference methods after selection can therefore lead to incorrect conclusions, which is arguably a prime source of the reproducibility crisis in psychological science. The present study adapts three valid, state-of-the-art postselection inference methods for structural equation modeling (SEM) from the statistical literature: data splitting (DS), postselection inference (PoSI), and the polyhedral (PH) method. A simulation study compares the three methods with the commonly used naive procedure under selection events generated by L1-penalized SEM. The results show that the naive method often yields incorrect inference, whereas the valid methods control the coverage rate in most cases, each with its own pros and cons. Real-world data examples illustrate the practical use of the valid inference methods.
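Of the three methods, data splitting is the simplest to convey: one half of the data is used to select the model, and inference is carried out on the other, untouched half, so classical intervals remain valid conditional on the selection. The sketch below illustrates this idea on a toy regression problem; it is not the paper's SEM implementation, and the correlation screen stands in for the L1-penalized selection step (all variable names and the threshold are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 200 observations, 5 candidate predictors;
# only the first two truly affect y (a toy stand-in for an SEM).
n, p = 200, 5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.5, 0.0, 0.0, 0.0])
y = X @ beta + rng.normal(size=n)

# Data splitting: one half for selection, the other half for inference.
half = n // 2
X_sel, y_sel = X[:half], y[:half]
X_inf, y_inf = X[half:], y[half:]

# Selection stage: a simple absolute-correlation screen, used here
# as an illustrative stand-in for L1-penalized model selection.
corr = np.abs([np.corrcoef(X_sel[:, j], y_sel)[0, 1] for j in range(p)])
selected = np.where(corr > 0.15)[0]

# Inference stage: ordinary least squares on the held-out half,
# restricted to the selected predictors. Because the selection never
# saw these data, classical normal-theory intervals remain valid.
Xs = X_inf[:, selected]
coef, *_ = np.linalg.lstsq(Xs, y_inf, rcond=None)
resid = y_inf - Xs @ coef
dof = len(y_inf) - len(selected)
sigma2 = resid @ resid / dof
se = np.sqrt(sigma2 * np.diag(np.linalg.inv(Xs.T @ Xs)))
z = 1.96  # approximate 95% normal quantile
for j, b, s in zip(selected, coef, se):
    print(f"x{j}: estimate {b:.3f}, 95% CI [{b - z*s:.3f}, {b + z*s:.3f}]")
```

The cost of this validity, as the simulation comparison in the paper weighs against the other methods, is that only half of the data contributes to each stage, which widens the resulting intervals.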