lessSEM (lessSEM estimates sparse SEM) is an R package for regularized structural equation modeling (regularized SEM) with non-smooth penalty functions (e.g., lasso) building on lavaan. lessSEM is heavily inspired by the regsem and lslx packages, which offer similar functionality. If you use lessSEM, please also cite regsem and lslx!
The objectives of lessSEM are to provide …
The following penalty functions are currently implemented in lessSEM:
The column “penalty” refers to the name of the function call in the lessSEM package (e.g., lasso is called with the lasso() function).
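To make "non-smooth penalty" concrete, the following base R sketch writes out the textbook lasso and scad penalty values for a single parameter. The function names lassoPenalty and scadPenalty are hypothetical and not part of lessSEM, and the scaling that lessSEM applies internally may differ from these textbook forms.

```r
# Textbook penalty definitions (assumption: lessSEM may scale these differently).
# Both penalties have a kink at zero, which is what makes them non-smooth
# and allows parameters to be set to exactly zero.
lassoPenalty <- function(x, lambda) {
  lambda * abs(x)
}

scadPenalty <- function(x, lambda, theta = 3.7) {
  absX <- abs(x)
  ifelse(
    absX <= lambda,
    # small parameters: identical to the lasso
    lambda * absX,
    ifelse(
      absX <= theta * lambda,
      # medium parameters: quadratic transition
      (-absX^2 + 2 * theta * lambda * absX - lambda^2) / (2 * (theta - 1)),
      # large parameters: constant penalty, no further shrinkage
      (theta + 1) * lambda^2 / 2
    )
  )
}

x <- c(-2, -0.5, 0, 0.5, 2)
lassoPenalty(x, lambda = 0.5)
scadPenalty(x, lambda = 0.5)
```

Unlike the lasso, the scad penalty stops growing once a parameter is large, so large parameters are not shrunk further.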
The best model can be selected with the AIC or BIC. If you want to use cross-validation, use cvLasso, cvAdaptiveLasso, etc. instead (see, e.g., ?lessSEM::cvLasso).
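As a rough illustration of selection by information criteria, the base R sketch below picks the tuning parameter with the smallest AIC or BIC. The helper selectLambda is hypothetical, the numbers are made up for illustration, and counting only non-zero parameters as the model's degrees of freedom is an assumption rather than a statement about lessSEM's internals.

```r
# Hypothetical helper: choose the lambda with the smallest information criterion.
# m2LL:     -2 log-likelihood of the model at each lambda
# nNonZero: number of non-zero parameters at each lambda (assumed df)
# N:        sample size
selectLambda <- function(lambdas, m2LL, nNonZero, N,
                         criterion = c("AIC", "BIC")) {
  criterion <- match.arg(criterion)
  penaltyPerParameter <- if (criterion == "AIC") 2 else log(N)
  ic <- m2LL + penaltyPerParameter * nNonZero
  list(lambda = lambdas[which.min(ic)], values = ic)
}

# Made-up values, for illustration only:
lambdas  <- c(0, 0.1, 0.2)
m2LL     <- c(100, 101, 110)
nNonZero <- c(5, 4, 3)

selectLambda(lambdas, m2LL, nNonZero, N = 50, criterion = "AIC")
selectLambda(lambdas, m2LL, nNonZero, N = 50, criterion = "BIC")
```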
The packages regsem, lslx, and lessSEM can all be used to regularize basic SEM. In fact, as outlined above, lessSEM is heavily inspired by regsem and lslx. However, the packages differ in their targets: The objective of lessSEM is not to replace the more mature packages regsem and lslx. Instead, our objective is to provide method developers with a flexible framework for regularized SEM. The following shows an incomplete comparison of some features implemented in the three packages:
Feature | regsem | lslx | lessSEM |
---|---|---|---|
Model specification | based on lavaan | similar to lavaan | based on lavaan |
Maximum likelihood estimation | Yes | Yes | Yes |
Least squares estimation | No | Yes | Dev. |
Categorical variables | No | Yes | No |
Confidence Intervals | No | Yes | No |
Missing Data | FIML | Auxiliary Variables | FIML |
Multi-group models | No | Yes | Yes |
Stability selection | Yes | No | Dev. |
Mixed penalties | No | No | Yes |
Equality constraints | Yes | No | Yes |
Parameter transformations | diff_lasso | No | Yes |
Definition variables | No | No | Yes |
Warning: Dev. refers to features that are supported but still under development and may have bugs. Use with caution!
If you want to install lessSEM from CRAN, use the following commands in R:
install.packages("lessSEM")
The newest version of the package can be installed from GitHub using the following commands in R:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("jhorzek/lessSEM",
ref = "development")
Note: The lessSEM project has multiple branches. The main branch will match the version currently available from CRAN. The development branch will have newer features not yet available from CRAN. This branch will have passed all current tests of our test suite, but may not be ready for CRAN yet (e.g., because not all objectives of the road map have been met). gh-pages is used to create the documentation website. Finally, all other branches are used for ongoing development and should be considered unstable.
Please visit the lessSEM website for the latest documentation. You will also find a short introduction to regularized SEM in vignette('lessSEM', package = 'lessSEM') and the documentation of the individual functions (e.g., see ?lessSEM::scad). Finally, you will find templates for a selection of models that can be used with lessSEM (e.g., the cross-lagged panel model) in the package lessTemplates.
library(lessSEM)
library(lavaan)
# Like regsem, lessSEM builds on the lavaan
# package for model specification. The first step
# therefore is to implement the model in lavaan.
dataset <- simulateExampleData()
lavaanSyntax <- "
f =~ l1*y1 + l2*y2 + l3*y3 + l4*y4 + l5*y5 +
l6*y6 + l7*y7 + l8*y8 + l9*y9 + l10*y10 +
l11*y11 + l12*y12 + l13*y13 + l14*y14 + l15*y15
f ~~ 1*f
"
lavaanModel <- lavaan::sem(lavaanSyntax,
data = dataset,
meanstructure = TRUE,
std.lv = TRUE)
# Optional: Plot the model
# if(!require("semPlot")) install.packages("semPlot")
# semPlot::semPaths(lavaanModel,
# what = "est",
# fade = FALSE)
lsem <- lasso(
# pass the fitted lavaan model
lavaanModel = lavaanModel,
# names of the regularized parameters:
regularized = c("l6", "l7", "l8", "l9", "l10",
"l11", "l12", "l13", "l14", "l15"),
# in case of lasso and adaptive lasso, we can specify the number of lambda
# values to use. lessSEM will automatically find lambda_max and fit
# models for nLambdas values between 0 and lambda_max. For the other
# penalty functions, lambdas must be specified explicitly
nLambdas = 50)
# use the plot-function to plot the regularized parameters:
plot(lsem)
# use the coef-function to show the estimates
coef(lsem)
# the best parameters can be extracted with:
coef(lsem, criterion = "AIC")
coef(lsem, criterion = "BIC")
# if you just want the estimates, use estimates():
estimates(lsem, criterion = "AIC")
# elements of lsem can be accessed with the @ operator:
lsem@parameters[1,]
# AIC and BIC for all tuning parameter configurations:
AIC(lsem)
BIC(lsem)
# cross-validation
cv <- cvLasso(lavaanModel = lavaanModel,
regularized = c("l6", "l7", "l8", "l9", "l10",
"l11", "l12", "l13", "l14", "l15"),
lambdas = seq(0,1,.1),
standardize = TRUE)
# get best model according to cross-validation:
coef(cv)
#### Advanced ####
# Switching the optimizer:
# Use the "method" argument to switch the optimizer. The control argument
# must also be changed to the corresponding function:
lsemIsta <- lasso(
lavaanModel = lavaanModel,
regularized = paste0("l", 6:15),
nLambdas = 50,
method = "ista",
control = controlIsta(
# Here, we can also specify that we want to use multiple cores:
nCores = 2))
# Note: The results are basically identical:
lsemIsta@parameters - lsem@parameters
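For readers unfamiliar with the "ista" method, the following base R sketch applies the textbook iterative shrinkage-thresholding algorithm to a lasso-penalized linear regression. It is not lessSEM's implementation: the function names softThreshold and istaLasso are hypothetical, the objective is a simple least squares fit rather than a SEM likelihood, and a fixed step size is used for brevity.

```r
# Soft-thresholding: the proximal operator of the lasso penalty.
softThreshold <- function(x, threshold) {
  sign(x) * pmax(abs(x) - threshold, 0)
}

# Minimizes (0.5 / N) * sum((y - X %*% b)^2) + lambda * sum(abs(b)).
istaLasso <- function(X, y, lambda, nIter = 500) {
  N <- nrow(X)
  # Largest eigenvalue of X'X / N bounds the curvature of the smooth part;
  # its inverse is a safe fixed step size.
  stepBound <- max(eigen(crossprod(X) / N, symmetric = TRUE,
                         only.values = TRUE)$values)
  b <- rep(0, ncol(X))
  for (i in seq_len(nIter)) {
    # gradient of the smooth least squares part
    gradient <- -drop(crossprod(X, y - X %*% b)) / N
    # gradient step followed by soft-thresholding
    b <- softThreshold(b - gradient / stepBound, lambda / stepBound)
  }
  b
}

# Simulated data, for illustration only:
set.seed(1)
X <- matrix(rnorm(200), nrow = 50, ncol = 4)
y <- drop(X %*% c(1, 0, 0, -1)) + rnorm(50, sd = 0.1)

istaLasso(X, y, lambda = 0.1)
```

Each iteration takes a gradient step on the smooth fit function and then shrinks the result towards zero, which is how exact zeros arise.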
lessSEM allows for parameter transformations that could, for instance, be used to test measurement invariance in longitudinal models (e.g., Liang, 2018; Bauer et al., 2020). A thorough introduction is provided in vignette('Parameter-transformations', package = 'lessSEM').
As an example, we will test measurement invariance in the PoliticalDemocracy data set.
library(lessSEM)
library(lavaan)
# we will use the PoliticalDemocracy from lavaan (see ?lavaan::sem)
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3
# assuming different loadings for different time points:
dem60 =~ y1 + a1*y2 + b1*y3 + c1*y4
dem65 =~ y5 + a2*y6 + b2*y7 + c2*y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
fit <- sem(model, data = PoliticalDemocracy)
# We will define a transformation which regularizes differences
# between loadings over time:
transformations <- "
// which parameters do we want to use?
parameters: a1, a2, b1, b2, c1, c2, delta_a2, delta_b2, delta_c2
// transformations:
a2 = a1 + delta_a2;
b2 = b1 + delta_b2;
c2 = c1 + delta_c2;
"
# setting delta_a2, delta_b2, or delta_c2 to zero implies measurement invariance
# for the respective parameters (a1, b1, c1)
lassoFit <- lasso(lavaanModel = fit,
# we want to regularize the differences between the parameters
regularized = c("delta_a2", "delta_b2", "delta_c2"),
nLambdas = 100,
# Our model modification must make use of the modifyModel - function:
modifyModel = modifyModel(transformations = transformations)
)
Finally, we can extract the best parameters:
coef(lassoFit, criterion = "BIC")
As all differences (delta_a2, delta_b2, and delta_c2) have been zeroed, we can assume measurement invariance.
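With many labeled parameters, writing the transformation string by hand becomes tedious. The base R sketch below builds a string in the same format as the one used in this example. The helper makeDeltaTransformations is hypothetical and not part of lessSEM, and it assumes that labels follow the stem-plus-time-point pattern used here (a1, a2, ...).

```r
# Hypothetical helper: build "target = base + delta_target;" transformations
# for parameter labels of the form <stem><time point>.
makeDeltaTransformations <- function(stems, from = 1, to = 2) {
  base   <- paste0(stems, from)
  target <- paste0(stems, to)
  delta  <- paste0("delta_", target)
  paste0(
    # declare all parameters that appear in the transformations
    "parameters: ",
    paste(c(rbind(base, target), delta), collapse = ", "), "\n",
    # one transformation per stem
    paste0(target, " = ", base, " + ", delta, ";", collapse = "\n"),
    "\n"
  )
}

transformations <- makeDeltaTransformations(c("a", "b", "c"))
cat(transformations)
```

The names of the delta parameters can then be collected with paste0("delta_", c("a2", "b2", "c2")) when specifying which parameters to regularize.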
The following features are relatively new and you may still experience some bugs. Please be aware of that when using these features.
lessSEM supports exporting specific models to lavaan. This can be very useful when plotting the final model.
lavaanModel <- lessSEM2Lavaan(regularizedSEM = lsem,
criterion = "BIC")
The result can be plotted with, for instance, semPlot:
library(semPlot)
semPaths(lavaanModel,
what = "est",
fade = FALSE)
lessSEM supports multi-group SEM and, to some degree, definition variables. Regularized multi-group SEM have been proposed by Huang (2018) and are implemented in lslx (Huang, 2020). Here, differences between groups are regularized. A detailed introduction can be found in vignette(topic = "Definition-Variables-and-Multi-Group-SEM", package = "lessSEM"). Therein it is also explained how the multi-group SEM can be used to implement definition variables (e.g., for latent growth curve models).
lessSEM allows for defining different penalties for different parts of the model. This feature is new and very experimental. Please keep that in mind when using the procedure. A detailed introduction can be found in vignette(topic = "Mixed-Penalties", package = "lessSEM").
To provide a short example, we will regularize the loadings and the regression parameters of the Political Democracy data set with different penalties. The following script is adapted from ?lavaan::sem.
model <- '
# latent variable definitions
ind60 =~ x1 + x2 + x3 + c2*y2 + c3*y3 + c4*y4
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + c*y8
# regressions
dem60 ~ r1*ind60
dem65 ~ r2*ind60 + r3*dem60
'
lavaanModel <- sem(model,
data = PoliticalDemocracy)
# Let's add a lasso penalty on the cross-loadings c2 - c4 and
# scad penalty on the regressions r1-r3
fitMp <- lavaanModel |>
mixedPenalty() |>
addLasso(regularized = c("c2", "c3", "c4"),
lambdas = seq(0,1,.1)) |>
addScad(regularized = c("r1", "r2", "r3"),
lambdas = seq(0,1,.2),
thetas = 3.7) |>
fit()
The best model according to the BIC can be extracted with:
coef(fitMp, criterion = "BIC")
Currently, lessSEM has the following optimizers:
These optimizers are implemented based on the regCtsem package. Most importantly, all optimizers in lessSEM are available for other packages. There are four ways to implement them which are documented in vignette("General-Purpose-Optimization", package = "lessSEM").
In short, these are:
gpLasso, gpScad, …). More information and examples can be found in the documentation of these functions (e.g., ?lessSEM::gpLasso, ?lessSEM::gpAdaptiveLasso, ?lessSEM::gpElasticNet). The interface is similar to the optim optimizers in R.
gpLassoCpp, gpScadCpp, … (e.g., ?lessSEM::gpLassoCpp)
vignette("The-optimizer-interface", package = "lessSEM").
THE SOFTWARE IS PROVIDED ‘AS IS’, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.