In this vignette, we discuss how to use multilevelcoda
to specify multilevel models where compositional data are used as
predictors.
The following table lists the packages used and briefly describes their purpose.
Package | Purpose |
---|---|
multilevelcoda | calculate between and within composition variables, calculate substitutions and plots |
brms | fit Bayesian multilevel models using Stan as a backend |
bayestestR | compute Bayes factors used to compare models |
doFuture | parallel processing to speed up run times |
library(multilevelcoda)
library(brms)
library(bayestestR)
library(doFuture)
options(digits = 3) # reduce number of digits shown
For the examples, we make use of three built-in datasets:
Dataset | Purpose |
---|---|
mcompd | compositional sleep and wake variables and additional predictors/outcomes (simulated) |
sbp | a pre-specified sequential binary partition, used in calculating compositional predictors |
psub | all possible pairwise substitutions between compositional variables, used for substitution analyses |
The following table shows a few rows of data from mcompd.
ID | Time | Stress | TST | WAKE | MVPA | LPA | SB | Age | Female |
---|---|---|---|---|---|---|---|---|---|
185 | 1 | 3.67 | 542 | 99.0 | 297.4 | 460 | 41.4 | 29.7 | 0 |
185 | 2 | 7.21 | 458 | 49.4 | 117.3 | 653 | 162.3 | 29.7 | 0 |
185 | 3 | 2.84 | 271 | 41.1 | 488.7 | 625 | 14.5 | 29.7 | 0 |
184 | 12 | 2.36 | 286 | 52.7 | 106.9 | 906 | 89.2 | 22.3 | 1 |
184 | 13 | 1.18 | 281 | 18.8 | 403.0 | 611 | 126.3 | 22.3 | 1 |
184 | 14 | 0.00 | 397 | 26.5 | 39.9 | 587 | 389.8 | 22.3 | 1 |
The following table shows the sequential binary partition being used
in sbp. Columns correspond to the composition variables
(TST, WAKE, MVPA, LPA, SB). Rows correspond to distinct ILR
coordinates.
TST | WAKE | MVPA | LPA | SB |
---|---|---|---|---|
1 | 1 | -1 | -1 | -1 |
1 | -1 | 0 | 0 | 0 |
0 | 0 | 1 | -1 | -1 |
0 | 0 | 0 | 1 | -1 |
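To make the link between the partition and the coordinates concrete, the sketch below applies the standard balance formula to one observation: for a row with r parts coded +1 and s parts coded -1, the coordinate is sqrt(r * s / (r + s)) times the log of the ratio of the geometric means of the two groups. It uses base R only. The names sbp_mat, x and ilr_coord are hypothetical and introduced for illustration; the package relies on its own routines, which may differ in detail.

```r
# Sequential binary partition copied from the table above
# (rows = ILR coordinates, columns = composition parts).
sbp_mat <- matrix(c(
   1,  1, -1, -1, -1,
   1, -1,  0,  0,  0,
   0,  0,  1, -1, -1,
   0,  0,  0,  1, -1),
  ncol = 5, byrow = TRUE,
  dimnames = list(NULL, c("TST", "WAKE", "MVPA", "LPA", "SB")))

# One observation in minutes (first row of the mcompd excerpt).
x <- c(TST = 542, WAKE = 99.0, MVPA = 297.4, LPA = 460, SB = 41.4)

# Hypothetical helper: balance (ILR coordinate) for a single SBP row.
ilr_coord <- function(x, row) {
  r <- sum(row == 1)   # number of parts in the numerator group
  s <- sum(row == -1)  # number of parts in the denominator group
  # mean(log(.)) is the log of the geometric mean of each group
  sqrt(r * s / (r + s)) * (mean(log(x[row == 1])) - mean(log(x[row == -1])))
}

# One coordinate per SBP row.
ilr <- apply(sbp_mat, 1, ilr_coord, x = x)
print(round(ilr, 3))
```

Because each coordinate is a log-ratio, rescaling x (for example from minutes to proportions) leaves the result unchanged.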
The following table shows how all possible pairwise substitution contrasts are set up in psub. Each row is one substitution: time is taken from the variable coded -1 and added to the variable coded +1.
TST | WAKE | MVPA | LPA | SB |
---|---|---|---|---|
1 | -1 | 0 | 0 | 0 |
1 | 0 | -1 | 0 | 0 |
1 | 0 | 0 | -1 | 0 |
1 | 0 | 0 | 0 | -1 |
-1 | 1 | 0 | 0 | 0 |
0 | 1 | -1 | 0 | 0 |
0 | 1 | 0 | -1 | 0 |
0 | 1 | 0 | 0 | -1 |
-1 | 0 | 1 | 0 | 0 |
0 | -1 | 1 | 0 | 0 |
0 | 0 | 1 | -1 | 0 |
0 | 0 | 1 | 0 | -1 |
-1 | 0 | 0 | 1 | 0 |
0 | -1 | 0 | 1 | 0 |
0 | 0 | -1 | 1 | 0 |
0 | 0 | 0 | 1 | -1 |
-1 | 0 | 0 | 0 | 1 |
0 | -1 | 0 | 0 | 1 |
0 | 0 | -1 | 0 | 1 |
0 | 0 | 0 | -1 | 1 |
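As an illustration of how such a contrast matrix can be built and applied, the following base R sketch generates all ordered pairs of parts in the same row order as the table and then reallocates 30 minutes using the first row. The names sub_mat, pairs, delta and x_new are hypothetical, and the 30-minute amount is an arbitrary choice for the example; the package ships the matrix ready-made as psub.

```r
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")

# All ordered (from, to) pairs with from != to. expand.grid() varies the
# first column fastest, which reproduces the row order of the table above.
pairs <- expand.grid(from = seq_along(parts), to = seq_along(parts))
pairs <- pairs[pairs$from != pairs$to, ]

# One row per substitution: +1 for the part gaining time, -1 for the part losing it.
sub_mat <- matrix(0, nrow = nrow(pairs), ncol = length(parts),
                  dimnames = list(NULL, parts))
sub_mat[cbind(seq_len(nrow(pairs)), pairs$to)]   <- 1
sub_mat[cbind(seq_len(nrow(pairs)), pairs$from)] <- -1

# Apply the first contrast (TST +1, WAKE -1) to one observation in minutes.
x     <- c(TST = 542, WAKE = 99.0, MVPA = 297.4, LPA = 460, SB = 41.4)
delta <- 30                          # minutes to reallocate (assumed value)
x_new <- x + delta * sub_mat[1, ]    # total time stays the same

print(x_new)
```

Every row sums to zero, so a substitution changes how time is distributed across the parts without changing the total.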
Compositional data are often expressed as a set of isometric log
ratio (ILR) coordinates in regression models. We can use the
compilr() function to calculate both between- and
within-level ILR coordinates for use in subsequent models as
predictors.
Note: compilr() also calculates total ILR
coordinates, which can be used as outcomes (or predictors) in models when the
decomposition into between- and within-level ILR coordinates is not
desired.
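The idea behind the three sets of coordinates can be sketched in base R on the six rows of mcompd shown earlier. The sketch assumes that the between-person composition is each person's compositional (geometric) mean and that the within-person composition is the deviation from it; because ILR coordinates are linear in the log parts, this corresponds to person means and deviations of the total ILR coordinates. The helper ilr_rows and the object names are hypothetical, and this is not presented as the internal implementation of compilr().

```r
# Six rows copied from the mcompd excerpt (two participants, three days each).
d <- data.frame(
  ID   = c(185, 185, 185, 184, 184, 184),
  TST  = c(542, 458, 271, 286, 281, 397),
  WAKE = c(99.0, 49.4, 41.1, 52.7, 18.8, 26.5),
  MVPA = c(297.4, 117.3, 488.7, 106.9, 403.0, 39.9),
  LPA  = c(460, 653, 625, 906, 611, 587),
  SB   = c(41.4, 162.3, 14.5, 89.2, 126.3, 389.8)
)
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")

# Sequential binary partition from the sbp table.
sbp_mat <- matrix(c(
   1,  1, -1, -1, -1,
   1, -1,  0,  0,  0,
   0,  0,  1, -1, -1,
   0,  0,  0,  1, -1),
  ncol = 5, byrow = TRUE, dimnames = list(NULL, parts))

# Hypothetical helper: ILR coordinates (columns) for every observation (rows).
ilr_rows <- function(comp, sbp) {
  logx <- log(as.matrix(comp))
  sapply(seq_len(nrow(sbp)), function(j) {
    pos <- sbp[j, ] == 1
    neg <- sbp[j, ] == -1
    r <- sum(pos)
    s <- sum(neg)
    sqrt(r * s / (r + s)) *
      (rowMeans(logx[, pos, drop = FALSE]) - rowMeans(logx[, neg, drop = FALSE]))
  })
}

total_ilr   <- ilr_rows(d[, parts], sbp_mat)                    # observation level
between_ilr <- apply(total_ilr, 2, function(z) ave(z, d$ID))    # person means
within_ilr  <- total_ilr - between_ilr                          # daily deviations

print(round(cbind(ID = d$ID, within_ilr), 3))
```

By construction, the within-level coordinates average to zero for each participant, and the between- and within-level coordinates add back up to the total coordinates.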
The compilr() function for multilevel data requires four
arguments and accepts an optional fifth (total):
Argument | Description |
---|---|
data | A long data set containing all variables needed to fit the multilevel models, including the repeated-measures compositional predictors and outcomes, along with any additional covariates. |
sbp | A Sequential Binary Partition to calculate \(ilr\) coordinates. |
parts | The names of the compositional components in data. |
idvar | The grouping factor in data used to compute the between-person and within-person composition and \(ilr\) coordinates. |
total | Optional argument to specify the amount to which the compositions should be closed. |
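Putting the arguments in the table together, a call might look like the sketch below. It uses the built-in mcompd and sbp datasets and the five part names from the data excerpt. The value total = 1440 is an assumption (minutes in a day, which the rows of the excerpt approximately sum to), and the object name cilr is arbitrary. The call is guarded so that it is skipped when the installed version of multilevelcoda does not export compilr().

```r
# Names of the compositional parts, as they appear in mcompd.
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")

# Guard: only run if multilevelcoda is installed and exports compilr().
have_compilr <- requireNamespace("multilevelcoda", quietly = TRUE) &&
  "compilr" %in% getNamespaceExports("multilevelcoda")

if (have_compilr) {
  library(multilevelcoda)
  data(list = c("mcompd", "sbp"), package = "multilevelcoda")

  cilr <- compilr(
    data  = mcompd,   # long-format data with repeated measures
    sbp   = sbp,      # sequential binary partition
    parts = parts,    # compositional parts in data
    idvar = "ID",     # grouping factor for between/within decomposition
    total = 1440      # assumed closure: minutes per day
  )

  # Inspect the top-level structure of the returned object.
  str(cilr, max.level = 1)
}
```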
We now will use output from the c