In this vignette, we discuss how to use multilevelcoda to specify multilevel models where compositional data are used as predictors.

The following table lists the packages used, with a brief description of what each is for.

| Package | Purpose |
|---|---|
| multilevelcoda | calculate between and within composition variables, calculate substitutions and plots |
| brms | fit Bayesian multilevel models using Stan as a backend |
| bayestestR | compute Bayes factors used to compare models |
| doFuture | parallel processing to speed up run times |

library(multilevelcoda)
library(brms)
library(bayestestR)
library(doFuture)

options(digits = 3) # reduce number of digits shown

For the examples, we use three built-in datasets:

| Dataset | Purpose |
|---|---|
| mcompd | compositional sleep and wake variables and additional predictors/outcomes (simulated) |
| sbp | a pre-specified sequential binary partition, used in calculating compositional predictors |
| psub | all possible pairwise substitutions between compositional variables, used for substitution analyses |

data("mcompd") 
data("sbp")
data("psub")

The following table shows a few rows of data from mcompd.

ID Time Stress TST WAKE MVPA LPA SB Age Female
185 1 3.67 542 99.0 297.4 460 41.4 29.7 0
185 2 7.21 458 49.4 117.3 653 162.3 29.7 0
185 3 2.84 271 41.1 488.7 625 14.5 29.7 0
184 12 2.36 286 52.7 106.9 906 89.2 22.3 1
184 13 1.18 281 18.8 403.0 611 126.3 22.3 1
184 14 0.00 397 26.5 39.9 587 389.8 22.3 1

The following table shows the sequential binary partition being used in sbp. Columns correspond to the composition variables (TST, WAKE, MVPA, LPA, SB). Rows correspond to distinct ILR coordinates.

TST WAKE MVPA LPA SB
1 1 -1 -1 -1
1 -1 0 0 0
0 0 1 -1 -1
0 0 0 1 -1
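
To make the link between the partition and the coordinates concrete, here is a minimal base R sketch of the standard balance construction: for a row with r parts coded +1 and s parts coded -1, the coordinate is sqrt(r s / (r + s)) times the log of the ratio of the geometric means of the two groups. The helper name sbp_to_contrast() is hypothetical, the partition is typed in by hand from the table above, and the composition is the first row of mcompd shown earlier. This illustrates the general technique and is not a description of how multilevelcoda computes the coordinates internally.

```r
# Sequential binary partition, typed in from the table above
sbp_demo <- matrix(c(
  1,  1, -1, -1, -1,
  1, -1,  0,  0,  0,
  0,  0,  1, -1, -1,
  0,  0,  0,  1, -1),
  ncol = 5, byrow = TRUE,
  dimnames = list(NULL, c("TST", "WAKE", "MVPA", "LPA", "SB")))

# Hypothetical helper: turn one SBP row into an orthonormal log-contrast
sbp_to_contrast <- function(row) {
  r <- sum(row == 1)           # number of parts in the +1 group
  s <- sum(row == -1)          # number of parts in the -1 group
  k <- sqrt(r * s / (r + s))   # normalising constant
  ifelse(row == 1, k / r, ifelse(row == -1, -k / s, 0))
}

# One contrast per SBP row (4 x 5 matrix)
V <- t(apply(sbp_demo, 1, sbp_to_contrast))

# First row of mcompd as shown above (minutes per day)
x <- c(TST = 542, WAKE = 99.0, MVPA = 297.4, LPA = 460, SB = 41.4)

# ILR coordinates: each is a contrast applied to the log parts
ilr <- as.vector(V %*% log(x))
ilr
```

Because the second row of the partition contrasts only TST against WAKE, the second coordinate reduces to sqrt(1/2) * log(TST / WAKE).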

The following table shows how all possible pairwise substitution contrasts are set up. A time substitution takes time from the variable coded -1 and adds it to the variable coded +1.

TST WAKE MVPA LPA SB
1 -1 0 0 0
1 0 -1 0 0
1 0 0 -1 0
1 0 0 0 -1
-1 1 0 0 0
0 1 -1 0 0
0 1 0 -1 0
0 1 0 0 -1
-1 0 1 0 0
0 -1 1 0 0
0 0 1 -1 0
0 0 1 0 -1
-1 0 0 1 0
0 -1 0 1 0
0 0 -1 1 0
0 0 0 1 -1
-1 0 0 0 1
0 -1 0 0 1
0 0 -1 0 1
0 0 0 -1 1
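
As a rough illustration of how such contrasts can be built and applied, the base R sketch below regenerates the 20 rows in the same order as the table and then reallocates time according to the first row. The object names (subs, delta) and the 30-minute amount are assumptions chosen for the example; the package's own substitution functions are not used here.

```r
parts <- c("TST", "WAKE", "MVPA", "LPA", "SB")

# Build every ordered pair: "to" receives time (+1), "from" gives time (-1)
rows <- list()
for (to in parts) {
  for (from in setdiff(parts, to)) {
    contrast <- setNames(numeric(length(parts)), parts)
    contrast[to] <- 1
    contrast[from] <- -1
    rows[[length(rows) + 1]] <- contrast
  }
}
subs <- do.call(rbind, rows)  # 20 x 5 matrix of substitution contrasts

# First row of mcompd as shown above (minutes per day)
x <- c(TST = 542, WAKE = 99.0, MVPA = 297.4, LPA = 460, SB = 41.4)

# Hypothetical reallocation: move 30 minutes using the first contrast
# (from WAKE to TST); the daily total is unchanged
delta <- 30
x_new <- x + delta * subs[1, ]
x_new
```

Every contrast row sums to zero, which is why the total time in the composition is preserved by any substitution.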

1 Multilevel model with compositional predictors

1.1 Compositions and isometric log ratio (ILR) coordinates

Compositional data are often expressed as a set of isometric log ratio (ILR) coordinates in regression models. We can use the compilr() function to calculate both between- and within-level ILR coordinates for use in subsequent models as predictors.

Note: compilr() also calculates total ILR coordinates, which can be used as outcomes (or predictors) in models when the decomposition into between- and within-level ILR coordinates is not desired.
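
One common way to write the between/within decomposition is sketched below. The notation is an assumption introduced for illustration and is not taken from the package documentation: \(z_{ij}\) is the vector of total ILR coordinates for person \(i\) at occasion \(j\), \(n_i\) is that person's number of occasions, and the superscripts \((b)\) and \((w)\) mark the between- and within-level parts.

```latex
\[
z_{ij} = z^{(b)}_{i} + z^{(w)}_{ij},
\qquad
z^{(b)}_{i} = \frac{1}{n_i} \sum_{j=1}^{n_i} z_{ij},
\qquad
z^{(w)}_{ij} = z_{ij} - z^{(b)}_{i}
\]
```

Under this formulation, the between-level coordinates describe a person's average composition and the within-level coordinates describe how a given occasion deviates from that average.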

The compilr() function for multilevel data takes four required arguments and one optional argument:

| Argument | Description |
|---|---|
| data | A long data set containing all variables needed to fit the multilevel models, including the repeated measure compositional predictors and outcomes, along with any additional covariates. |
| sbp | A Sequential Binary Partition used to calculate \(ilr\) coordinates. |
| parts | The names of the compositional components in data. |
| idvar | The grouping factor in data used to compute the between-person and within-person compositions and \(ilr\) coordinates. |
| total | Optional argument to specify the amount to which the compositions should be closed. |

cilr <- compilr(data = mcompd, sbp = sbp,
                parts = c("TST", "WAKE", "MVPA", "LPA", "SB"), idvar = "ID", total = 1440)
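
To illustrate what closing a composition to a total means, here is a minimal base R sketch. The helper close_to() and the example values (hours summing to 24) are hypothetical and only show the general idea of rescaling parts to a fixed sum such as 1440 minutes; they are not the package's implementation.

```r
# Hypothetical helper: rescale parts so they sum to `total`
close_to <- function(x, total = 1440) {
  x / sum(x) * total
}

# Made-up composition recorded in hours per day (sums to 24)
x_hours <- c(TST = 7.5, WAKE = 1, MVPA = 1.5, LPA = 6, SB = 8)

# Same composition expressed in minutes per day (sums to 1440)
x_closed <- close_to(x_hours, total = 1440)
x_closed
```

Closure changes only the scale: the ratios between parts, which are what the ILR coordinates are built from, stay the same.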

1.2 Fitting model

We now will use output from the c