# Modifying the environment of a PADRINO IPM

Sometimes, we may want to modify the environmental sequence of a stochastic IPM stored in PADRINO. As of now, there is no single interace for doing this, and so you may need to write some code on your own. This vignette is meant to provide a bit of help with that. It will make use of Rpadrino and ipmr to modify proto_ipms, and doesn’t assume any prior knowledge of the data structure or of PADRINO itself (I don’t think, anyway. Let me know if I’m incorrect via a Github Issue).

## Finding parameter-resampled IPMs

These aren’t explicitly marked in the Metadata table. However, we can find them by searching the EnvironmentalVariables table, specifically the ipm_id column.

library(Rpadrino)

unique(pdb$EnvironmentalVariables$ipm_id)
## [1] "aaaa15" "aaaa16" "aaaa21" "aaaa54" "aaaa59"

For now, we’ll just work with the two IPMs from Westerband et al. (2016):

stoch_ids <- c("aaaa15", "aaaa16")

Ok, now we have the data we want to work with, we can check out various ways of modifying them to explore the questions we want to answer!

# re-set environmental variables w/ pre-defined sequence of values

First, we’ll examine how to re-set IPMs to sample from a pre-determined sequence of environmental values. This is useful for exploring things like, for example, environmental autocorrelation and/or enhancing reproducibility of our subsequent analyses.

This will require a few steps:

1. Create proto_ipm objects from the PADRINO data.

2. Remove the current stochastic environmental state expressions

3. Replace them with something that will generate the values we want.

## Create the proto_ipm’s and find the stochastic parameter names

First, we should investigate what’s in each model by creating and printing the proto_ipms.

# Step 1 - create proto_ipms, and then figure out which parameters in the
# vital rate expressions we need to work with.

protos <- pdb_make_proto_ipm(pdb, stoch_ids)
## 'ipm_id' aaaa15 has resampled parameters, resetting 'det_stoch' to 'stoch' and
## 'kern_param' to 'param'!
## Default number of iterations for 'pdb_make_ipm' will be 50. Modify this behavior
## with the 'addl_args' argument of 'pdb_make_ipm'.
## 'ipm_id' aaaa16 has resampled parameters, resetting 'det_stoch' to 'stoch' and
## 'kern_param' to 'param'!
## Default number of iterations for 'pdb_make_ipm' will be 50. Modify this behavior
## with the 'addl_args' argument of 'pdb_make_ipm'.
protos[[1]]
## A simple, density independent, stochastic, parameter-resampled proto_ipm with 2 kernels defined:
##  P, F
##
## Kernel formulae:
##
## P: s * g
## F: f
##
## Vital rates:
##
## s: 1/(1 + exp(-(s_0 + s_1 * leafarea_1 + s_2 * j + s_3 * leafarea_1 *
##     j)))
## g_mu: g_0 + g_1 * leafarea_1 + g_2 * j + g_3 * A_max + g_4 * leafarea_1 *
##     j + g_5 * leafarea_1 * A_max + g_6 * j * A_max + g_7 * leafarea_1 *
##     j * A_max
## g: dnorm(leafarea_2, g_mu, g_sd)
## f: r_p * r_o * n_f * n_s * s_s * sdl_s * sdl_size
## r_p: 1/(1 + exp(-(r_0 + r_1 * leafarea_1 + r_2 * j + r_3 * leafarea_1 *
##     j)))
## r_o: exp(i_0 + i_1 * leafarea_1 + i_2 * j + i_3 * leafarea_1 * j)
## s_s: ifelse(j < 6, s_sl, s_sh)
## sdl_s: ifelse(j < 6, sdl_sl, sdl_sh)
## sdl_size: ifelse(j < 6, sdl_m_l, sdl_m_h)
## sdl_m_l: dnorm(leafarea_2, sdl_meanl, sdl_sdl)
## sdl_m_h: dnorm(leafarea_2, sdl_meanh, sdl_sdh)
##
## Parameter names:
##
##  [1] "s_0"       "s_1"       "s_2"       "s_3"       "g_0"       "g_1"
##  [7] "g_2"       "g_3"       "g_4"       "g_5"       "g_6"       "g_7"
## [13] "g_sd"      "r_0"       "r_1"       "r_2"       "r_3"       "i_0"
## [19] "i_1"       "i_2"       "i_3"       "n_f"       "n_s"       "s_sl"
## [25] "s_sh"      "sdl_sl"    "sdl_sh"    "sdl_meanl" "sdl_meanh" "sdl_sdl"
## [31] "sdl_sdh"
##
## All parameters in vital rate expressions found in 'data_list':  FALSE
## Did not test parameters in 'define_env_state()'.
##
##
## *P*
## j
## A_max
##
## *F*
## j
##
## This may be because 'define_domains()' has not yet been called.
##  If it has been called, then double check the model definition and 'data_list'.
##
##
## Domains for state variables:
##
## leafarea: lower_bound = 0.57, upper_bound = 11.9, n_meshpoints = 50
##
## Population states defined:
##
## n_leafarea: Pre-defined population state.
##
## Internally generated model iteration procedure:
##
## n_leafarea_t_1: right_mult(kernel = P, vectr = n_leafarea_t) + right_mult(kernel = F,
##     vectr = n_leafarea_t)

It seems that j and A_max are the two that we need to create sequences for. Let’s double check the EnvironmentalVariables table to be sure:

protos$aaaa15_2, env_vars = sample_env(use_env_data, index = t), data_list = list(use_env_data = use_env_data, sample_env = sample_env) ) ## Rebuild the models We can rebuild the models now using pdb_make_ipm(): ipms <- pdb_make_ipm(protos) lambda(ipms) ##$aaaa15
##     lambda
## -0.1033479
##
## $aaaa16 ## lambda ## 0.2710611 ## ##$aaaa15_2
##     lambda
## -0.1031656
Environmental covariate sequence from unaltered PADRINO data (first 10 iterations)
j A_max
3 6.885493
1 5.606567
5 6.185809
2 5.683118
2 6.073683
4 5.216322
3 6.408164
4 6.350631
1 6.722912
1 6.100611
Environmental covariate sequence from custom function (first 10 iterations)
j A_max
2 5.872700
2 7.091870
3 5.454284
1 6.470718
1 5.086940
3 7.249801
1 5.748229
2 6.741378
2 6.420061
2 6.368241

And finally, a quick check to verify that our pre-defined sequence is indeed the one that got used for aaaa15_2:

all.equal(ipms$aaaa15_2$env_seq$j, use_env_data$j) 
## [1] TRUE
all.equal(ipms$aaaa15_2$env_seq$A_max, use_env_data$A_max)
## [1] TRUE

# Modifying parameter values

This is a bit simpler, because we just need to make some alterations to a table, rather than creating new expressions. Unfortunately, we can’t use the parameters()<- interface for environmental variation because that setter function does not modify the right place in the proto_ipm object.

## Identify parameter names

The env_variable column of the EnvironmentalVariables usually provides a very brief summary of what these parameters are, but not always. In general, PADRINO notation in the vr_expr_name column mirrors the publication notation. So to understand the meaning of A_max, we need to consult the original publication (Westerband, A. C., Horvitz, C. C. (2017). Photosynthetic rates influence the population dynamics of understory herbs in stochastic light environments. Ecology, 98 (2): 370-381). Upon doing this, we see that A_max corresponds to photosynthetic capacity. For this instance, we’ll pretend that we want to set the lower limit of A_max to a smaller value, and the upper limit to a higher value.

EnvironmentalVariables table, subsetted using ‘stoch_ids’
ipm_id env_variable vr_expr_name env_range env_function model_type
aaaa15 LightLevel j NULL sample(j_seq) Substituted
aaaa15 LightLevel j_seq 1:5 NULL Parameter
aaaa15 A_max A_max NULL Unif(a_max_low, a_max_high) Substituted
aaaa15 A_max a_max_low 5 NULL Parameter
aaaa15 A_max a_max_high 7 NULL Parameter
aaaa16 LightLevel j NULL sample(j_seq) Substituted
aaaa16 LightLevel j_seq 1:5 NULL Parameter
aaaa16 A_max A_max NULL Unif(a_max_low, a_max_high) Substituted
aaaa16 A_max a_max_low 5 NULL Parameter
aaaa16 A_max a_max_high 8 NULL Parameter

We can see that A_max is given by a Uniform distribution with parameters A_max_low and A_max_high. These two, as their names imply, are the lower and upper limits for A_max, respectively. We can modify these just as we do any other data.frame. We’ll create a copy of the pdb object to modify, but you can modify it in place and re-download an unaltered copy, or save the unaltered copy before altering it to preserve the original version. Again, we’ll only modify aaaa15.

pdb_2 <- pdb

inds <- which(pdb_2$EnvironmentalVariables$vr_expr_name %in% paste0("a_max_",
c("low",
"high")) &
pdb_2$EnvironmentalVariables$ipm_id == "aaaa15")

pdb_2$EnvironmentalVariables$env_range[inds[1]] <- 3 # Bump min down two steps
pdb_2$EnvironmentalVariables$env_range[inds[2]] <- 9 # Bump max up two steps

# The rest is the usual building workflow: make a proto_ipm, then convert to
# ipm.

new_protos <- pdb_make_proto_ipm(pdb_2, "aaaa15")
## 'ipm_id' aaaa15 has resampled parameters, resetting 'det_stoch' to 'stoch' and
## 'kern_param' to 'param'!
## Default number of iterations for 'pdb_make_ipm' will be 50. Modify this behavior
## with the 'addl_args' argument of 'pdb_make_ipm'.
new_ipms   <- pdb_make_ipm(new_protos)

lambda(new_ipms)
## log(lambda) is returned by default for stochastic models. Set 'log = FALSE' for lambda on linear scale.
## $aaaa15 ## lambda ## -0.1032111 # Changing distributions What if we want to change the distribution that a certain environmental variable is sampled from? For example, rather than use a uniform distribution, we decide we want to sample it from a Gamma distribution. There are a couple steps we need to go through. They are: 1. Write a function that substitutes distribution names. We can find PADRINO’s distribution naming convention in the “Abbreviation” column here. 2. Write a function to insert the new parameter values and names into the EnvironmentalVariables Table. 3. Wrap 1-2 in a top-level function that we’ll actually use. 4. Use it to modify EnvironmentalVariables, then rebuild the IPMs using pdb_make_proto_ipm() and pdb_make_ipm(). NB: The functions we define in 1-3 will probably be incorporated into Rpadrino at some point in the near future, but further testing for stability is needed before they become part of the package. As such, use them at your own risk! The function in 1 will make use of rlang to safely modify function names and arguments. It is possible to do this with gsub("Unif", "Gamma", pdb$EnvironmentalVariables$env_function[index]) (where index is the row number of the function you want to update), but could lead to unexpected results sometimes. ## Top-level function This is probably the most straightforward of the functions to write. We know that there two steps we have to implement that this wraps: substitution the functioal form of the distribution, and substituting the parameter values. Additionally, we’re going to insert a subsetting capacity so that we can pass a larger database loop over a set of ipm_ids to rapidly update many models at once. library(rlang) sub_env_fun <- function(db, # pdb object parameter, # parameter created by the distribution new_fun, # The new distribution new_pars, # The parameters for the new distribution id = NULL) { # ipm_id to operate on if(!is.null(id)) { if(length(id) > 1L) stop("'id' should only have length 1!") db <- pdb_subset(db, id) } ev_tab <- db$EnvironmentalVariables %>%
sub_rng_fun(parameter, new_fun, new_pars) %>%
sub_rng_pars(parameter, new_pars)

db$EnvironmentalVariables <- ev_tab return(db) } ## Substituting functions The first substituting function we’ll write is the one that substitutes the distribution we wish to use into the env_function column. This will use the names of the new_pars object as arguments in a call to the new_fun function. Don’t worry about finding the correct R r<distribution> function here - we use PADRINO’s syntax to modify these. sub_rng_fun <- function(ev_tab, parameter, new_fun, new_pars) { # Get the old function form old_fun_ind <- which(ev_tab$vr_expr_name == parameter)

# Create a new function call, and insert new arguments
arg_nms     <- syms(names(new_pars))
new_fun     <- call2(new_fun, !!! syms(names(new_pars)))

# Convert back to a string and insert into the table
ev_tab$env_function[old_fun_ind] <- expr_text(new_fun) return(ev_tab) } There is a lot going on in this function, and most of it is beyond the scope of this vignette (if you’re curious about the details, see the Metaprogramming topics here). The key takeaway is that we take our new function, "Gamma" and convert it into a function call with arguments that are named after new_pars. Then we stick it back into the table, overwriting the old "Unif" function. Adding a parameter value to the table is far less complicated. First, we remove the parameters associated with the old distribution. We have to do this because of the way that Rpadrino finds the parameters to functions in the EnvironmentalVariables table. The first lines of the function accomplish this. After that, we create a version of the EnvironmentalVariables table that contains our new parameters and add those back in. sub_rng_pars <- function(ev_tab, parameter, new_pars) { # Remove the existing parameters associated with the distribution # that we've replaced. if(sum(ev_tab$env_variable == parameter) > 1) {
rm_ind <- ev_tab$env_variable == parameter & ev_tab$model_type == "Parameter"
ev_tab <- ev_tab[!rm_ind, ]
}

new_rows <- data.frame(
ipm_id       = rep(unique(ev_tab$ipm_id), length(new_pars)), env_variable = rep(parameter, length(new_pars)), vr_expr_name = names(new_pars), env_range = unlist(new_pars), env_function = rep("NULL", length(new_pars)), model_type = "Parameter" ) rownames(new_rows) <- NULL out <- rbind(ev_tab, new_rows) return(out) } small_pdb <- pdb %>% pdb_subset("aaaa15") new_pdb <- sub_env_fun(db = small_pdb, id = NULL, "A_max", "Gamma", list(a_max_shape = 14, a_max_rate = 3.5)) New ‘EnvironmentalVariables’ table ipm_id env_variable vr_expr_name env_range env_function model_type aaaa15 LightLevel j NULL sample(j_seq) Substituted aaaa15 LightLevel j_seq 1:5 NULL Parameter aaaa15 A_max A_max NULL Gamma(a_max_shape, a_max_rate) Substituted aaaa15 A_max a_max_shape 14 NULL Parameter aaaa15 A_max a_max_rate 3.5 NULL Parameter Notice that our new Gamma distribution parameters have found their way into the parameter table. We can now rebuild the model as using the same pipeline as above. new_ipm <- new_pdb %>% pdb_make_proto_ipm() %>% pdb_make_ipm() lambda(new_ipm) ##$aaaa15
##    lambda
## -0.102334

It is probably ok to copy/paste the functions we defined above and apply to in your own analyses. However, as they are not yet incorporated into Rpadrino, I cannot promise that the results will be error free.