This vignette shows how to use the package to sequentially enrich the design for adaptive improvements of a DGP emulator.

```
library(tidyverse)
library(lhs)
library(ggplot2)
library(dgpsi)
```

We consider a non-stationary synthetic simulator which has a 2-dimensional input with the functional form (Ba and Joseph 2018) defined by:

```
f <- function(x) {
  sin(1/((0.7*x[,1,drop=F]+0.3)*(0.7*x[,2,drop=F]+0.3)))
}
```
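As a quick sanity check, the sketch below evaluates the simulator on a small, hand-picked input matrix and confirms that the result is a one-column matrix. The names `x_small` and `y_small` are illustrative only, and the simulator is repeated so that the block runs on its own with base R.

```r
# Simulator, repeated here so the block is self-contained
f <- function(x) {
  sin(1 / ((0.7 * x[, 1, drop = FALSE] + 0.3) * (0.7 * x[, 2, drop = FALSE] + 0.3)))
}

# Three illustrative input points, one per row (hypothetical values)
x_small <- rbind(c(0, 0), c(0.5, 0.5), c(1, 1))
y_small <- f(x_small)

is.matrix(y_small)  # the output keeps its matrix shape because of drop = FALSE
dim(y_small)        # one row per input point, one output column
```

Keeping `drop = FALSE` in the indexing is what preserves the matrix shape of the output.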

Note that to provide the simulator for the sequential design below,
we have defined the above function such that its input `x` and output
are both matrices. The commands below generate the contour of
the function:

```
x1 <- seq(0, 1, length.out = 100)
x2 <- seq(0, 1, length.out = 100)
dat <- expand_grid(x1 = x1, x2 = x2)
dat <- mutate(dat, f = f(cbind(x1, x2)))
ggplot(dat, aes(x1, x2, fill = f)) + geom_tile() +
  scale_fill_continuous(type = "viridis")
```

We can see from the figure above that the synthetic simulator fluctuates more in the bottom-left of its input space, while in the top-right it shows little variation.
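The visual impression can also be checked numerically. The following base-R sketch compares the spread of the simulator output over the bottom-left and top-right quadrants of the unit square. The quadrant split at 0.5, the 50-point grid, and the names `bl`, `tr`, `sd_bl`, and `sd_tr` are choices made for this illustration only.

```r
# Simulator, repeated here so the block is self-contained
f <- function(x) {
  sin(1 / ((0.7 * x[, 1, drop = FALSE] + 0.3) * (0.7 * x[, 2, drop = FALSE] + 0.3)))
}

# Regular grids over the two quadrants (grid size is an arbitrary choice)
g  <- seq(0, 0.5, length.out = 50)
bl <- as.matrix(expand.grid(x1 = g, x2 = g))              # bottom-left quadrant
tr <- as.matrix(expand.grid(x1 = g + 0.5, x2 = g + 0.5))  # top-right quadrant

# Standard deviation of the simulator output in each quadrant
sd_bl <- sd(as.vector(f(bl)))
sd_tr <- sd(as.vector(f(tr)))
c(bottom_left = sd_bl, top_right = sd_tr)
```

A larger standard deviation in the bottom-left quadrant is consistent with the contour plot.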

We now specify a seed with `set_seed()` from the package for reproducibility

`set_seed(99)`

and generate an initial design with 5 design points using the maximin Latin hypercube sampler:

```
X <- maximinLHS(5,2)
Y <- f(X)
```
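For readers unfamiliar with Latin hypercube designs, the sketch below builds a plain random Latin hypercube in base R and checks its defining property: each input dimension has exactly one point in each of its equal-width strata. This is not the maximin algorithm implemented by `maximinLHS()`, which additionally aims to spread the points apart. The helper `random_lhs_sketch()` is hypothetical, and base `set.seed()` is used here only to make the illustration repeatable.

```r
# Hypothetical helper: a plain random Latin hypercube in [0, 1]^d.
# NOT the maximin variant provided by lhs::maximinLHS().
random_lhs_sketch <- function(n, d) {
  # For each dimension, permute the n strata and jitter within each stratum
  sapply(seq_len(d), function(j) (sample(n) - runif(n)) / n)
}

set.seed(1)  # base-R seed, for this illustration only
X_sketch <- random_lhs_sketch(5, 2)

# Stratum index (1 to 5) of every point, sorted within each column;
# a Latin hypercube has each index exactly once per column
strata <- apply(X_sketch, 2, function(col) sort(ceiling(col * 5)))
strata
```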

To track the qualities of constructed emulators during the sequential design, we generate a validation dataset:

```
validate_x <- maximinLHS(200,2)
validate_y <- f(validate_x)
```
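The validation set is what the RMSE values reported during the sequential design are measured against. As a reminder of what that metric summarizes, here is a minimal sketch of the standard root-mean-squared-error formula. The helper `rmse_sketch()` is hypothetical and is not a function of the package; the package may compute its RMSE with additional details not shown here.

```r
# Hypothetical helper: standard RMSE between observed and predicted outputs
rmse_sketch <- function(y_true, y_pred) {
  sqrt(mean((as.vector(y_true) - as.vector(y_pred))^2))
}

# Toy example with made-up numbers: errors of 3 and 4
rmse_sketch(c(0, 0), c(3, 4))  # sqrt((9 + 16) / 2)
```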

To start with the sequential design, we initialize a two-layered DGP emulator using the generated initial design:

`m <- dgp(X, Y)`

```
## Auto-generating a 2-layered DGP structure ... done
## Initializing the DGP emulator ... done
## Training the DGP emulator:
## Iteration 500: Layer 2: 100%|██████████| 500/500 [00:01<00:00, 344.20it/s]
## Imputing ... done
```

We then specify the boundaries of input parameters of `f` for the sequential design to locate design points to be added:

```
lim_1 <- c(0, 1)
lim_2 <- c(0, 1)
lim <- rbind(lim_1, lim_2)
```

The boundaries of input parameters are defined as a matrix with each
row giving the lower and upper limits of an input parameter. After the
boundaries are specified, we are ready to conduct the sequential design
to adaptively improve the emulator `m` via `design()`. The function
`design()` provides a simple and flexible implementation of sequential
designs for DGP emulators. In this vignette, we only demonstrate its
basic usage and refer users to `?design` for more advanced
specifications, e.g., on checkpoints to manually control the design
progress and on schedules to re-fit and validate emulators.
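To make the layout of the limits matrix concrete, the base-R sketch below rebuilds it and prints a labelled copy. The column labels and the name `lim_labelled` are added purely for display in this illustration; they are not something the source states `design()` requires.

```r
# Limits of the two input parameters, as defined above
lim_1 <- c(0, 1)
lim_2 <- c(0, 1)
lim <- rbind(lim_1, lim_2)

# Labelled copy for inspection only (labels are an illustrative assumption)
lim_labelled <- lim
colnames(lim_labelled) <- c("lower", "upper")
lim_labelled

dim(lim)  # one row per input parameter, two columns (lower and upper limit)
```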

For illustration, we implement three waves of sequential designs on `m`:

```
# 1st wave with 15 steps
m <- design(m, N = 15, limits = lim, f = f, x_test = validate_x, y_test = validate_y)
```

```
## Initializing ... done
## * RMSE: 0.529337
## Iteration 1:
## - Locating ... done
## * Next design point: 0.093332 0.075594
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.521201
##
## ...
##
## Iteration 15:
## - Locating ... done
## * Next design point: 0.996120 0.904685
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.170959
```

```
# 2nd wave with 10 steps
m <- design(m, N = 10, limits = lim, f = f, x_test = validate_x, y_test = validate_y)
```

```
## Initializing ... done
## * RMSE: 0.170959
## Iteration 1:
## - Locating ... done
## * Next design point: 0.862600 0.722212
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.150731
##
## ...
##
## Iteration 10:
## - Locating ... done
## * Next design point: 0.519293 0.463623
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.112657
```

```
# 3rd wave with 10 steps
m <- design(m, N = 10, limits = lim, f = f, x_test = validate_x, y_test = validate_y)
```

```
## Initializing ... done
## * RMSE: 0.112657
## Iteration 1:
## - Locating ... done
## * Next design point: 0.045685 0.039926
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.033327
##
## ...
##
## Iteration 10:
## - Locating ... done
## * Next design point: 0.776195 0.364631
## - Updating and re-fitting ... done
## - Validating ... done
## * RMSE: 0.004254
```
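The progress across the three waves can be summarized in a small table. The sketch below collects the RMSE values printed in the logs above, together with the design size after each wave, under the assumption (suggested by the single "Next design point" reported per iteration) that each step adds exactly one design point to the initial design of 5. The data frame `waves` is an illustrative name.

```r
# RMSEs copied from the logs above; wave 0 is the initial 5-point design
waves <- data.frame(
  wave  = 0:3,
  steps = c(0, 15, 10, 10),
  rmse  = c(0.529337, 0.170959, 0.112657, 0.004254)
)

# Assumption: one new design point per step, starting from 5 initial points
waves$design_size <- 5 + cumsum(waves$steps)
waves
```

Under that assumption the design holds 20, 30, and 40 points after the first, second, and third waves respectively.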

After the sequential design is done, we can inspect the enriched
design by applying `draw()` to `m`:

`draw(m, 'design')`

It can be seen from the figure
above that most of the added design points concentrate at the
bottom-left corner of the input space, where the simulator `f`
exhibits more variation and thus needs more data to be well-emulated.
We can also visualize how the quality of the emulators (in terms of
RMSE with respect to the validation dataset) changes during the three
waves of sequential designs:

`draw(m, 'rmse')`

For comparison, we build four DGP emulators with static space-filling Latin hypercube designs (LHD) of size 10, 20, 30, and 40 respectively:

```
# DGP emulator with a LHD of size 10
X1 <- maximinLHS(10,2)
Y1 <- f(X1)
m1 <- dgp(X1, Y1, verb = F)
```

```
# DGP emulator with a LHD of size 20
X2 <- maximinLHS(20,2)
Y2 <- f(X2)
m2 <- dgp(X2, Y2, verb = F)
```

```
# DGP emulator with a LHD of size 30
X3 <- maximinLHS(30,2)
Y3 <- f(X3)
m3 <- dgp(X3, Y3, verb = F)
```

```
# DGP emulator with a LHD of size 40
X4 <- maximinLHS(40,2)
Y4 <- f(X4)
m4 <- dgp(X4, Y4, verb = F)
```

We then extract their RMSEs with respect to the validation dataset

```
# validation of the DGP emulator with the LHD of size 10
m1 <- validate(m1, x_test = validate_x, y_test = validate_y, verb = F)
rmse1 <- m1$oos$rmse
# validation of the DGP emulator with the LHD of size 20
m2 <- validate(m2, x_test = validate_x, y_test = validate_y, verb = F)
rmse2 <- m2$oos$rmse
# validation of the DGP emulator with the LHD of size 30
m3 <- validate(m3, x_test = validate_x, y_test = validate_y, verb = F)
rmse3 <- m3$oos$rmse
# validation of the DGP emulator with the LHD of size 40
m4 <- validate(m4, x_test = validate_x, y_test = validate_y, verb = F)
rmse4 <- m4$oos$rmse
# create a dataframe that stores the RMSEs of the four DGP emulators
rmse_static <- data.frame('N' = c(10, 20, 30, 40), 'rmse' = c(rmse1, rmse2, rmse3, rmse4), 'LHD' = c('lhd-10', 'lhd-20', 'lhd-30', 'lhd-40'))
```

and add them to the sequential design validation plot (in log-scale) for comparisons:

```
draw(m, 'rmse', log = T) +
  geom_point(data = rmse_static, mapping = aes(x = N, y = rmse, group = LHD, shape = LHD), color = '#E69F00', size = 1.5) +
  scale_shape_manual(values = c(2, 3, 4, 8))
```

It can be seen from the plot above that with static space-filling designs, the quality of an emulator may not improve as the design size increases. This is because a larger space-filling design does not necessarily place more points in regions where the simulator exhibits more variation, and can therefore yield DGP emulators with higher RMSEs than those constructed through the sequential design.

See `Sequential Design II` for the sequential design of a bundle of DGP
emulators with automatic terminations.

Ba, Shan, and V. Roshan Joseph. 2018. *CGP: Composite Gaussian
Process Models*. https://CRAN.R-project.org/package=CGP.