Function for prediction at new locations for multi-season multi-species spatial occupancy models
predict.stMsPGOcc.Rd
The function predict collects posterior predictive samples for a set of new locations given an object of class `stMsPGOcc`. Prediction is possible for both the latent occupancy state and detection. Predictions are currently only possible for sampled primary time periods.
Usage
# S3 method for stMsPGOcc
predict(object, X.0, coords.0, t.cols, n.omp.threads = 1,
        verbose = TRUE, n.report = 100,
        ignore.RE = FALSE, type = 'occupancy', grid.index.0, ...)
Arguments
- object
an object of class stMsPGOcc
- X.0
the design matrix of covariates at the prediction locations. This should be a three-dimensional array, with dimensions corresponding to site, primary time period, and covariate, respectively. Note that the first covariate should consist of all 1s for the intercept if an intercept is included in the model. If random effects are included in the occupancy (or detection if type = 'detection') portion of the model, the levels of the random effects at the new locations/time periods should be included as an element of the three-dimensional array. The ordering of the levels should match the ordering used to fit the data in stMsPGOcc. The covariates should be organized in the same order as they were specified in the corresponding formula argument of stMsPGOcc. Names of the third dimension (covariates) of any random effects in X.0 must match the name of the random effects used to fit the model, if specified in the corresponding formula argument of stMsPGOcc. See example below.
- coords.0
the spatial coordinates corresponding to X.0. Note that spOccupancy assumes coordinates are specified in a projected coordinate system.
- t.cols
an indexing vector used to denote which primary time periods are contained in the design matrix of covariates at the prediction locations (X.0). The values should denote the specific primary time periods used to fit the model. The values should indicate the columns in data$y used to fit the model for which prediction is desired. See example below.
- n.omp.threads
a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting n.omp.threads up to the number of hyperthreaded cores. Note, n.omp.threads > 1 might not work on some systems.
- verbose
if TRUE, model specification and progress of the sampler are printed to the screen. Otherwise, nothing is printed to the screen.
- ignore.RE
logical value that specifies whether or not to remove unstructured random occurrence (or detection if type = 'detection') effects from the subsequent predictions. If FALSE, unstructured random effects will be included. If TRUE, unstructured random effects will be set to 0 and predictions will only be generated from the fixed effects, the spatial random effects, and AR(1) random effects if the model was fit with ar1 = TRUE.
- n.report
the interval to report sampling progress.
- type
a quoted keyword indicating what type of prediction to produce. Valid keywords are 'occupancy' to predict latent occupancy probability and latent occupancy values (this is the default), or 'detection' to predict detection probability given new values of detection covariates.
- grid.index.0
an indexing vector used to specify how each row in X.0 corresponds to the coordinates specified in coords.0. Only relevant if the spatial random effect was estimated at a higher spatial resolution (e.g., grid cells) than point locations.
- ...
currently no additional arguments
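As a rough illustration of the shapes expected for X.0, coords.0, t.cols, and grid.index.0, the following base R sketch builds them for made-up values. The sizes, covariate names, and coordinates are hypothetical, and the grid.index.0 layout (one entry per row of X.0 giving the matching row of coords.0) is an assumption based on the argument description above. The code does not call spOccupancy.

```r
set.seed(1)
# Hypothetical sizes: 4 prediction sites, model fit with 10 primary time periods
J.0 <- 4
n.time.fit <- 10
# Covariate values at the prediction sites for every fitted time period
cov.1 <- matrix(rnorm(J.0 * n.time.fit), J.0, n.time.fit)
cov.2 <- matrix(rnorm(J.0 * n.time.fit), J.0, n.time.fit)
# Primary time periods (columns used when fitting) to predict for
t.cols <- c(1, 2, 5)
# Site x primary time period x covariate array, intercept first
X.0 <- array(1, dim = c(J.0, length(t.cols), 3))
X.0[, , 2] <- cov.1[, t.cols]
X.0[, , 3] <- cov.2[, t.cols]
dimnames(X.0) <- list(NULL, NULL, c('(Intercept)', 'occ.cov.1', 'occ.cov.2'))
# Case 1: one coordinate pair per prediction site
coords.0 <- cbind(x = c(0.1, 0.3, 0.7, 0.9), y = c(0.2, 0.4, 0.6, 0.8))
# Case 2 (assumed layout): sites 1-2 share grid cell 1, sites 3-4 share grid cell 2
coords.grid.0 <- cbind(x = c(0.2, 0.8), y = c(0.3, 0.7))
grid.index.0 <- c(1, 1, 2, 2)
str(X.0)
```

The third dimension of X.0 follows the order of the terms in the occupancy formula, and the second dimension has one column per entry of t.cols.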
Note
When ignore.RE = FALSE, both sampled levels and non-sampled levels of unstructured random effects are supported for prediction. For sampled levels, the posterior distribution for the random intercept corresponding to that level of the random effect will be used in the prediction. For non-sampled levels, random values are drawn from a normal distribution using the posterior samples of the random effect variance, which results in fully propagated uncertainty in predictions with models that incorporate random effects.
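The handling of non-sampled levels can be sketched in a few lines of base R. The posterior samples below are simulated stand-ins (sigma.sq.samples and eta.fixed are hypothetical names, not elements of a fitted stMsPGOcc object), and the code mirrors only the idea described in this note, not the package's internal implementation.

```r
set.seed(42)
n.post <- 1000
# Stand-in posterior samples of a random effect variance (hypothetical values)
sigma.sq.samples <- 1 / rgamma(n.post, shape = 5, rate = 4)
# One normal draw per posterior sample for a level not seen when fitting
re.new <- rnorm(n.post, mean = 0, sd = sqrt(sigma.sq.samples))
# Stand-in fixed-effect linear predictor on the logit scale
eta.fixed <- rnorm(n.post, mean = -1, sd = 0.2)
# Occupancy probability without and with the random effect draw
psi.fixed.only <- plogis(eta.fixed)
psi.with.re <- plogis(eta.fixed + re.new)
# Compare the spread of the two sets of predictions
c(fixed.only = sd(psi.fixed.only), with.re = sd(psi.with.re))
```

Because a new value is drawn for every posterior sample, uncertainty in the variance parameter carries through to the predicted probabilities.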
Occurrence predictions at sites that are only sampled for a subset of the total number of primary time periods are obtained directly when fitting the model. See the psi.samples and z.samples portions of the output list from the model object of class stMsPGOcc.
Author
Jeffrey W. Doser doserjef@msu.edu,
Andrew O. Finley finleya@msu.edu
Value
A list object of class predict.stMsPGOcc. When type = 'occupancy', the list consists of:
- psi.0.samples
a four-dimensional object of posterior predictive samples for the latent occupancy probability values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
- z.0.samples
a four-dimensional object of posterior predictive samples for the latent occupancy values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
- w.0.samples
a three-dimensional array of posterior predictive samples for the latent spatial factors with dimensions corresponding to MCMC sample, latent factor, and site.
When type = 'detection', the list consists of:
- p.0.samples
a four-dimensional object of posterior predictive samples for the detection probability values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
The return object will include additional objects used for standard extractor functions.
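Since the returned arrays are indexed by posterior predictive sample first, point estimates and intervals can be obtained by summarizing over that dimension. The sketch below uses a mock array filled with random numbers in place of real psi.0.samples output, with dimensions (25 samples, 7 species, 16 sites, 3 time periods) chosen only for illustration.

```r
set.seed(7)
n.post <- 25
N <- 7
J.0 <- 16
n.t <- 3
# Mock stand-in for psi.0.samples: sample x species x site x primary time period
psi.0.samples <- array(runif(n.post * N * J.0 * n.t),
                       dim = c(n.post, N, J.0, n.t))
# Posterior mean for each species, site, and primary time period
psi.0.mean <- apply(psi.0.samples, c(2, 3, 4), mean)
# 95% credible interval; the first dimension of the result holds the two quantiles
psi.0.ci <- apply(psi.0.samples, c(2, 3, 4), quantile,
                  probs = c(0.025, 0.975))
dim(psi.0.mean)
dim(psi.0.ci)
```

The same pattern applies to z.0.samples, where the mean over samples gives the posterior probability of occupancy for each species, site, and time period.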
Examples
# Simulate Data -----------------------------------------------------------
set.seed(500)
J.x <- 8
J.y <- 8
J <- J.x * J.y
# Years sampled
n.time <- sample(3:10, J, replace = TRUE)
# n.time <- rep(10, J)
n.time.max <- max(n.time)
# Replicates
n.rep <- matrix(NA, J, max(n.time))
for (j in 1:J) {
  n.rep[j, 1:n.time[j]] <- sample(2:4, n.time[j], replace = TRUE)
  # n.rep[j, 1:n.time[j]] <- rep(4, n.time[j])
}
N <- 7
# Community-level covariate effects
# Occurrence
beta.mean <- c(-3, -0.2, 0.5)
trend <- FALSE
sp.only <- 0
p.occ <- length(beta.mean)
tau.sq.beta <- c(0.6, 1.5, 1.4)
# Detection
alpha.mean <- c(0, 1.2, -1.5)
tau.sq.alpha <- c(1, 0.5, 2.3)
p.det <- length(alpha.mean)
# Random effects
psi.RE <- list()
p.RE <- list()
# Draw species-level effects from community means.
beta <- matrix(NA, nrow = N, ncol = p.occ)
alpha <- matrix(NA, nrow = N, ncol = p.det)
for (i in 1:p.occ) {
  beta[, i] <- rnorm(N, beta.mean[i], sqrt(tau.sq.beta[i]))
}
for (i in 1:p.det) {
  alpha[, i] <- rnorm(N, alpha.mean[i], sqrt(tau.sq.alpha[i]))
}
sp <- TRUE
svc.cols <- c(1)
p.svc <- length(svc.cols)
n.factors <- 3
phi <- runif(p.svc * n.factors, 3 / .9, 3 / .3)
factor.model <- TRUE
cov.model <- 'exponential'
ar1 <- TRUE
sigma.sq.t <- runif(N, 0.05, 1)
rho <- runif(N, 0.1, 1)
dat <- simTMsOcc(J.x = J.x, J.y = J.y, n.time = n.time, n.rep = n.rep, N = N,
                 beta = beta, alpha = alpha, sp.only = sp.only, trend = trend,
                 psi.RE = psi.RE, p.RE = p.RE, factor.model = factor.model,
                 svc.cols = svc.cols, n.factors = n.factors, phi = phi, sp = sp,
                 cov.model = cov.model, ar1 = ar1, sigma.sq.t = sigma.sq.t, rho = rho)
# Subset data for prediction
pred.indx <- sample(1:J, round(J * .25), replace = FALSE)
y <- dat$y[, -pred.indx, , , drop = FALSE]
# Occupancy covariates
X <- dat$X[-pred.indx, , , drop = FALSE]
# Prediction covariates
X.0 <- dat$X[pred.indx, , , drop = FALSE]
# Detection covariates
X.p <- dat$X.p[-pred.indx, , , , drop = FALSE]
# Coordinates
coords <- dat$coords[-pred.indx, ]
coords.0 <- dat$coords[pred.indx, ]
occ.covs <- list(occ.cov.1 = X[, , 2],
                 occ.cov.2 = X[, , 3])
det.covs <- list(det.cov.1 = X.p[, , , 2],
                 det.cov.2 = X.p[, , , 3])
data.list <- list(y = y, occ.covs = occ.covs,
                  det.covs = det.covs,
                  coords = coords)
# Priors
prior.list <- list(beta.comm.normal = list(mean = 0, var = 2.72),
                   alpha.comm.normal = list(mean = 0, var = 2.72),
                   tau.sq.beta.ig = list(a = 0.1, b = 0.1),
                   tau.sq.alpha.ig = list(a = 0.1, b = 0.1),
                   rho.unif = list(a = -1, b = 1),
                   sigma.sq.t.ig = list(a = 0.1, b = 0.1),
                   phi.unif = list(a = 3 / .9, b = 3 / .1))
z.init <- apply(y, c(1, 2, 3), function(a) as.numeric(sum(a, na.rm = TRUE) > 0))
inits.list <- list(alpha.comm = 0, beta.comm = 0, beta = 0,
                   alpha = 0, tau.sq.beta = 1, tau.sq.alpha = 1,
                   rho = 0.5, sigma.sq.t = 0.5,
                   phi = 3 / .5, z = z.init)
# Tuning
tuning.list <- list(phi = 1, rho = 0.5)
# Number of batches
n.batch <- 2
# Batch length
batch.length <- 25
n.burn <- 25
n.thin <- 1
n.samples <- n.batch * batch.length
# Note that this is just a test case and more iterations/chains may need to
# be run to ensure convergence.
out <- stMsPGOcc(occ.formula = ~ occ.cov.1 + occ.cov.2,
                 det.formula = ~ det.cov.1 + det.cov.2,
                 data = data.list,
                 inits = inits.list,
                 n.batch = n.batch,
                 batch.length = batch.length,
                 accept.rate = 0.43,
                 ar1 = TRUE,
                 NNGP = TRUE,
                 n.neighbors = 5,
                 n.factors = n.factors,
                 cov.model = 'exponential',
                 priors = prior.list,
                 tuning = tuning.list,
                 n.omp.threads = 1,
                 verbose = TRUE,
                 n.report = 1,
                 n.burn = n.burn,
                 n.thin = n.thin,
                 n.chains = 1)
#> ----------------------------------------
#> Preparing to run the model
#> ----------------------------------------
#> lambda is not specified in initial values.
#> Setting initial values of the lower triangle to 0
#> ----------------------------------------
#> Building the neighbor list
#> ----------------------------------------
#> ----------------------------------------
#> Building the neighbors of neighbors list
#> ----------------------------------------
#> ----------------------------------------
#> Model description
#> ----------------------------------------
#> Spatial Factor NNGP Multi-season Multi-species Occupancy Model with Polya-Gamma latent
#> variables with 48 sites, 7 species, and 10 primary time periods.
#>
#> Samples per chain: 50 (2 batches of length 25)
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 25
#>
#> Using the exponential spatial correlation model.
#>
#> Using 3 latent spatial factors.
#> Using 5 nearest neighbors.
#>
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#>
#> Adaptive Metropolis with target acceptance rate: 43.0
#> ----------------------------------------
#> Chain 1
#> ----------------------------------------
#> Sampling ...
#> Batch: 1 of 2, 50.00%
#> Latent Factor Parameter Acceptance Tuning
#> 1 phi 64.0 1.02020
#> 2 phi 56.0 1.02020
#> 3 phi 88.0 1.02020
#> Species Parameter Acceptance Tuning
#> 1 rho 96.0 0.51010
#> 2 rho 84.0 0.51010
#> 3 rho 72.0 0.51010
#> 4 rho 84.0 0.51010
#> 5 rho 68.0 0.51010
#> 6 rho 80.0 0.51010
#> 7 rho 68.0 0.51010
#> -------------------------------------------------
#> Batch: 2 of 2, 100.00%
summary(out)
#>
#> Call:
#> stMsPGOcc(occ.formula = ~occ.cov.1 + occ.cov.2, det.formula = ~det.cov.1 +
#> det.cov.2, data = data.list, inits = inits.list, priors = prior.list,
#> tuning = tuning.list, cov.model = "exponential", NNGP = TRUE,
#> n.neighbors = 5, n.factors = n.factors, n.batch = n.batch,
#> batch.length = batch.length, accept.rate = 0.43, n.omp.threads = 1,
#> verbose = TRUE, ar1 = TRUE, n.report = 1, n.burn = n.burn,
#> n.thin = n.thin, n.chains = 1)
#>
#> Samples per Chain: 50
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 25
#> Run Time (min): 0.0028
#>
#> ----------------------------------------
#> Community Level
#> ----------------------------------------
#> Occurrence Means (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -3.1954 0.2919 -3.7299 -3.2130 -2.7291 NA 25
#> occ.cov.1 0.5636 0.1224 0.3740 0.5673 0.7532 NA 13
#> occ.cov.2 0.5272 0.5239 -0.3549 0.4481 1.4771 NA 25
#>
#> Occurrence Variances (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.8643 0.5468 0.3426 0.7342 2.2173 NA 7
#> occ.cov.1 0.1655 0.1898 0.0399 0.1161 0.6132 NA 25
#> occ.cov.2 2.2613 1.7700 0.6033 1.6873 6.8030 NA 25
#>
#> Detection Means (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.3483 0.3827 -0.4923 0.4204 0.9452 NA 25
#> det.cov.1 1.0121 0.2185 0.5755 0.9769 1.4039 NA 8
#> det.cov.2 -1.0229 0.7532 -2.2908 -1.0010 0.5570 NA 25
#>
#> Detection Variances (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.7999 0.5360 0.2861 0.4979 2.0088 NA 25
#> det.cov.1 0.2021 0.1753 0.0454 0.1439 0.6170 NA 25
#> det.cov.2 4.2037 2.9878 0.9484 3.8357 10.6184 NA 25
#>
#> ----------------------------------------
#> Species Level
#> ----------------------------------------
#> Occurrence (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept)-sp1 -3.2678 0.2238 -3.6713 -3.2746 -2.8777 NA 5
#> (Intercept)-sp2 -3.3036 0.2637 -3.8166 -3.3394 -2.9198 NA 5
#> (Intercept)-sp3 -3.7154 0.1715 -3.9653 -3.7427 -3.4554 NA 6
#> (Intercept)-sp4 -3.9140 0.4204 -4.6944 -3.9665 -3.1738 NA 5
#> (Intercept)-sp5 -4.0496 0.4544 -5.0515 -3.9128 -3.5246 NA 2
#> (Intercept)-sp6 -1.5687 0.1534 -1.8541 -1.5242 -1.2857 NA 10
#> (Intercept)-sp7 -3.6643 0.1920 -3.9911 -3.6769 -3.3250 NA 18
#> occ.cov.1-sp1 0.2988 0.2143 -0.0450 0.2927 0.7274 NA 8
#> occ.cov.1-sp2 0.6899 0.1399 0.4790 0.6570 0.9067 NA 7
#> occ.cov.1-sp3 0.4741 0.2077 0.1325 0.4913 0.8821 NA 9
#> occ.cov.1-sp4 0.4144 0.2473 -0.1117 0.3948 0.7881 NA 6
#> occ.cov.1-sp5 0.7242 0.1538 0.4686 0.7483 1.0121 NA 9
#> occ.cov.1-sp6 0.2707 0.1579 0.0590 0.2451 0.6280 NA 25
#> occ.cov.1-sp7 1.0389 0.2315 0.6829 1.0442 1.3969 NA 10
#> occ.cov.2-sp1 1.4815 0.3057 0.9415 1.4683 1.9314 NA 7
#> occ.cov.2-sp2 1.9986 0.2007 1.6724 2.0005 2.2729 NA 8
#> occ.cov.2-sp3 -0.8551 0.2283 -1.3189 -0.7893 -0.5885 NA 12
#> occ.cov.2-sp4 -0.5746 0.2389 -0.9716 -0.5406 -0.1571 NA 6
#> occ.cov.2-sp5 0.2369 0.2722 -0.1545 0.2368 0.7573 NA 7
#> occ.cov.2-sp6 1.1040 0.2584 0.6742 1.0424 1.5418 NA 4
#> occ.cov.2-sp7 1.3127 0.1641 1.0320 1.2928 1.6200 NA 25
#>
#> Detection (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept)-sp1 1.1430 0.3167 0.7416 1.1106 1.7837 NA 7
#> (Intercept)-sp2 -0.6973 0.1835 -0.9866 -0.6973 -0.4225 NA 25
#> (Intercept)-sp3 -0.0727 0.6205 -1.1066 0.0468 0.9218 NA 10
#> (Intercept)-sp4 1.0372 0.2785 0.5775 1.0518 1.4554 NA 24
#> (Intercept)-sp5 0.1780 0.4290 -0.5290 0.1156 0.7938 NA 8
#> (Intercept)-sp6 0.7014 0.2098 0.3254 0.7455 1.0377 NA 13
#> (Intercept)-sp7 -0.1688 0.2062 -0.5102 -0.1566 0.2150 NA 25
#> det.cov.1-sp1 1.2992 0.2372 0.9180 1.2845 1.6283 NA 23
#> det.cov.1-sp2 0.7861 0.2222 0.4663 0.7384 1.2467 NA 12
#> det.cov.1-sp3 0.5501 0.3393 0.0222 0.5553 1.2552 NA 9
#> det.cov.1-sp4 0.9494 0.2538 0.3487 0.9751 1.2814 NA 25
#> det.cov.1-sp5 1.1180 0.2281 0.7610 1.0572 1.4752 NA 37
#> det.cov.1-sp6 0.8690 0.1400 0.6828 0.9231 1.1108 NA 25
#> det.cov.1-sp7 1.2592 0.2326 0.8154 1.2885 1.6719 NA 11
#> det.cov.2-sp1 -0.2317 0.3018 -0.8039 -0.1536 0.1451 NA 6
#> det.cov.2-sp2 -0.0249 0.1693 -0.2694 -0.0706 0.2792 NA 11
#> det.cov.2-sp3 -4.9103 1.0032 -6.6918 -4.7640 -3.6317 NA 2
#> det.cov.2-sp4 -1.2373 0.2156 -1.5843 -1.2734 -0.7239 NA 25
#> det.cov.2-sp5 -0.6104 0.3098 -1.2420 -0.6630 -0.1337 NA 23
#> det.cov.2-sp6 -2.2039 0.3036 -2.6720 -2.2527 -1.5771 NA 16
#> det.cov.2-sp7 -1.8536 0.3258 -2.4508 -1.7920 -1.4084 NA 10
#>
#> ----------------------------------------
#> Spatio-temporal Covariance:
#> ----------------------------------------
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> phi-1 28.9223 1.8729 24.1267 29.7315 29.9908 NA 4
#> phi-2 5.7173 2.0652 3.7635 4.5142 9.6623 NA 6
#> phi-3 21.9339 4.5352 12.1110 22.1993 27.6110 NA 10
#> sigma.sq.t-sp1 0.2524 0.2172 0.0706 0.1963 0.8326 NA 25
#> sigma.sq.t-sp2 0.6711 0.4107 0.2043 0.6104 1.6934 NA 7
#> sigma.sq.t-sp3 0.7681 0.6142 0.2540 0.5694 2.3622 NA 9
#> sigma.sq.t-sp4 0.2899 0.1806 0.1179 0.2270 0.7190 NA 25
#> sigma.sq.t-sp5 0.2449 0.4243 0.0547 0.1025 1.4558 NA 9
#> sigma.sq.t-sp6 0.1156 0.1089 0.0170 0.0929 0.4073 NA 9
#> sigma.sq.t-sp7 2.4598 1.9571 0.6719 1.7630 6.7120 NA 6
#> rho-sp1 0.2603 0.1805 -0.0954 0.2812 0.5782 NA 25
#> rho-sp2 0.6910 0.1474 0.3673 0.6996 0.8655 NA 8
#> rho-sp3 0.5476 0.1113 0.3484 0.5630 0.7117 NA 10
#> rho-sp4 0.8175 0.0796 0.6044 0.8213 0.9069 NA 8
#> rho-sp5 0.7929 0.0738 0.6390 0.8051 0.8866 NA 4
#> rho-sp6 0.1137 0.2946 -0.4302 0.2335 0.5088 NA 4
#> rho-sp7 -0.4087 0.3052 -0.7247 -0.5123 0.1953 NA 4
# Predict at new sites for a subset of the n.time.max primary time periods
# Take a look at array of covariates for prediction
str(X.0)
#> num [1:16, 1:10, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
# Subset to only grab time periods 1, 2, and 5
t.cols <- c(1, 2, 5)
X.pred <- X.0[, t.cols, ]
out.pred <- predict(out, X.pred, coords.0, t.cols = t.cols, type = 'occupancy')
#> ----------------------------------------
#> Prediction description
#> ----------------------------------------
#> Spatial Factor NNGP Multi-season, Multi-species Occupancy model with Polya-Gamma latent
#> variable fit with 48 sites and 3 years.
#>
#> Number of covariates 3 (including intercept if specified).
#>
#> Number of spatially-varying covariates 1 (including intercept if specified).
#>
#> Using the exponential spatial correlation model.
#>
#> Using 5 nearest neighbors.
#> Using 3 latent spatial factors.
#>
#> Number of MCMC samples 25.
#>
#> Predicting at 16 non-sampled locations.
#>
#>
#> Source compiled with OpenMP support and model fit using 1 threads.
#> -------------------------------------------------
#> Predicting
#> -------------------------------------------------
#> Location: 16 of 16, 100.00%
#> Generating latent occupancy state
str(out.pred)
#> List of 6
#> $ z.0.samples : num [1:25, 1:7, 1:16, 1:3] 0 0 0 0 0 1 1 0 0 0 ...
#> $ w.0.samples : num [1:25, 1:3, 1:16] 1.008 -0.122 -1.478 0.992 -1.274 ...
#> $ psi.0.samples: num [1:25, 1:7, 1:16, 1:3] 0.05567 0.00589 0.00324 0.03192 0.00356 ...
#> $ run.time : 'proc_time' Named num [1:5] 0 0.01 0.003 0 0
#> ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
#> $ call : language predict.svcTMsPGOcc(object = object, X.0 = X.0, coords.0 = coords.0, t.cols = t.cols, n.omp.threads = n.omp.| __truncated__ ...
#> $ object.class : chr "stMsPGOcc"
#> - attr(*, "class")= chr "predict.svcTMsPGOcc"
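A derived quantity such as species richness can be computed from z.0.samples by summing over the species dimension within each posterior predictive sample. The sketch below is not part of the original example: it uses a mock binary array with the same dimensions as the str() output above, so it runs without fitting the model, and the object names rich.samples, rich.mean, and rich.sd are hypothetical.

```r
set.seed(11)
# Mock stand-in for out.pred$z.0.samples: sample x species x site x time period
z.0.samples <- array(rbinom(25 * 7 * 16 * 3, size = 1, prob = 0.3),
                     dim = c(25, 7, 16, 3))
# Richness per posterior sample, site, and time period (sum over species)
rich.samples <- apply(z.0.samples, c(1, 3, 4), sum)
# Posterior mean and standard deviation of richness at each site and time period
rich.mean <- apply(rich.samples, c(2, 3), mean)
rich.sd <- apply(rich.samples, c(2, 3), sd)
dim(rich.samples)
dim(rich.mean)
```

With a fitted model, replacing the mock array with out.pred$z.0.samples should give the corresponding richness summaries for the 16 prediction sites and 3 selected time periods.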