Function for prediction at new locations for multi-season single-species spatially-varying coefficient integrated occupancy models
predict.svcTIntPGOcc.Rd
The function predict
collects posterior predictive samples for a set of new locations given an object of class `svcTIntPGOcc`. Detection prediction is not currently supported. Predictions are currently only possible for sampled primary time periods.
Usage
# S3 method for svcTIntPGOcc
predict(object, X.0, coords.0, t.cols, n.omp.threads = 1,
verbose = TRUE, n.report = 100,
ignore.RE = FALSE, type = 'occupancy', forecast = FALSE, ...)
Arguments
- object
an object of class svcTIntPGOcc
- X.0
the design matrix of covariates at the prediction locations. This should be a three-dimensional array, with dimensions corresponding to site, primary time period, and covariate, respectively. Note that the first covariate should consist of all 1s for the intercept if an intercept is included in the model. If random effects are included in the occupancy (or detection if
type = 'detection'
) portion of the model, the levels of the random effects at the new locations/time periods should be included as an element of the three-dimensional array. The ordering of the levels should match the ordering used to fit the data insvcTIntPGOcc
. The covariates should be organized in the same order as they were specified in the corresponding formula argument ofsvcTIntPGOcc
. Names of the third dimension (covariates) of any random effects in X.0 must match the name of the random effects used to fit the model, if specified in the corresponding formula argument ofsvcTIntPGOcc
. See example below.- coords.0
the spatial coordinates corresponding to
X.0
. Note thatspOccupancy
assumes coordinates are specified in a projected coordinate system.- t.cols
an indexing vector used to denote which primary time periods are contained in the design matrix of covariates at the prediction locations (
X.0
). The values should denote the specific primary time periods used to fit the model. The values should indicate the columns indata$y
used to fit the model for which prediction is desired. See example below. Not required whenforecast = TRUE
.- n.omp.threads
a positive integer indicating the number of threads to use for SMP parallel processing. The package must be compiled for OpenMP support. For most Intel-based machines, we recommend setting
n.omp.threads
up to the number of hyperthreaded cores. Note,n.omp.threads
> 1 might not work on some systems.- verbose
if
TRUE
, model specification and progress of the sampler is printed to the screen. Otherwise, nothing is printed to the screen.- ignore.RE
logical value that specifies whether or not to remove random unstructured occurrence (or detection if
type = 'detection'
) effects from the subsequent predictions. IfTRUE
, random effects will be included. IfFALSE
, unstructured random effects will be set to 0 and predictions will only be generated from the fixed effects, the spatial random effects, and AR(1) random effects if the model was fit withar1 = TRUE
.- n.report
the interval to report sampling progress.
- type
a quoted keyword indicating what type of prediction to produce. Valid keywords are 'occupancy' to predict latent occupancy probability and latent occupancy values (this is the default), or 'detection' to predict detection probability given new values of detection covariates. Detection prediction is not currently supported for integrated models.
- forecast
a logical value indicating whether prediction is occurring at non-sampled primary time periods (e.g., forecasting).
- ...
currently no additional arguments
Note
When ignore.RE = FALSE
, both sampled levels and non-sampled levels of unstructured random effects are supported for prediction. For sampled levels, the posterior distribution for the random intercept corresponding to that level of the random effect will be used in the prediction. For non-sampled levels, random values are drawn from a normal distribution using the posterior samples of the random effect variance, which results in fully propagated uncertainty in predictions with models that incorporate random effects.
Occurrence predictions at sites that are only sampled for a subset of the total number of primary time periods are obtained directly when fitting the model. See the psi.samples
and z.samples
portions of the output list from the model object of class svcTIntPGOcc
.
Author
Jeffrey W. Doser doserjef@msu.edu,
Andrew O. Finley finleya@msu.edu
Value
A list object of class predict.svcTIntPGOcc
. When type = 'occupancy'
, the list consists of:
- psi.0.samples
a three-dimensional object of posterior predictive samples for the latent occupancy probability values with dimensions corresponding to posterior predictive sample, site, and primary time period.
- z.0.samples
a three-dimensional object of posterior predictive samples for the latent occupancy values with dimensions corresponding to posterior predictive sample, site, and primary time period.
- w.0.samples
a three-dimensional array of posterior predictive samples for the spatial random effects, with dimensions corresponding to MCMC iteration, coefficient, and site.
When type = 'detection'
, the list consists of:
- p.0.samples
a three-dimensional object of posterior predictive samples for the detection probability values with dimensions corresponding to posterior predictive sample, site, and primary time period.
The return object will include additional objects used for standard extractor functions.
Examples
set.seed(332)
# Simulate Data -----------------------------------------------------------
# Number of locations in each direction. This is the total region of interest
# where some sites may or may not have a data source.
J.x <- 15
J.y <- 15
J.all <- J.x * J.y
# Number of data sources.
n.data <- 3
# Sites for each data source.
J.obs <- sample(ceiling(0.2 * J.all):ceiling(0.4 * J.all), n.data, replace = TRUE)
# Maximum number of years for each data set
n.time.max <- c(4, 8, 10)
# Number of years each site in each data set is sampled
n.time <- list()
for (i in 1:n.data) {
n.time[[i]] <- sample(1:n.time.max[i], J.obs[i], replace = TRUE)
}
# Replicates for each data source.
n.rep <- list()
for (i in 1:n.data) {
n.rep[[i]] <- matrix(NA, J.obs[i], n.time.max[i])
for (j in 1:J.obs[i]) {
n.rep[[i]][j, sample(1:n.time.max[i], n.time[[i]][j], replace = FALSE)] <-
sample(1:4, n.time[[i]][j], replace = TRUE)
}
}
# Total number of years across all data sets
n.time.total <- 10
# List denoting the specific years each data set was sampled during.
data.seasons <- list()
for (i in 1:n.data) {
data.seasons[[i]] <- sort(sample(1:n.time.total, n.time.max[i], replace = FALSE))
}
# Occupancy covariates
beta <- c(0, 0.4, 0.3)
trend <- TRUE
# Random occupancy covariates
psi.RE <- list()
p.occ <- length(beta)
# Detection covariates
alpha <- list()
alpha[[1]] <- c(0, 0.2, -0.5)
alpha[[2]] <- c(-1, 0.5, 0.3, -0.8)
alpha[[3]] <- c(-0.5, 1)
p.RE <- list()
p.det.long <- sapply(alpha, length)
p.det <- sum(p.det.long)
# Spatial parameters
svc.cols <- c(1, 2)
sigma.sq <- c(0.9, 0.5)
phi <- c(3 / .5, 3 / .8)
# Simulate occupancy data.
dat <- simTIntOcc(n.data = n.data, J.x = J.x, J.y = J.y, J.obs = J.obs,
n.time = n.time, data.seasons = data.seasons, n.rep = n.rep,
beta = beta, alpha = alpha, trend = trend,
psi.RE = psi.RE, p.RE = p.RE, sp = TRUE,
sigma.sq = sigma.sq, phi = phi, cov.model = 'exponential',
svc.cols = svc.cols)
y <- dat$y
X <- dat$X.obs
X.p <- dat$X.p
sites <- dat$sites
coords <- dat$coords.obs
# Package all data into a list
occ.covs <- list(trend = X[, , 2],
occ.cov.1 = X[, , 3])
det.covs <- list()
# Add covariates one by one
det.covs[[1]] <- list(det.cov.1.1 = X.p[[1]][, , , 2],
det.cov.1.2 = X.p[[1]][, , , 3])
det.covs[[2]] <- list(det.cov.2.1 = X.p[[2]][, , , 2],
det.cov.2.2 = X.p[[2]][, , , 3],
det.cov.2.3 = X.p[[2]][, , , 4])
det.covs[[3]] <- list(det.cov.3.1 = X.p[[3]][, , , 2])
data.list <- list(y = y, occ.covs = occ.covs, det.covs = det.covs,
sites = sites, seasons = data.seasons, coords = coords)
# Testing
occ.formula <- ~ trend + occ.cov.1
# Note that the names are not necessary.
det.formula <- list(f.1 = ~ det.cov.1.1 + det.cov.1.2,
f.2 = ~ det.cov.2.1 + det.cov.2.2 + det.cov.2.3,
f.3 = ~ det.cov.3.1)
# NOTE: this is a short run of the model, in reality we would run the
# model for much longer.
out <- svcTIntPGOcc(occ.formula = occ.formula,
det.formula = det.formula,
data = data.list,
NNGP = TRUE,
n.neighbors = 15,
cov.model = 'exponential',
n.batch = 3,
svc.cols = c(1, 2),
batch.length = 25,
n.report = 1,
n.burn = 25,
n.thin = 1,
n.chains = 1)
#> ----------------------------------------
#> Preparing the data
#> ----------------------------------------
#> No prior specified for beta.normal.
#> Setting prior mean to 0 and prior variance to 2.72
#> No prior specified for alpha.normal.
#> Setting prior mean to 0 and prior variance to 2.72
#> No prior specified for phi.unif.
#> Setting uniform bounds based on the range of observed spatial coordinates.
#> No prior specified for sigma.sq.
#> Using an inverse-Gamma prior with the shape parameter set to 2 and scale parameter to 1.
#> z is not specified in initial values.
#> Setting initial values based on observed data
#> beta is not specified in initial values.
#> Setting initial values to random values from the prior distribution
#> alpha is not specified in initial values.
#> Setting initial values to random values from the prior distribution
#> phi is not specified in initial values.
#> Setting initial value to random values from the prior distribution
#> sigma.sq is not specified in initial values.
#> Setting initial values to random values from the prior distribution
#> ----------------------------------------
#> Building the neighbor list
#> ----------------------------------------
#> ----------------------------------------
#> Building the neighbors of neighbors list
#> ----------------------------------------
#> ----------------------------------------
#> Model description
#> ----------------------------------------
#> SVC integrated Multi-season Occupancy Model with Polya-Gamma latent
#> variable fit with 157 sites and 10 primary time periods.
#>
#> Integrating 3 occupancy data sets.
#>
#> Samples per chain: 75 (3 batches of length 25)
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 50
#>
#> Number of spatially-varying coefficients: 2
#> Using the exponential spatial correlation model.
#>
#> Using 15 nearest neighbors.
#>
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#>
#> Adaptive Metropolis with target acceptance rate: 43.0
#> ----------------------------------------
#> Chain 1
#> ----------------------------------------
#> Sampling ...
#> Batch: 1 of 3, 33.33%
#> Coefficient Parameter Acceptance Tuning
#> 1 phi 40.0 1.00000
#> 2 phi 32.0 1.00000
#> -------------------------------------------------
#> Batch: 2 of 3, 66.67%
#> Coefficient Parameter Acceptance Tuning
#> 1 phi 36.0 0.99005
#> 2 phi 52.0 1.01005
#> -------------------------------------------------
#> Batch: 3 of 3, 100.00%
summary(out)
#>
#> Call:
#> svcTIntPGOcc(occ.formula = occ.formula, det.formula = det.formula,
#> data = data.list, svc.cols = c(1, 2), cov.model = "exponential",
#> NNGP = TRUE, n.neighbors = 15, n.batch = 3, batch.length = 25,
#> n.report = 1, n.burn = 25, n.thin = 1, n.chains = 1)
#>
#> Samples per Chain: 75
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 50
#> Run Time (min): 0.0039
#>
#> ----------------------------------------
#> Occurrence
#> ----------------------------------------
#> Fixed Effects (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -0.0429 0.2538 -0.4815 -0.0613 0.4483 NA 4
#> trend 0.4938 0.1359 0.2201 0.5064 0.7093 NA 8
#> occ.cov.1 0.2324 0.1167 0.0528 0.2259 0.4544 NA 48
#>
#> ----------------------------------------
#> Data source 1 Detection
#> ----------------------------------------
#> Fixed Effects (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -0.1392 0.2251 -0.5977 -0.1739 0.2309 NA 13
#> det.cov.1.1 0.3983 0.1938 0.0094 0.3933 0.6942 NA 23
#> det.cov.1.2 -0.8545 0.2200 -1.2473 -0.8312 -0.5673 NA 15
#>
#> ----------------------------------------
#> Data source 2 Detection
#> ----------------------------------------
#> Fixed Effects (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -1.1365 0.1334 -1.3822 -1.1356 -0.8942 NA 17
#> det.cov.2.1 0.1898 0.1312 -0.0596 0.2002 0.3886 NA 27
#> det.cov.2.2 0.3216 0.1021 0.1768 0.3170 0.5250 NA 50
#> det.cov.2.3 -0.8731 0.1205 -1.1182 -0.8621 -0.6823 NA 28
#>
#> ----------------------------------------
#> Data source 3 Detection
#> ----------------------------------------
#> Fixed Effects (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -0.3283 0.1592 -0.6741 -0.2901 -0.0666 NA 21
#> det.cov.3.1 0.5210 0.1443 0.2993 0.5005 0.8261 NA 30
#>
#> ----------------------------------------
#> Occurrence Spatio-temporal Covariance:
#> ----------------------------------------
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> sigma.sq-(Intercept) 0.9989 0.3041 0.6140 0.9050 1.6908 NA 4
#> sigma.sq-trend 0.7223 0.1639 0.4670 0.7128 1.0550 NA 6
#> phi-(Intercept) 18.0416 7.4030 6.2867 20.1484 32.2642 NA 3
#> phi-trend 18.5866 3.8904 11.5393 19.7140 26.2876 NA 6
t.cols <- 1:n.time.total
out.pred <- predict(out, X.0 = dat$X.pred, coords.0 = dat$coords.pred,
t.cols = t.cols, type = 'occupancy')
#> ----------------------------------------
#> Prediction description
#> ----------------------------------------
#> Spatial NNGP Multi-season Occupancy model with Polya-Gamma latent
#> variable fit with 157 observations and 10 years.
#>
#> Number of fixed covariates 3 (including intercept if specified).
#>
#> Using the exponential spatial correlation model.
#>
#> Using 15 nearest neighbors.
#>
#> Number of MCMC samples 50.
#>
#> Predicting at 68 non-sampled locations.
#>
#>
#> Source compiled with OpenMP support and model fit using 1 threads.
#> -------------------------------------------------
#> Predicting
#> -------------------------------------------------
#> Location: 68 of 68, 100.00%
#> Generating latent occupancy state
str(out.pred)
#> List of 6
#> $ z.0.samples : num [1:50, 1:68, 1:10] 1 0 0 0 1 1 0 0 0 0 ...
#> $ w.0.samples : num [1:50, 1:2, 1:68] 0.478 -1.5288 -0.0937 0.798 0.7443 ...
#> $ psi.0.samples: num [1:50, 1:68, 1:10] 0.5192 0.0428 0.2566 0.5082 0.5517 ...
#> $ run.time : 'proc_time' Named num [1:5] 0.06 0.144 0.03 0 0
#> ..- attr(*, "names")= chr [1:5] "user.self" "sys.self" "elapsed" "user.child" ...
#> $ call : language predict.svcTPGOcc(object = object, X.0 = X.0, coords.0 = coords.0, t.cols = t.cols, n.omp.threads = n.omp.th| __truncated__ ...
#> $ object.class : chr "svcTIntPGOcc"
#> - attr(*, "class")= chr "predict.svcTIntPGOcc"