Function for prediction at new locations for multi-season multi-species occupancy models
predict.tMsPGOcc.Rd
The function `predict` collects posterior predictive samples for a set of new locations given an object of class `tMsPGOcc`. Prediction is possible for both the latent occupancy state and detection. Predictions are currently only possible for sampled primary time periods.
Usage
# S3 method for tMsPGOcc
predict(object, X.0, t.cols, ignore.RE = FALSE, type = 'occupancy', ...)
Arguments
- object
an object of class tMsPGOcc
- X.0
the design matrix of covariates at the prediction locations. This should be a three-dimensional array, with dimensions corresponding to site, primary time period, and covariate, respectively. Note that the first covariate should consist of all 1s for the intercept if an intercept is included in the model. If random effects are included in the occupancy (or detection if type = 'detection') portion of the model, the levels of the random effects at the new locations/time periods should be included as an element of the three-dimensional array. The ordering of the levels should match the ordering used to fit the data in tMsPGOcc. The covariates should be organized in the same order as they were specified in the corresponding formula argument of tMsPGOcc. Names of the third dimension (covariates) of any random effects in X.0 must match the names of the random effects used to fit the model, if specified in the corresponding formula argument of tMsPGOcc. See example below.
- t.cols
an indexing vector used to denote which primary time periods are contained in the design matrix of covariates at the prediction locations (X.0). The values should denote the specific primary time periods used to fit the model, given as the columns in data$y for which prediction is desired. See example below.
- ignore.RE
logical value that specifies whether or not to remove unstructured random occurrence (or detection if type = 'detection') effects from the subsequent predictions. If FALSE, unstructured random effects will be included. If TRUE, unstructured random effects will be set to 0 and predictions will be generated only from the fixed effects and, if the model was fit with ar1 = TRUE, the AR(1) random effects.
- type
a quoted keyword indicating what type of prediction to produce. Valid keywords are 'occupancy' to predict latent occupancy probability and latent occupancy values (this is the default), or 'detection' to predict detection probability given new values of detection covariates.
- ...
currently no additional arguments
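As a rough illustration of the layout described for X.0 and t.cols, the sketch below assembles a prediction array in base R. The sizes, the covariate names (occ.cov.1, site.effect), and the simulated values are all hypothetical; it assumes a model fit with an intercept, one fixed covariate, and one unstructured random intercept named site.effect.

```r
# Hypothetical setup: 5 prediction sites, predictions wanted for
# primary time periods 1, 2, and 5 of the fitted data.
set.seed(1)
J.0 <- 5
t.cols <- c(1, 2, 5)

# Dimensions: site x primary time period x covariate.
# The third dimension is named so that the random effect can be matched
# by name (names here are assumptions for illustration only).
X.0 <- array(NA, dim = c(J.0, length(t.cols), 3),
             dimnames = list(NULL, NULL,
                             c("(Intercept)", "occ.cov.1", "site.effect")))

# First covariate: all 1s for the intercept
X.0[, , "(Intercept)"] <- 1
# Second covariate: simulated values standing in for real covariate data
X.0[, , "occ.cov.1"] <- rnorm(J.0 * length(t.cols))
# Random effect levels: one level per site, repeated across time periods
site.levels <- sample(1:8, J.0, replace = TRUE)
X.0[, , "site.effect"] <- matrix(site.levels, J.0, length(t.cols))

str(X.0)
```

With a fitted model object, an array built this way would be passed together with t.cols, so that the second dimension of X.0 has one slice per entry of t.cols.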
Note
When ignore.RE = FALSE, both sampled levels and non-sampled levels of unstructured random effects are supported for prediction. For sampled levels, the posterior distribution for the random intercept corresponding to that level of the random effect will be used in the prediction. For non-sampled levels, random values are drawn from a normal distribution using the posterior samples of the random effect variance, which results in fully propagated uncertainty in predictions with models that incorporate random effects.
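The treatment of non-sampled levels can be sketched in base R. This is not the package's internal code: the posterior samples below are simulated stand-ins, the object names are hypothetical, and a zero-mean normal distribution is assumed for the random intercept.

```r
set.seed(42)
n.post <- 1000

# Stand-in for posterior samples of a random effect variance
# (hypothetical values, not output from a fitted model)
sigma.sq.samples <- 1 / rgamma(n.post, shape = 5, rate = 4)

# Stand-in for posterior samples of the fixed-effect part of the
# linear predictor at one new site (logit scale)
fixed.part <- rnorm(n.post, mean = -0.5, sd = 0.2)

# Non-sampled level: draw one random effect value per posterior sample,
# using that sample's variance (zero mean is an assumption here)
re.new <- rnorm(n.post, mean = 0, sd = sqrt(sigma.sq.samples))

# Occupancy probability with and without the random effect
psi.with.re <- plogis(fixed.part + re.new)
psi.no.re <- plogis(fixed.part)

# Including the random effect widens the predictive distribution
c(with.re = sd(psi.with.re), without.re = sd(psi.no.re))
```

Drawing a new value for every posterior sample, rather than plugging in a single variance estimate, is what carries the uncertainty in the variance through to the predictions.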
Occurrence predictions at sites that are only sampled for a subset of the total number of primary time periods are obtained directly when fitting the model. See the psi.samples and z.samples portions of the output list from the model object of class tMsPGOcc.
Author
Jeffrey W. Doser doserjef@msu.edu
Value
A list object of class predict.tMsPGOcc. When type = 'occupancy', the list consists of:
- psi.0.samples
a four-dimensional object of posterior predictive samples for the latent occupancy probability values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
- z.0.samples
a four-dimensional object of posterior predictive samples for the latent occupancy values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
When type = 'detection', the list consists of:
- p.0.samples
a four-dimensional object of posterior predictive samples for the detection probability values with dimensions corresponding to posterior predictive sample, species, site, and primary time period.
The return object will include additional objects used for standard extractor functions.
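Because the returned samples are four-dimensional, they are usually summarized over the first (posterior sample) dimension. The sketch below shows one way to do this in base R. It uses simulated arrays as stand-ins for psi.0.samples and z.0.samples so that it runs on its own; the sizes and the derived richness summary are illustrative assumptions, not part of the function's output.

```r
set.seed(10)
n.post <- 100  # posterior predictive samples
N <- 7         # species
J.0 <- 16      # prediction sites
n.t <- 3       # primary time periods

# Simulated stand-ins with the documented dimension order:
# sample x species x site x primary time period
psi.0.samples <- array(runif(n.post * N * J.0 * n.t),
                       dim = c(n.post, N, J.0, n.t))
z.0.samples <- array(rbinom(n.post * N * J.0 * n.t, 1, 0.3),
                     dim = c(n.post, N, J.0, n.t))

# Posterior mean occupancy probability by species, site, and time period
psi.0.mean <- apply(psi.0.samples, c(2, 3, 4), mean)

# 95% credible interval; the first dimension holds the two quantiles
psi.0.ci <- apply(psi.0.samples, c(2, 3, 4), quantile,
                  probs = c(0.025, 0.975))

# Species richness per sample, site, and time period:
# sum the latent occupancy values over the species dimension
rich.samples <- apply(z.0.samples, c(1, 3, 4), sum)
rich.mean <- apply(rich.samples, c(2, 3), mean)

dim(psi.0.mean)
dim(psi.0.ci)
dim(rich.mean)
```

With a real prediction object, the stand-in arrays would be replaced by the corresponding list elements, for example out.pred$psi.0.samples.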
Examples
# Simulate Data -----------------------------------------------------------
set.seed(500)
J.x <- 8
J.y <- 8
J <- J.x * J.y
# Years sampled
n.time <- sample(3:10, J, replace = TRUE)
# n.time <- rep(10, J)
n.time.max <- max(n.time)
# Replicates
n.rep <- matrix(NA, J, max(n.time))
for (j in 1:J) {
  n.rep[j, 1:n.time[j]] <- sample(2:4, n.time[j], replace = TRUE)
  # n.rep[j, 1:n.time[j]] <- rep(4, n.time[j])
}
N <- 7
# Community-level covariate effects
# Occurrence
beta.mean <- c(-3, -0.2, 0.5)
trend <- FALSE
sp.only <- 0
p.occ <- length(beta.mean)
tau.sq.beta <- c(0.6, 1.5, 1.4)
# Detection
alpha.mean <- c(0, 1.2, -1.5)
tau.sq.alpha <- c(1, 0.5, 2.3)
p.det <- length(alpha.mean)
# Random effects
psi.RE <- list()
p.RE <- list()
# Draw species-level effects from community means.
beta <- matrix(NA, nrow = N, ncol = p.occ)
alpha <- matrix(NA, nrow = N, ncol = p.det)
for (i in 1:p.occ) {
  beta[, i] <- rnorm(N, beta.mean[i], sqrt(tau.sq.beta[i]))
}
for (i in 1:p.det) {
  alpha[, i] <- rnorm(N, alpha.mean[i], sqrt(tau.sq.alpha[i]))
}
sp <- FALSE
dat <- simTMsOcc(J.x = J.x, J.y = J.y, n.time = n.time, n.rep = n.rep, N = N,
                 beta = beta, alpha = alpha, sp.only = sp.only, trend = trend,
                 psi.RE = psi.RE, p.RE = p.RE, sp = sp)
# Subset data for prediction
pred.indx <- sample(1:J, round(J * .25), replace = FALSE)
y <- dat$y[, -pred.indx, , , drop = FALSE]
# Occupancy covariates
X <- dat$X[-pred.indx, , , drop = FALSE]
# Prediction covariates
X.0 <- dat$X[pred.indx, , , drop = FALSE]
# Detection covariates
X.p <- dat$X.p[-pred.indx, , , , drop = FALSE]
occ.covs <- list(occ.cov.1 = X[, , 2],
                 occ.cov.2 = X[, , 3])
det.covs <- list(det.cov.1 = X.p[, , , 2],
                 det.cov.2 = X.p[, , , 3])
data.list <- list(y = y, occ.covs = occ.covs,
                  det.covs = det.covs)
# Priors
prior.list <- list(beta.comm.normal = list(mean = 0, var = 2.72),
                   alpha.comm.normal = list(mean = 0, var = 2.72),
                   tau.sq.beta.ig = list(a = 0.1, b = 0.1),
                   tau.sq.alpha.ig = list(a = 0.1, b = 0.1))
z.init <- apply(y, c(1, 2, 3), function(a) as.numeric(sum(a, na.rm = TRUE) > 0))
inits.list <- list(alpha.comm = 0, beta.comm = 0, beta = 0,
                   alpha = 0, tau.sq.beta = 1, tau.sq.alpha = 1,
                   z = z.init)
# Tuning
tuning.list <- list(phi = 1)
# Number of batches
n.batch <- 5
# Batch length
batch.length <- 25
n.burn <- 25
n.thin <- 1
n.samples <- n.batch * batch.length
# Note that this is just a test case and more iterations/chains may need to
# be run to ensure convergence.
out <- tMsPGOcc(occ.formula = ~ occ.cov.1 + occ.cov.2,
                det.formula = ~ det.cov.1 + det.cov.2,
                data = data.list,
                inits = inits.list,
                n.batch = n.batch,
                batch.length = batch.length,
                accept.rate = 0.43,
                priors = prior.list,
                n.omp.threads = 1,
                verbose = TRUE,
                n.report = 1,
                n.burn = n.burn,
                n.thin = n.thin,
                n.chains = 1)
#> ----------------------------------------
#> Preparing to run the model
#> ----------------------------------------
#> ----------------------------------------
#> Model description
#> ----------------------------------------
#> Multi-season Multi-species Occupancy Model with Polya-Gamma latent
#> variables with 48 sites, 7 species, and 10 primary time periods.
#>
#> Samples per chain: 125 (5 batches of length 25)
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 100
#>
#> Source compiled with OpenMP support and model fit using 1 thread(s).
#>
#> Adaptive Metropolis with target acceptance rate: 43.0
#> ----------------------------------------
#> Chain 1
#> ----------------------------------------
#> Sampling ...
#> Batch: 1 of 5, 20.00%
#> -------------------------------------------------
#> Batch: 2 of 5, 40.00%
#> -------------------------------------------------
#> Batch: 3 of 5, 60.00%
#> -------------------------------------------------
#> Batch: 4 of 5, 80.00%
#> -------------------------------------------------
#> Batch: 5 of 5, 100.00%
summary(out)
#>
#> Call:
#> tMsPGOcc(occ.formula = ~occ.cov.1 + occ.cov.2, det.formula = ~det.cov.1 +
#> det.cov.2, data = data.list, inits = inits.list, priors = prior.list,
#> n.batch = n.batch, batch.length = batch.length, accept.rate = 0.43,
#> n.omp.threads = 1, verbose = TRUE, n.report = 1, n.burn = n.burn,
#> n.thin = n.thin, n.chains = 1)
#>
#> Samples per Chain: 125
#> Burn-in: 25
#> Thinning Rate: 1
#> Number of Chains: 1
#> Total Posterior Samples: 100
#> Run Time (min): 0.0029
#>
#> ----------------------------------------
#> Community Level
#> ----------------------------------------
#> Occurrence Means (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) -3.3109 0.2307 -3.8180 -3.3223 -2.9422 NA 48
#> occ.cov.1 0.2280 0.2316 -0.2252 0.2458 0.5956 NA 100
#> occ.cov.2 0.8558 0.4307 -0.0777 0.8530 1.7492 NA 100
#>
#> Occurrence Variances (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.2768 0.2451 0.0441 0.1847 0.9448 NA 9
#> occ.cov.1 0.2956 0.3391 0.0437 0.2053 1.0132 NA 56
#> occ.cov.2 1.7869 1.6848 0.3878 1.2085 6.4331 NA 47
#>
#> Detection Means (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.2090 0.2816 -0.2615 0.1777 0.7288 NA 47
#> det.cov.1 1.0884 0.2611 0.5824 1.0923 1.5399 NA 26
#> det.cov.2 -1.5798 0.8959 -3.3074 -1.7259 0.3233 NA 153
#>
#> Detection Variances (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept) 0.3323 0.2590 0.0622 0.2508 1.0260 NA 37
#> det.cov.1 0.3643 0.2824 0.0709 0.2915 0.9383 NA 29
#> det.cov.2 7.9616 5.6275 1.3276 6.2532 20.4548 NA 44
#>
#> ----------------------------------------
#> Species Level
#> ----------------------------------------
#> Occurrence (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept)-sp1 -3.4872 0.3468 -4.3103 -3.4762 -2.9194 NA 8
#> (Intercept)-sp2 -3.3259 0.2072 -3.7166 -3.3494 -2.8952 NA 30
#> (Intercept)-sp3 -3.4532 0.3561 -4.2085 -3.4772 -2.8463 NA 11
#> (Intercept)-sp4 -3.0669 0.2756 -3.5864 -3.0640 -2.5198 NA 14
#> (Intercept)-sp5 -3.0966 0.1938 -3.4963 -3.1097 -2.7534 NA 23
#> (Intercept)-sp6 -3.1886 0.2431 -3.6194 -3.1857 -2.7659 NA 20
#> (Intercept)-sp7 -4.0226 0.4035 -4.8233 -3.9927 -3.2591 NA 15
#> occ.cov.1-sp1 0.2603 0.1873 -0.1026 0.2623 0.5418 NA 20
#> occ.cov.1-sp2 -0.0771 0.2998 -0.7013 -0.0438 0.4383 NA 15
#> occ.cov.1-sp3 -0.2764 0.2282 -0.7217 -0.2580 0.1789 NA 18
#> occ.cov.1-sp4 -0.0571 0.2322 -0.5874 -0.0227 0.3321 NA 21
#> occ.cov.1-sp5 0.5879 0.2963 0.1452 0.5467 1.2141 NA 13
#> occ.cov.1-sp6 0.3897 0.2277 -0.0345 0.3714 0.8632 NA 21
#> occ.cov.1-sp7 0.6312 0.2912 0.0488 0.6563 1.1331 NA 22
#> occ.cov.2-sp1 1.0166 0.3295 0.3946 0.9706 1.6807 NA 6
#> occ.cov.2-sp2 1.5551 0.2962 1.0100 1.5765 2.0808 NA 14
#> occ.cov.2-sp3 -0.5533 0.2781 -1.1318 -0.5563 -0.0820 NA 20
#> occ.cov.2-sp4 -0.4854 0.2788 -1.0658 -0.4691 0.0265 NA 16
#> occ.cov.2-sp5 0.6401 0.2576 0.2398 0.6128 1.1507 NA 21
#> occ.cov.2-sp6 1.8028 0.2523 1.3883 1.7978 2.3222 NA 29
#> occ.cov.2-sp7 2.3386 0.3833 1.6353 2.2899 3.1333 NA 12
#>
#> Detection (logit scale):
#> Mean SD 2.5% 50% 97.5% Rhat ESS
#> (Intercept)-sp1 0.4957 0.4097 -0.3369 0.5105 1.2284 NA 38
#> (Intercept)-sp2 -0.1397 0.2827 -0.5827 -0.1852 0.4461 NA 59
#> (Intercept)-sp3 0.1428 0.4509 -0.7268 0.0941 0.9748 NA 46
#> (Intercept)-sp4 0.0227 0.3952 -0.6975 0.0318 0.8089 NA 96
#> (Intercept)-sp5 0.2110 0.3299 -0.2923 0.2036 0.8932 NA 44
#> (Intercept)-sp6 0.7258 0.2554 0.2885 0.7102 1.2129 NA 34
#> (Intercept)-sp7 -0.0750 0.2639 -0.4475 -0.0949 0.4483 NA 38
#> det.cov.1-sp1 1.4091 0.4053 0.7989 1.3594 2.2732 NA 28
#> det.cov.1-sp2 0.9480 0.3342 0.3666 0.9384 1.5320 NA 59
#> det.cov.1-sp3 1.0683 0.4146 0.1544 1.0407 1.7661 NA 33
#> det.cov.1-sp4 0.5424 0.3862 -0.1949 0.5676 1.2654 NA 47
#> det.cov.1-sp5 1.6667 0.4222 0.9868 1.6345 2.4248 NA 22
#> det.cov.1-sp6 0.8096 0.2738 0.3532 0.8220 1.3110 NA 21
#> det.cov.1-sp7 1.2270 0.3169 0.6383 1.2160 1.8338 NA 55
#> det.cov.2-sp1 0.5782 0.6434 -0.5428 0.5266 1.8323 NA 58
#> det.cov.2-sp2 -0.4742 0.3439 -1.1693 -0.4981 0.1707 NA 59
#> det.cov.2-sp3 -6.2777 1.5269 -9.2132 -6.4590 -3.1616 NA 21
#> det.cov.2-sp4 -3.6953 0.8696 -5.3149 -3.5892 -2.1963 NA 21
#> det.cov.2-sp5 -1.0684 0.4689 -2.0652 -1.0497 -0.1402 NA 29
#> det.cov.2-sp6 -1.7423 0.3810 -2.5516 -1.6913 -1.0495 NA 38
#> det.cov.2-sp7 -2.1465 0.4242 -2.9209 -2.1369 -1.4090 NA 35
# Predict at new sites during time periods 1, 2, and 5
# Take a look at array of covariates for prediction
str(X.0)
#> num [1:16, 1:10, 1:3] 1 1 1 1 1 1 1 1 1 1 ...
# Subset to only grab time periods 1, 2, and 5
t.cols <- c(1, 2, 5)
X.pred <- X.0[, t.cols, ]
out.pred <- predict(out, X.pred, t.cols = t.cols, type = 'occupancy')
str(out.pred)
#> List of 3
#> $ psi.0.samples: num [1:100, 1:7, 1:16, 1:3] 0.594 0.412 0.353 0.3 0.422 ...
#> $ z.0.samples : int [1:100, 1:7, 1:16, 1:3] 1 1 1 0 1 0 0 0 0 0 ...
#> $ call : language predict.tMsPGOcc(object = out, X.0 = X.pred, t.cols = t.cols, type = "occupancy")
#> - attr(*, "class")= chr "predict.tMsPGOcc"