Cross-validation methods for fitted unmarked models and fit lists

Test the predictive accuracy of fitted models using several cross-validation approaches. The dataset is split by site into folds, or into training and testing sets, so the encounter history for a given site is never divided between sets.
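The site-level split described above can be sketched in base R. This is only an illustration of the idea, not the package's internal code; the data are simulated, and the names `n_sites`, `n_visits`, `n_folds`, `fold_id`, and `splits` are hypothetical.

```r
# Sketch: assign whole sites (rows) to folds so that no encounter
# history is ever split between training and test data.
set.seed(42)     # reproducible fold assignment
n_sites  <- 23   # hypothetical number of sites
n_visits <- 3    # hypothetical number of visits per site
n_folds  <- 5

# Simulated detection/non-detection matrix: one row per site
y <- matrix(rbinom(n_sites * n_visits, 1, 0.4), nrow = n_sites)

# Shuffle a balanced vector of fold labels, one label per site
fold_id <- sample(rep(seq_len(n_folds), length.out = n_sites))

# Row indices of the training and test sites for each fold
splits <- lapply(seq_len(n_folds), function(k) {
  list(train = which(fold_id != k), test = which(fold_id == k))
})

# The test set for fold 1 contains complete rows (sites) of y
test_1 <- y[splits[[1]]$test, , drop = FALSE]
```

Because the fold labels index rows of the encounter matrix, every site lands in exactly one test set and all of its visits stay together.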

# S4 method for unmarkedFit
crossVal(
  object, method=c("Kfold","holdout","leaveOneOut"),
  folds=10, holdoutPct=0.25, statistic=RMSE_MAE, parallel=FALSE, ncores, ...)
# S4 method for unmarkedFitList
crossVal(
  object, method=c("Kfold","holdout","leaveOneOut"),
  folds=10, holdoutPct=0.25, statistic=RMSE_MAE, parallel=FALSE, ncores, 
  sort = c("none", "increasing", "decreasing"), ...)

Arguments

object	A fitted model inheriting class `unmarkedFit` or a list of fitted models with class `unmarkedFitList`
method	Cross-validation method to use, given as a string. Valid options are `"Kfold"`, `"holdout"`, or `"leaveOneOut"`
folds	Number of folds to use for k-fold cross-validation
holdoutPct	Proportion of the dataset (a value between 0 and 1) to use as the "holdout" or "test" set when using the holdout method
statistic	Function that calculates statistics for each fold. The function must take an `unmarkedFit` object as the first argument and return a named numeric vector with statistic value(s). The default function `RMSE_MAE` returns root-mean-square error and mean absolute error. See `unmarked:::RMSE_MAE` for an example of correct statistic function structure.
parallel	If `TRUE`, run folds in parallel. This may speed up cross-validation if the unmarked model takes a long time to fit, or if you have a large number of sites and are using leave-one-out cross-validation.
ncores	Number of parallel cores to use.
sort	If doing cross-validation on an `unmarkedFitList`, you can optionally sort the resulting table(s) of statistic values for each model. Valid options are `"none"`, `"increasing"`, or `"decreasing"`
...	Other arguments passed to the statistic function.
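As a rough illustration of the structure the `statistic` argument expects, the sketch below defines a custom function that takes a fitted object as its first argument and returns a named numeric vector. The function name `abs_error_summary` and the statistics it reports are hypothetical choices, and it assumes that `residuals()` works on the fitted model.

```r
# Sketch of a custom statistic function (hypothetical name).
# First argument: a fitted model; return value: named numeric vector.
abs_error_summary <- function(object, ...) {
  # Flatten residuals to a plain numeric vector; unlist() covers the
  # case where residuals() returns a list rather than a matrix
  res <- as.numeric(unlist(residuals(object)))
  abs_res <- abs(res)
  c(`Median absolute error` = median(abs_res, na.rm = TRUE),
    `Max absolute error`    = max(abs_res, na.rm = TRUE))
}
```

Given a fitted model `fm`, such a function could then be supplied as `crossVal(fm, method = "Kfold", statistic = abs_error_summary)`.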

Value

An `unmarkedCrossVal` or `unmarkedCrossValList` object containing the calculated statistic values for each fold.

Author

Ken Kellner contact@kenkellner.com

Examples


if (FALSE) {
#Get data
data(frogs)
pferUMF <- unmarkedFrameOccu(pfer.bin)
siteCovs(pferUMF) <- data.frame(sitevar1 = rnorm(numSites(pferUMF)))    
obsCovs(pferUMF) <- data.frame(obsvar1 = rnorm(numSites(pferUMF) * obsNum(pferUMF)))

#Fit occupancy model
fm <- occu(~ obsvar1 ~ 1, pferUMF)

#k-fold cross validation with 10 folds
(kfold = crossVal(fm, method="Kfold", folds=10))

#holdout method with 25% of the dataset held out as the test set
(holdout = crossVal(fm,method='holdout', holdoutPct=0.25))

#Leave-one-out method
(leave = crossVal(fm, method='leaveOneOut'))

#Fit a second model and combine into a fitList
fm2 <- occu(~1 ~1, pferUMF)
fl <- fitList(fm2,fm)

#Cross-validation for all fits in fitList using holdout method
(cvlist <- crossVal(fl, method='holdout'))

}
