These functions represent different cross-validation schemes that can be used with origami. They should be used as options for the fold_fun argument to make_folds, which determines n from its own arguments, calls the requested function with it, and passes any remaining arguments (e.g. V or pvalidation) on to that function.
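As a minimal sketch of this forwarding (the sample size and argument values below are arbitrary choices for illustration), a scheme is selected through fold_fun and its own arguments are supplied as extra named arguments to make_folds:

```r
library(origami)

set.seed(1)  # several schemes assign observations at random

# 5-fold cross-validation over 100 observations; V is forwarded to folds_vfold
vfold <- make_folds(n = 100, fold_fun = folds_vfold, V = 5)
length(vfold)

# Monte Carlo cross-validation; V and pvalidation are forwarded to
# folds_montecarlo
mc <- make_folds(n = 100, fold_fun = folds_montecarlo,
                 V = 20, pvalidation = 0.25)
length(mc)
```

Passing the function itself, rather than calling it directly, lets make_folds work out n before the scheme is applied.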

folds_vfold(n, V = 10L)

folds_resubstitution(n)

folds_loo(n)

folds_montecarlo(n, V = 1000L, pvalidation = 0.2)

folds_bootstrap(n, V = 1000L)

folds_rolling_origin(n, first_window, validation_size, gap = 0L, batch = 1L)

folds_rolling_window(n, window_size, validation_size, gap = 0L, batch = 1L)

folds_rolling_origin_pooled(
  n,
  t,
  id = NULL,
  time = NULL,
  first_window,
  validation_size,
  gap = 0L,
  batch = 1L
)

folds_rolling_window_pooled(
  n,
  t,
  id = NULL,
  time = NULL,
  window_size,
  validation_size,
  gap = 0L,
  batch = 1L
)

folds_vfold_rolling_origin_pooled(
  n,
  t,
  id = NULL,
  time = NULL,
  V = 10L,
  first_window,
  validation_size,
  gap = 0L,
  batch = 1L
)

folds_vfold_rolling_window_pooled(
  n,
  t,
  id = NULL,
  time = NULL,
  V = 10L,
  window_size,
  validation_size,
  gap = 0L,
  batch = 1L
)

Arguments

n

An integer indicating the number of observations.

V

An integer indicating the number of folds.

pvalidation

A numeric indicating the proportion of observations to be placed in the validation fold.

first_window

An integer indicating the number of observations in the first training sample.

validation_size

An integer indicating the number of points in the validation samples; should be equal to the largest forecast horizon.

gap

An integer indicating the number of points between the training and validation samples that are included in neither. The default is zero.

batch

An integer indicating the number of time points added to the training set in each iteration of cross-validation. Larger values reduce the number of folds, which is useful for longer time series. The default is one.

window_size

An integer indicating the number of observations in each training sample.

t

An integer indicating the total amount of time to consider per time-series sample.

id

An optional vector of unique identifiers corresponding to the time vector. These can be used to subset the time vector.

time

An optional integer vector of the time points observed for each subject in the sample.
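How first_window, window_size, validation_size, gap and batch interact can be sketched in base R. The helpers below are hypothetical and are not part of origami. They assume that the gap sits between the end of the training sample and the start of the validation sample, and that the end of the training sample advances by batch points per fold; the package's own implementation may differ in detail.

```r
# Hypothetical helper (not an origami function): rolling-origin indices.
# The training sample always starts at 1 and grows by `batch` points per fold.
rolling_origin_indices <- function(n, first_window, validation_size,
                                   gap = 0L, batch = 1L) {
  # Last training end point that still leaves room for gap + validation points
  last_origin <- n - (validation_size + gap)
  origins <- seq.int(first_window, last_origin, by = batch)
  lapply(origins, function(origin) {
    list(
      training = seq_len(origin),
      validation = origin + gap + seq_len(validation_size)
    )
  })
}

# Hypothetical helper (not an origami function): rolling-window indices.
# The training sample keeps a fixed length of `window_size` and slides forward.
rolling_window_indices <- function(n, window_size, validation_size,
                                   gap = 0L, batch = 1L) {
  last_origin <- n - (validation_size + gap)
  origins <- seq.int(window_size, last_origin, by = batch)
  lapply(origins, function(origin) {
    list(
      training = (origin - window_size) + seq_len(window_size),
      validation = origin + gap + seq_len(validation_size)
    )
  })
}

origin_idx <- rolling_origin_indices(n = 10, first_window = 4,
                                     validation_size = 2, gap = 1, batch = 2)
window_idx <- rolling_window_indices(n = 10, window_size = 4,
                                     validation_size = 2, gap = 1, batch = 2)

# Second fold of each scheme: same validation points, different training span
origin_idx[[2]]
window_idx[[2]]
```

With these values both schemes yield two folds. In the second fold the rolling-origin training sample covers points 1 to 6, the rolling-window training sample covers points 3 to 6, and both validate on points 8 and 9, with point 7 skipped as the gap.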

Value

A list of Folds, one element per fold, each holding the training and validation indices for that fold.
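A short sketch of inspecting the returned list follows. It assumes that each fold stores its indices in elements named training_set and validation_set; treat those element names as an assumption and check them with str() on your installed version.

```r
library(origami)

set.seed(1)
folds <- make_folds(n = 12, fold_fun = folds_vfold, V = 4)

# Each list element describes one fold
first <- folds[[1]]
str(first)

# Assumed element names: training_set and validation_set
first$training_set
first$validation_set

# In V-fold cross-validation the validation sets should jointly cover every
# observation exactly once
all_validation <- sort(unlist(lapply(folds, function(f) f$validation_set)))
covers_all <- length(all_validation) == 12 && all(all_validation == seq_len(12))
covers_all
```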

See also

Other fold generation functions: fold_from_foldvec(), folds2foldvec(), make_folds(), make_repeated_folds()