Learner that encapsulates the Super Learner algorithm. It fits a metalearner on the cross-validated predictions from the candidate learners, then forms a pipeline from the learners and the fitted metalearner.
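The steps described above can be sketched in base R. This is only an illustration of the general idea, not sl3's implementation: the toy data, the three candidate functions, and the unconstrained least-squares metalearner are all assumptions made for the example.

```r
# Conceptual sketch of the super learner steps (illustrative, NOT sl3 code).
set.seed(1)
n <- 200
dat <- data.frame(x = runif(n, -2, 2))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.3)

# Hypothetical candidates: each fits on training data and returns a
# function that predicts on new data.
learners <- list(
  mean = function(d) {
    m <- mean(d$y)
    function(newd) rep(m, nrow(newd))
  },
  linear = function(d) {
    f <- lm(y ~ x, data = d)
    function(newd) unname(predict(f, newd))
  },
  cubic = function(d) {
    f <- lm(y ~ poly(x, 3), data = d)
    function(newd) unname(predict(f, newd))
  }
)

# Step 1: V-fold cross-validated predictions for every candidate.
V <- 5
fold_id <- sample(rep(seq_len(V), length.out = n))
cv_preds <- matrix(
  NA_real_, nrow = n, ncol = length(learners),
  dimnames = list(NULL, names(learners))
)
for (v in seq_len(V)) {
  valid <- fold_id == v
  for (k in names(learners)) {
    predict_fun <- learners[[k]](dat[!valid, ])
    cv_preds[valid, k] <- predict_fun(dat[valid, ])
  }
}

# Step 2: fit the metalearner on the cross-validated predictions.
# Here: plain least squares without an intercept (an assumption).
meta_coef <- lm.fit(x = cv_preds, y = dat$y)$coefficients

# Step 3: refit every candidate on the full data and chain the
# candidates into the metalearner (the "pipeline").
full_fits <- lapply(learners, function(fit_fun) fit_fun(dat))
sl_predict <- function(newd) {
  cand <- do.call(cbind, lapply(full_fits, function(p) p(newd)))
  drop(cand %*% meta_coef)
}

sl_preds <- sl_predict(dat)
```

The key design point is that the metalearner only ever sees out-of-fold predictions, so candidates that overfit their training folds are not rewarded when the weights are estimated.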
A learner object inheriting from Lrnr_base with methods for training and prediction. For a full list of learner functionality, see the complete documentation of Lrnr_base.
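As a minimal sketch of the training and prediction methods mentioned above, assuming sl3 is installed and using the cpp_imputed data that ships with it, a super learner can be trained on a task and then asked for predictions. The choice of Lrnr_glm and Lrnr_mean as candidates is arbitrary and only keeps the example fast.

```r
library(sl3)

# Build a task from the example data bundled with sl3.
data(cpp_imputed)
covs <- c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs")
task <- sl3_Task$new(cpp_imputed, covariates = covs, outcome = "haz")

# Two simple candidates; the metalearner is left at its default.
sl <- Lrnr_sl$new(learners = list(Lrnr_glm$new(), Lrnr_mean$new()))

# train() returns a fitted learner; predict() evaluates it on a task.
sl_fit <- sl$train(task)
preds <- sl_fit$predict(task)
head(preds)
```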
learners: The "library" of user-specified algorithms for the super learner to consider as candidates.
metalearner = "default": The metalearner to be fit on the cross-validated predictions from the candidates. If "default", the default_metalearner is used to construct a metalearner based on the outcome_type of the training task.
cv_control = NULL: Optional list of arguments that will be used to define a specific cross-validation fold structure for fitting the super learner. Intended for use in a nested cross-validation scheme, such as cross-validated super learner (cv_sl) or when Lrnr_sl is considered in the list of candidate learners in another Lrnr_sl. Includes the arguments listed below, and any others to be passed to fold_funs:
strata = NULL: Name of a discrete covariate or outcome used to define stratified cross-validation folds. If NULL and task$outcome_type$type is binary or categorical, then the default behavior is to use stratified cross-validation, where the strata are defined with respect to the outcome. To override the default behavior, i.e., to not use stratified cross-validation when strata = NULL and task$outcome_type$type is binary or categorical, set strata = "none".
cluster_by_id = TRUE: Logical specifying whether to use a clustered cross-validation scheme according to id in task. Specifically, if task$nodes$id is not NULL and cluster_by_id = TRUE (default), then task$nodes$id is used to define a clustered cross-validation scheme, so that dependent units are kept together in the same training set or the same validation set within each fold. To override the default behavior, i.e., to not use clustered cross-validation when task$nodes$id is not NULL, set cluster_by_id = FALSE.
fold_fun = NULL: A function indicating the origami cross-validation scheme to use, such as folds_vfold for V-fold cross-validation. See fold_funs for a list of possibilities. If NULL (default) and if other cv_control arguments are specified, e.g., V, strata or cluster_by_id, then the default behavior is to set fold_fun = origami::folds_vfold.
...: Other arguments to be passed to fold_fun, such as V for fold_fun = folds_vfold. See fold_funs for a list of the possible arguments specific to each fold function.
keep_extra = TRUE: Stores all sub-parts of the super learner computation. When FALSE, intermediary data structures are discarded, which significantly reduces the memory footprint of the resulting object.
verbose = NULL: Whether to print cv_control-related messages. Warnings and errors are always printed. When verbose = NULL, the verbosity specified by the option sl3.verbose is used, and the default value of the sl3.verbose option is FALSE. (Note: to turn on the sl3.verbose option, set options("sl3.verbose" = TRUE).)
...: Any additional parameters that can be considered by Lrnr_base.
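The arguments above can be combined as in the following sketch. It assumes an sl3 version that supports cv_control as documented here, and the particular values (5 folds, no clustering, discarded extras) are illustrative choices rather than recommendations.

```r
library(sl3)

data(cpp_imputed)
covs <- c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs")
task <- sl3_Task$new(cpp_imputed, covariates = covs, outcome = "haz")

sl_custom_cv <- Lrnr_sl$new(
  learners = list(Lrnr_glm$new(), Lrnr_mean$new()),
  # "default" lets sl3 pick a metalearner from the task's outcome type.
  metalearner = "default",
  cv_control = list(
    fold_fun = origami::folds_vfold, # V-fold cross-validation
    V = 5,                           # passed on to fold_fun
    cluster_by_id = FALSE            # ignore task$nodes$id, if any
  ),
  keep_extra = FALSE, # drop intermediary structures to save memory
  verbose = TRUE      # print cv_control-related messages
)

# Alternative to verbose = TRUE: leave verbose = NULL and set the
# package-wide option instead.
# options("sl3.verbose" = TRUE)

sl_custom_cv_fit <- sl_custom_cv$train(task)
```

Because the outcome here is continuous, the default stratification on the outcome does not apply; with a binary or categorical outcome, strata = "none" could be added to cv_control to switch it off.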
Other Learners: Custom_chain, Lrnr_HarmonicReg, Lrnr_arima, Lrnr_bartMachine, Lrnr_base, Lrnr_bayesglm, Lrnr_bilstm, Lrnr_caret, Lrnr_cv_selector, Lrnr_cv, Lrnr_dbarts, Lrnr_define_interactions, Lrnr_density_discretize, Lrnr_density_hse, Lrnr_density_semiparametric, Lrnr_earth, Lrnr_expSmooth, Lrnr_gam, Lrnr_ga, Lrnr_gbm, Lrnr_glm_fast, Lrnr_glm_semiparametric, Lrnr_glmnet, Lrnr_glmtree, Lrnr_glm, Lrnr_grfcate, Lrnr_grf, Lrnr_gru_keras, Lrnr_gts, Lrnr_h2o_grid, Lrnr_hal9001, Lrnr_haldensify, Lrnr_hts, Lrnr_independent_binomial, Lrnr_lightgbm, Lrnr_lstm_keras, Lrnr_mean, Lrnr_multiple_ts, Lrnr_multivariate, Lrnr_nnet, Lrnr_nnls, Lrnr_optim, Lrnr_pca, Lrnr_pkg_SuperLearner, Lrnr_polspline, Lrnr_pooled_hazards, Lrnr_randomForest, Lrnr_ranger, Lrnr_revere_task, Lrnr_rpart, Lrnr_rugarch, Lrnr_screener_augment, Lrnr_screener_coefs, Lrnr_screener_correlation, Lrnr_screener_importance, Lrnr_solnp_density, Lrnr_solnp, Lrnr_stratified, Lrnr_subset_covariates, Lrnr_svm, Lrnr_tsDyn, Lrnr_ts_weights, Lrnr_xgboost, Pipeline, Stack, define_h2o_X(), undocumented_learner
if (FALSE) {
  library(sl3)

  data(cpp_imputed)
  covs <- c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs")
  task <- sl3_Task$new(cpp_imputed, covariates = covs, outcome = "haz")

  # this is just for illustrative purposes, not intended for real applications
  # of the super learner!
  glm_lrn <- Lrnr_glm$new()
  ranger_lrn <- Lrnr_ranger$new()
  lasso_lrn <- Lrnr_glmnet$new()
  eSL <- Lrnr_sl$new(learners = list(glm_lrn, ranger_lrn, lasso_lrn))
  eSL_fit <- eSL$train(task)

  # example with cv_control, where Lrnr_sl is included as a candidate
  eSL_nested5folds <- Lrnr_sl$new(
    learners = list(glm_lrn, ranger_lrn, lasso_lrn),
    cv_control = list(V = 5),
    verbose = FALSE
  )
  dSL <- Lrnr_sl$new(
    learners = list(glm_lrn, ranger_lrn, lasso_lrn, eSL_nested5folds),
    metalearner = Lrnr_cv_selector$new(loss_squared_error)
  )
  dSL_fit <- dSL$train(task)

  # example with cv_control, where we use cross-validated super learner
  cvSL_fit <- cv_sl(
    lrnr_sl = eSL_nested5folds, task = task, eval_fun = loss_squared_error
  )
}