This learner uses the caret package's train function to automatically tune a predictive model. It does this by defining a grid of model-specific tuning parameters; fitting the model according to each tuning parameter specification, to establish a set of model fits; calculating a resampling-based performance measure for each variation; and then selecting the model with the best performance.
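The four steps above can be sketched in base R. This is only an illustration of the general grid-search idea, not caret's implementation: the simulated data, the choice of polynomial degree as the tuning parameter, and all object names are assumptions made for the example.

```r
# Illustrative grid search with 10-fold cross-validation (base R only).
# Hypothetical setup: tune the polynomial degree of a linear model.
set.seed(1)
n <- 200
x <- runif(n, -2, 2)
y <- x^2 + rnorm(n, sd = 0.5)
dat <- data.frame(x = x, y = y)

# 1. Define a grid of candidate tuning parameter values.
grid <- data.frame(degree = 1:4)

# 2. Randomly assign each observation to one of k folds.
k <- 10
fold_id <- sample(rep(seq_len(k), length.out = n))

# 3. For each candidate, fit on the training folds, compute the RMSE on the
#    held-out fold, and average the fold-specific RMSEs.
cv_rmse <- sapply(grid$degree, function(d) {
  fold_rmse <- sapply(seq_len(k), function(f) {
    train <- dat[fold_id != f, ]
    valid <- dat[fold_id == f, ]
    fit <- lm(y ~ poly(x, degree = d), data = train)
    sqrt(mean((valid$y - predict(fit, newdata = valid))^2))
  })
  mean(fold_rmse)
})

# 4. Select the candidate with the lowest resampled RMSE and refit on all data.
best_degree <- grid$degree[which.min(cv_rmse)]
final_fit <- lm(y ~ poly(x, degree = best_degree), data = dat)
```

Because the simulated outcome is quadratic in x, the linear candidate (degree 1) has a clearly larger cross-validated RMSE than the others, so the selected degree is at least 2.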

Format

An R6Class object inheriting from Lrnr_base.

Value

A learner object inheriting from Lrnr_base with methods for training and prediction. For a full list of learner functionality, see the complete documentation of Lrnr_base.

Parameters

  • method: A string specifying which caret classification or regression model to use. Possible models can be found using names(caret::getModelInfo()). Information about a model, including the parameters that are tuned, can be found using caret::modelLookup(), e.g., caret::modelLookup("xgbLinear"). Consult the caret package's documentation on train for more details.

  • metric = NULL: An optional string specifying the summary metric to be used to select the optimal model. If not specified, it will be set to "RMSE" for continuous outcomes and "Accuracy" for categorical and binary outcomes. Other options include "MAE", "Kappa", "Rsquared" and "logLoss". Regression models are defined when metric is set as "RMSE", "logLoss", "Rsquared", or "MAE". Classification models are defined when metric is set as "Accuracy" or "Kappa". Custom performance metrics can also be used. Consult the caret package's train documentation for more details.

  • trControl = list(method = "cv", number = 10): A list specifying the arguments for the trainControl object. If not specified, 10-fold cross-validation ("cv") is used as the resampling method, instead of caret's default resampling method, "boot". For a detailed description, consult the caret package's documentation for train and trainControl.

  • factor_binary_outcome = TRUE: Logical indicating whether a binary outcome should be defined as a factor instead of a numeric. This only needs to be modified to FALSE in the following uncommon instance: when metric is specified by the user, metric defines a regression model, and the task's outcome is binary. Note that train could throw warnings/errors when regression models are considered for binary outcomes; this argument should only be modified by advanced users in niche settings.

  • ...: Other parameters passed to train, and additional arguments defined in Lrnr_base, such as formula.
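As a rough sketch of how these parameters fit together, the snippet below looks up the available caret models and the tuning parameters of one of them, then builds a learner with a non-default metric and resampling scheme. The specific values (5 folds, "MAE", tuneLength = 3) are arbitrary choices for illustration, and passing tuneLength relies on the assumption that it is forwarded to train through the ... argument.

```r
library(sl3)

# List the caret models that can be supplied as `method`.
available_methods <- names(caret::getModelInfo())
head(available_methods)

# Show which parameters caret tunes for the random forest model.
rf_params <- caret::modelLookup("rf")
rf_params

# Learner that selects the model by mean absolute error under 5-fold
# cross-validation (illustrative values, not recommendations).
# `tuneLength` is an argument of caret::train; it is assumed here to be
# passed along via `...`.
custom_RF_lrnr <- Lrnr_caret$new(
  method = "rf",
  metric = "MAE",
  trControl = list(method = "cv", number = 5),
  tuneLength = 3
)
```

The resulting learner is trained and used for prediction in the same way as any other learner inheriting from Lrnr_base.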

Examples

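library(sl3)

# Load the example data and define a task with a continuous outcome
data(cpp_imputed)
covs <- c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs")
task <- sl3_Task$new(cpp_imputed, covariates = covs, outcome = "haz")

# Random forest learner whose tuning parameters are selected by caret's train
autotuned_RF_lrnr <- Lrnr_caret$new(method = "rf")

# Set a seed so the resampling is reproducible, then train and predict
set.seed(693)
autotuned_RF_fit <- autotuned_RF_lrnr$train(task)
autotuned_RF_predictions <- autotuned_RF_fit$predict()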