This learner fits random forests using the routines from ranger, a fast implementation of Random Forests described in Wright and Ziegler (2017), through a call to the function ranger. Variable importance is also available by invoking the importance method.
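As a rough illustration of what the learner delegates to, the sketch below calls ranger directly on the mtcars data. It assumes the ranger package is installed; the formula interface and the argument values are chosen for illustration and are not necessarily how the learner builds its internal call.

```r
library(ranger)

data(mtcars)

# Fit a regression forest directly with ranger; importance = "impurity"
# asks ranger to record a variable importance measure during fitting.
rf_fit <- ranger(
  mpg ~ .,
  data = mtcars,
  num.trees = 500,
  importance = "impurity",
  write.forest = TRUE, # keep the forest so predict() works
  num.threads = 1
)

# Named numeric vector with one importance value per covariate
rf_importance <- ranger::importance(rf_fit)

# Predictions on the training data
rf_preds <- predict(rf_fit, data = mtcars)$predictions
```

The learner wraps these steps behind the usual train and predict methods, so the direct call is only needed when working outside the learner framework.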

Format

An R6Class object inheriting from Lrnr_base.

Value

A learner object inheriting from Lrnr_base with methods for training and prediction. For a full list of learner functionality, see the complete documentation of Lrnr_base.

Parameters

  • num.trees = 500: Number of trees to be used in growing the forest.

  • write.forest = TRUE: If TRUE, forest is stored, which is required for prediction. Set to FALSE to reduce memory usage if downstream prediction is not intended.

  • importance = "none": Variable importance mode, one of "none", "impurity", "impurity_corrected", "permutation". The "impurity" measure is the Gini index for classification, the variance of the responses for regression, and the sum of test statistics for survival analysis (see the splitrule argument of ranger).

  • num.threads = 1: Number of threads used by ranger.

  • ...: Other parameters passed to ranger. See its documentation for details.
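A minimal sketch of overriding these defaults at construction time follows. It assumes the sl3 and ranger packages are installed. The values are arbitrary, and mtry and min.node.size are ranger arguments used here only to illustrate passing extra parameters through `...`.

```r
library(sl3)

# Construct a learner with non-default settings. The named arguments
# num.trees, importance, and num.threads are listed above; mtry and
# min.node.size are illustrative extras forwarded to ranger via `...`.
ranger_lrnr_custom <- Lrnr_ranger$new(
  num.trees = 200,
  importance = "permutation",
  num.threads = 2,
  mtry = 3,
  min.node.size = 5
)
```

Leave write.forest at TRUE if the fitted learner will later be used for prediction.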

References

Wright MN, Ziegler A (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1-17. doi:10.18637/jss.v077.i01.

Examples

library(sl3)
data(mtcars)
# create task for prediction
mtcars_task <- sl3_Task$new(
  data = mtcars,
  covariates = c(
    "cyl", "disp", "hp", "drat", "wt", "qsec", "vs", "am",
    "gear", "carb"
  ),
  outcome = "mpg"
)
# initialization, training, and prediction with the defaults
ranger_lrnr <- Lrnr_ranger$new()
ranger_fit <- ranger_lrnr$train(mtcars_task)
ranger_preds <- ranger_fit$predict()

# variable importance
ranger_lrnr_importance <- Lrnr_ranger$new(importance = "impurity_corrected")
ranger_fit_importance <- ranger_lrnr_importance$train(mtcars_task)
ranger_importance <- ranger_fit_importance$importance()

# screening based on variable importance, example in glm pipeline
ranger_importance_screener <- Lrnr_screener_importance$new(
  learner = ranger_lrnr_importance, num_screen = 3
)
glm_lrnr <- make_learner(Lrnr_glm)
ranger_screen_glm_pipe <- Pipeline$new(ranger_importance_screener, glm_lrnr)
ranger_screen_glm_pipe_fit <- ranger_screen_glm_pipe$train(mtcars_task)