This learner provides fitting procedures for generalized boosted regression trees, using routines from the gbm package through a call to the function gbm.fit. While a variety of gradient boosting strategies have since gained popularity in machine learning, early methodological descriptions were given by Friedman (2001) and Friedman (2002).

Format

An R6Class object inheriting from Lrnr_base.

Value

A learner object inheriting from Lrnr_base with methods for training and prediction. For a full list of learner functionality, see the complete documentation of Lrnr_base.

Parameters

  • n.trees: An integer specifying the total number of trees to fit. This is equivalent to the number of iterations and the number of basis functions in the additive expansion. The default is 10000.

  • interaction.depth: An integer specifying the maximum depth of each tree (i.e., the highest level of allowed variable interactions). A value of 1 implies an additive model, while a value of 2 implies a model with up to 2-way interactions, etc. The default is 2.

  • shrinkage: A shrinkage parameter applied to each tree in the expansion, also known as the learning rate or step-size reduction. Values between 0.001 and 0.1 usually work well, though a smaller learning rate typically requires more trees. The default is 0.001.

  • ...: Other parameters passed directly to gbm.fit. See its documentation for details. A sketch of overriding these defaults appears after this list.
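
As a minimal sketch of non-default usage (assuming the sl3 and gbm packages are installed), the defaults above may be overridden when constructing the learner, and any extra arguments are forwarded to gbm.fit. The bag.fraction argument shown here is a gbm.fit parameter included purely for illustration.

library(sl3)
# construct a learner with custom boosting parameters; arguments that
# Lrnr_gbm does not recognize are passed along to gbm.fit
gbm_custom_lrnr <- Lrnr_gbm$new(
  n.trees = 5000,        # fewer boosting iterations than the default 10000
  interaction.depth = 3, # allow up to 3-way interactions
  shrinkage = 0.01,      # larger learning rate, so fewer trees are needed
  bag.fraction = 0.5     # example gbm.fit argument forwarded via `...`
)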

References

Friedman JH (2001). “Greedy function approximation: a gradient boosting machine.” The Annals of Statistics, 29(5), 1189--1232.

Friedman JH (2002). “Stochastic gradient boosting.” Computational Statistics & Data Analysis, 38(4), 367--378.

Examples

library(sl3)

data(cpp_imputed)
# create task for prediction
cpp_task <- sl3_Task$new(
  data = cpp_imputed,
  covariates = c("apgar1", "apgar5", "parity", "gagebrth", "mage", "sexn"),
  outcome = "haz"
)

# initialization, training, and prediction with the defaults
gbm_lrnr <- Lrnr_gbm$new()
gbm_fit <- gbm_lrnr$train(cpp_task)
gbm_preds <- gbm_fit$predict()
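
# As a further sketch, Lrnr_gbm composes with other sl3 learners in a super
# learner ensemble via Stack and Lrnr_sl; the particular library of learners
# below is illustrative, not prescribed.
lrnr_stack <- Stack$new(Lrnr_mean$new(), Lrnr_glm$new(), Lrnr_gbm$new())
sl_lrnr <- Lrnr_sl$new(learners = lrnr_stack)
sl_fit <- sl_lrnr$train(cpp_task)
sl_preds <- sl_fit$predict()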