This learner implements Generalized Random Forests using the grf package, a pluggable package for forest-based statistical estimation and inference. GRF currently provides non-parametric methods for least-squares regression, quantile regression, and treatment effect estimation (optionally using instrumental variables). This implementation trains a regression forest that can be used to estimate quantiles of the conditional distribution of (Y|X=x).
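As a sketch of what this learner wraps, the underlying grf call looks roughly like the following; the simulated data and variable names are purely illustrative:

```
library(grf)

# simulated data (illustrative only)
n <- 500
p <- 5
X <- matrix(rnorm(n * p), n, p)
Y <- X[, 1] * rnorm(n)

# grow a quantile forest calibrated to three quantiles
qf <- quantile_forest(X, Y, quantiles = c(0.1, 0.5, 0.9), num.trees = 2000)

# estimate conditional quantiles at new points
X_new <- matrix(rnorm(10 * p), 10, p)
q_hat <- predict(qf, X_new, quantiles = c(0.1, 0.5, 0.9))
```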

An `R6Class` object: a learner with methods for training and prediction. See `Lrnr_base` for documentation on learners.

`num.trees = 2000`

Number of trees grown in the forest. NOTE: Getting accurate confidence intervals generally requires more trees than getting accurate predictions.

`quantiles = c(0.1, 0.5, 0.9)`

Vector of quantiles used to calibrate the forest.

`regression.splitting = FALSE`

Whether to use regression splits when growing trees, instead of specialized splits based on the quantiles (the default). Setting this flag to `TRUE` corresponds to the approach to quantile forests of Meinshausen (2006).

`clusters = NULL`

Vector of integers or factors specifying which cluster each observation corresponds to.

`equalize.cluster.weights = FALSE`

If `FALSE` (the default), each unit is given the same weight, so that larger clusters receive more weight. If `TRUE`, each cluster is given equal weight in the forest. In that case, during training, each tree uses the same number of observations from each drawn cluster: if the smallest cluster has K units, then when a cluster is sampled during training, only a random K of its elements are given to the tree-growing procedure. When estimating average treatment effects, each observation is weighted by 1/(cluster size), so that the total weight of each cluster is the same.

`sample.fraction = 0.5`

Fraction of the data used to build each tree. NOTE: if `honesty = TRUE`, these subsamples will further be cut by a factor of `honesty.fraction`.

`mtry = NULL`

Number of variables tried for each split. By default, this is set based on the dimensionality of the predictors.

`min.node.size = 5`

A target for the minimum number of observations in each tree leaf. Note that nodes with size smaller than `min.node.size` can occur, as in the randomForest package.

`honesty = TRUE`

Whether or not honest splitting (i.e., sub-sample splitting) should be used.

`alpha = 0.05`

A tuning parameter that controls the maximum imbalance of a split.

`imbalance.penalty = 0`

A tuning parameter that controls how harshly imbalanced splits are penalized.

`num.threads = 1`

Number of threads used in training. If set to `NULL`, the software automatically selects an appropriate number.

`quantiles_pred`

Vector of quantiles used for prediction. This can differ from the vector of quantiles used for training.
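For example, a forest can be calibrated on several quantiles at training time while only the median is requested at prediction time (a sketch; the learner is trained on an sl3 task such as the one constructed in the example below):

```
lrnr_median <- Lrnr_grf$new(
  quantiles = c(0.1, 0.5, 0.9), # quantiles used to calibrate the forest
  quantiles_pred = 0.5          # quantile returned by predict()
)
```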

Individual learners have their own sets of parameters. Below is a list of shared parameters, implemented by `Lrnr_base` and shared by all learners.

`covariates`

A character vector of covariates. The learner will use this to subset the covariates for any specified task.

`outcome_type`

A `variable_type` object used to control the outcome type used by the learner. Overrides the task outcome type if specified.

`...`

All other parameters are handled by the individual learner classes. See the documentation for the learner class you are instantiating.

Other Learners: `Custom_chain`, `Lrnr_HarmonicReg`, `Lrnr_arima`, `Lrnr_bartMachine`, `Lrnr_base`, `Lrnr_bayesglm`, `Lrnr_bilstm`, `Lrnr_caret`, `Lrnr_cv_selector`, `Lrnr_cv`, `Lrnr_dbarts`, `Lrnr_define_interactions`, `Lrnr_density_discretize`, `Lrnr_density_hse`, `Lrnr_density_semiparametric`, `Lrnr_earth`, `Lrnr_expSmooth`, `Lrnr_gam`, `Lrnr_ga`, `Lrnr_gbm`, `Lrnr_glm_fast`, `Lrnr_glm_semiparametric`, `Lrnr_glmnet`, `Lrnr_glmtree`, `Lrnr_glm`, `Lrnr_grfcate`, `Lrnr_gru_keras`, `Lrnr_gts`, `Lrnr_h2o_grid`, `Lrnr_hal9001`, `Lrnr_haldensify`, `Lrnr_hts`, `Lrnr_independent_binomial`, `Lrnr_lightgbm`, `Lrnr_lstm_keras`, `Lrnr_mean`, `Lrnr_multiple_ts`, `Lrnr_multivariate`, `Lrnr_nnet`, `Lrnr_nnls`, `Lrnr_optim`, `Lrnr_pca`, `Lrnr_pkg_SuperLearner`, `Lrnr_polspline`, `Lrnr_pooled_hazards`, `Lrnr_randomForest`, `Lrnr_ranger`, `Lrnr_revere_task`, `Lrnr_rpart`, `Lrnr_rugarch`, `Lrnr_screener_augment`, `Lrnr_screener_coefs`, `Lrnr_screener_correlation`, `Lrnr_screener_importance`, `Lrnr_sl`, `Lrnr_solnp_density`, `Lrnr_solnp`, `Lrnr_stratified`, `Lrnr_subset_covariates`, `Lrnr_svm`, `Lrnr_tsDyn`, `Lrnr_ts_weights`, `Lrnr_xgboost`, `Pipeline`, `Stack`, `define_h2o_X()`, `undocumented_learner`
```
# load the sl3 package and example data
library(sl3)
data(cpp_imputed)

# create an sl3 task
task <- sl3_Task$new(
  cpp_imputed,
  covariates = c("apgar1", "apgar5", "parity", "gagebrth", "mage", "meducyrs"),
  outcome = "haz"
)

# train a grf learner and make predictions
lrnr_grf <- Lrnr_grf$new(seed = 123)
lrnr_grf_fit <- lrnr_grf$train(task)
lrnr_grf_pred <- lrnr_grf_fit$predict()
```
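If observations are grouped, the clustering parameters described above can be passed through in the same way. A sketch, continuing from the task above; the cluster labels here are purely illustrative, since `cpp_imputed` has no canonical grouping variable:

```
# illustrative cluster labels; in practice these would identify real groups
cluster_id <- rep(1:50, length.out = nrow(cpp_imputed))

lrnr_grf_clustered <- Lrnr_grf$new(
  clusters = cluster_id,
  equalize.cluster.weights = TRUE,
  seed = 123
)
lrnr_grf_clustered_fit <- lrnr_grf_clustered$train(task)
```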