1 Welcome to the tlverse
1.1 Learning Objectives
- Understand the tlverse ecosystem conceptually
- Identify the core components of the tlverse
- Install tlverse R packages
- Understand the Targeted Learning roadmap
- Learn about the WASH Benefits example data
1.2 What is the tlverse?
The tlverse is a new framework for doing Targeted Learning in R, inspired by the tidyverse ecosystem of R packages.
By analogy to the tidyverse:
"The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures."
So, the tlverse is
- an opinionated collection of R packages for Targeted Learning
- sharing an underlying philosophy, grammar, and set of data structures
1.3 tlverse components
These are the main packages that represent the core of the tlverse:
- sl3: Modern Super Learning with Pipelines
  - What? A modern object-oriented re-implementation of the Super Learner algorithm, employing recently developed paradigms for R programming.
  - Why? A design that leverages modern tools for fast computation, is forward-looking, and can form one of the cornerstones of the tlverse.
- tmle3: An Engine for Targeted Learning
  - What? A generalized framework that simplifies Targeted Learning by identifying and implementing a series of common statistical estimation procedures.
  - Why? A common interface and engine that accommodates current algorithmic approaches to Targeted Learning and is still flexible enough to remain the engine even as new techniques are developed.
In addition to the engines that drive development in the tlverse, there are two supporting packages:
- origami: A Generalized Framework for Cross-Validation
  - What? A generalized framework for flexible cross-validation.
  - Why? Cross-validation is a key part of ensuring error estimates are honest and preventing overfitting. It is an essential part of both the Super Learner algorithm and Targeted Learning.
- delayed: Parallelization Framework for Dependent Tasks
  - What? A framework for delayed computations (futures) based on task dependencies.
  - Why? Efficient allocation of compute resources is essential when deploying large-scale, computationally intensive algorithms.
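To make the point about honest error estimates concrete, here is a minimal hand-rolled sketch of V-fold cross-validation in base R. It deliberately does not use the origami API. The helper name cv_mse, the choice of 5 folds, and the mtcars regression are all assumptions made for illustration only.

```r
# Hand-rolled V-fold cross-validation in base R (illustrative only; this is
# not the origami API). cv_mse() is a hypothetical helper name.
cv_mse <- function(data, formula, V = 5, seed = 1) {
  set.seed(seed)
  n <- nrow(data)
  # Randomly assign each row to one of V roughly equal-sized folds
  fold_id <- sample(rep(seq_len(V), length.out = n))
  outcome <- all.vars(formula)[1]
  sq_err <- numeric(n)
  for (v in seq_len(V)) {
    valid <- fold_id == v
    # Fit on the training folds only, then predict the held-out fold
    fit <- lm(formula, data = data[!valid, , drop = FALSE])
    preds <- predict(fit, newdata = data[valid, , drop = FALSE])
    sq_err[valid] <- (data[[outcome]][valid] - preds)^2
  }
  # Average squared error over all held-out predictions
  mean(sq_err)
}

form <- mpg ~ wt + hp
# In-sample error: the model is scored on the same rows it was fit to
in_sample_mse <- mean(residuals(lm(form, data = mtcars))^2)
cv_estimate <- cv_mse(mtcars, form, V = 5)
print(c(in_sample = in_sample_mse, cross_validated = cv_estimate))
```

The cross-validated estimate is typically larger than the in-sample one, because every prediction is made for a row the model did not see during fitting. A package such as origami exists to generalize this bookkeeping to other fold schemes.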
A key principle of the tlverse is extensibility. That is, we want to support new Targeted Learning estimators as they are developed. The model for this is that new estimators are implemented in additional packages built on the core packages above. There are currently two featured examples of this:
- tmle3mopttx: Optimal Treatments in tlverse
  - What? Learn an optimal rule and estimate the mean outcome under the rule.
  - Why? Optimal Treatment is a powerful tool in precision healthcare and other settings where a one-size-fits-all treatment approach is not appropriate.
- tmle3shift: Shift Interventions in tlverse
  - What? Shift interventions for continuous treatments.
  - Why? Not all treatment variables are discrete. Being able to estimate the effects of continuous treatment represents a powerful extension of the Targeted Learning approach.
1.4 Installation
The tlverse ecosystem of packages is currently hosted at https://github.com/tlverse, not yet on CRAN. You can use the devtools package to install them:
install.packages("devtools")
devtools::install_github("tlverse/tlverse")
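Once the command above returns, it can help to confirm which of the packages named in this chapter are actually available. The following is a minimal sketch using only base R. The helper name pkgs_available is hypothetical, and the package list is simply the one from the components section above.

```r
# Core, supporting, and extension packages named in this chapter
tlverse_pkgs <- c("sl3", "tmle3", "origami", "delayed",
                  "tmle3mopttx", "tmle3shift")

# Hypothetical helper: TRUE for each package whose namespace can be loaded
pkgs_available <- function(pkgs) {
  vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
}

status <- pkgs_available(tlverse_pkgs)
print(status)
if (!all(status)) {
  message("Not yet installed: ",
          paste(names(status)[!status], collapse = ", "))
}
```

If some packages are reported as missing, the installation probably stopped partway through.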
The tlverse depends on a large number of other packages that are also hosted on GitHub. Because of this, you may see the following error:
Error: HTTP error 403.
API rate limit exceeded for 71.204.135.82. (But here's the good news:
Authenticated requests get a higher rate limit. Check out the documentation
for more details.)
Rate limit remaining: 0/60
Rate limit reset at: 2019-03-04 19:39:05 UTC
To increase your GitHub API rate limit
- Use `usethis::browse_github_pat()` to create a Personal Access Token.
- Use `usethis::edit_r_environ()` and add the token as `GITHUB_PAT`.
This just means that R tried to install too many packages from GitHub in too short a window. To fix this, you need to tell R how to use GitHub as your user (you’ll need a GitHub user account). Follow these steps:
1. Type `usethis::browse_github_pat()` in your R console, which will direct you to GitHub’s page to create a New Personal Access Token. (In recent versions of usethis, this function has been replaced by `usethis::create_github_token()`.)
2. Create a Personal Access Token simply by clicking “Generate token” at the bottom of the page.
3. Copy your Personal Access Token, a long string of lowercase letters and numbers.
4. Type `usethis::edit_r_environ()` in your R console, which will open your `.Renviron` file in the source window of RStudio. If you are not able to access your `.Renviron` file with this command, then try entering `Sys.setenv(GITHUB_PAT = )` with your Personal Access Token inserted as a quoted string after the equals symbol. If this does not error, retry the installation in the same R session and skip the remaining steps, since a value set with `Sys.setenv()` lasts only until R is restarted.
5. In your `.Renviron` file, type `GITHUB_PAT=` and then paste your Personal Access Token after the equals symbol with no space.
6. In your `.Renviron` file, press the enter key to ensure that your `.Renviron` ends with a newline.
7. Save your `.Renviron` file.
8. Restart R for changes to take effect. You can restart R via the drop-down menu on the “Session” tab. The “Session” tab is at the top of the RStudio interface.
After following these steps, you should be able to successfully install the package which threw the error above.
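Before retrying, you may want to confirm that the token is visible to your R session. The sketch below uses only base R. The helper name github_pat_is_set is hypothetical, and the check never prints the token itself.

```r
# Hypothetical helper: is a GitHub token visible to this R session?
# Sys.getenv() returns "" for an unset variable, so nzchar() is FALSE then.
github_pat_is_set <- function() {
  nzchar(Sys.getenv("GITHUB_PAT"))
}

if (github_pat_is_set()) {
  message("GITHUB_PAT found. Retry with: ",
          "devtools::install_github(\"tlverse/tlverse\")")
} else {
  message("GITHUB_PAT is not set. Revisit the .Renviron steps above.")
}
```

If the variable is reported as not set after you have edited `.Renviron`, the most likely causes are a missing trailing newline in the file or an R session that was not restarted.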