# References

Avin, C., Shpitser, I. and Pearl, J. (2005) Identifiability of path-specific effects. In:

*IJCAI international joint conference on artificial intelligence*, 2005, pp. 357–363.
Baker, M. (2016) Is there a reproducibility crisis? A Nature survey lifts the lid on how researchers view the crisis rocking science and what they think will help.

*Nature*,**533**, 452–455. Nature Publishing Group.
Bembom, O. and van der Laan, M. J. (2007) A practical illustration of the importance of realistic individualized treatment rules in causal inference.

*Electronic Journal of Statistics*,**1**, 574–596.
Bengtsson, H. (2021) A unifying framework for parallel and distributed processing in R using futures.

*The R Journal*. DOI: 10.32614/RJ-2021-048.
Benkeser, D. and Ran, J. (2021) Nonparametric inference for interventional effects with multiple mediators.

*Journal of Causal Inference*. De Gruyter. DOI: 10.1515/jci-2020-0018.
Benkeser, D. and van der Laan, M. J. (2016) The highly adaptive lasso estimator. In:

*2016 IEEE international conference on data science and advanced analytics (DSAA)*, 2016. IEEE. DOI: 10.1109/dsaa.2016.93.
Breiman, L. (1996) Stacked regressions.

*Machine Learning*,**24**, 49–64. Springer.
Breiman, L. (2001) Random forests.

*Machine Learning*,**45**, 5–32. Springer.
Buckheit, J. B. and Donoho, D. L. (1995) WaveLab and reproducible research. In

*Wavelets and Statistics*, pp. 55–81. Springer.
Chakraborty, B. and Moodie, E. E. (2013)

*Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine (Statistics for Biology and Health)*. Springer.
Coyle, J. R. and Hejazi, N. S. (2018) Origami: A generalized framework for cross-validation in R.

*Journal of Open Source Software*,**3**. The Open Journal. DOI: 10.21105/joss.00512.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (2021) Targeted Learning: Robust statistics for reproducible research.

*arXiv*. Available at: https://arxiv.org/abs/2006.07333.
Coyle, J. R., Hejazi, N. S., Phillips, R. V., et al. (2022)

*hal9001: The Scalable Highly Adaptive Lasso*. DOI: 10.5281/zenodo.3558313.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (n.d.)

*`origami`: Generalized Framework for Cross-Validation*. DOI: 10.5281/zenodo.835602.
Davison, A. C. and Hinkley, D. V. (1997)

*Bootstrap Methods and Their Application*. Cambridge University Press.
Dawid, A. P. (2000) Causal inference without counterfactuals.

*Journal of the American Statistical Association*,**95**, 407–424. Taylor & Francis.
Didelez, V., Dawid, P. and Geneletti, S. (2006) Direct and indirect effects of sequential treatments. In:

*Proceedings of the 22nd annual conference on uncertainty in artificial intelligence*, 2006, pp. 138–146.
Díaz, I. and Hejazi, N. S. (2020) Causal mediation analysis for stochastic interventions.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*,**82**, 661–683. Wiley Online Library. DOI: 10.1111/rssb.12362.
Díaz, I. and van der Laan, M. J. (2011) Super learner based conditional density estimation with application to marginal structural models.

*The International Journal of Biostatistics*,**7**, 1–20. De Gruyter.
Díaz, I. and van der Laan, M. J. (2012) Population intervention causal effects based on stochastic interventions.

*Biometrics*,**68**, 541–549. Wiley Online Library.
Díaz, I. and van der Laan, M. J. (2013) Sensitivity analysis for causal inference under unmeasured confounding and measurement error problems.

*The International Journal of Biostatistics*,**9**, 149–160. De Gruyter. DOI: 10.1515/ijb-2013-0004.
Díaz, I. and van der Laan, M. J. (2018) Stochastic treatment regimes. In

*Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies*, pp. 167–180. Springer Science & Business Media.
Díaz, I., Hejazi, N. S., Rudolph, K. E., et al. (2020) Non-parametric efficient causal mediation with intermediate confounders.

*Biometrika*. Oxford University Press. DOI: 10.1093/biomet/asaa085.
Donoho, D. (2017) 50 years of data science.

*Journal of Computational and Graphical Statistics*,**26**, 745–766. Taylor & Francis.
Dudoit, S. and van der Laan, M. J. (2005) Asymptotics of cross-validated risk estimation in estimator selection and performance assessment.

*Statistical Methodology*,**2**, 131–154. Elsevier.
Fisher, R. A. (1946)

*Statistical Methods for Research Workers*. 10th ed. Oliver and Boyd.
Gruber, S., Phillips, R. V., Lee, H., et al. (2022) Evaluating and improving real-world evidence with targeted learning.

*arXiv preprint arXiv:2208.07283*.
Gruber, S., Phillips, R. V., Lee, H., et al. (2023) Targeted Learning: Toward a future informed by real-world evidence.

*Statistics in Biopharmaceutical Research*. Taylor & Francis. DOI: 10.1080/19466315.2023.2182356.
Haneuse, S. and Rotnitzky, A. (2013) Estimation of the effect of interventions that modify the received treatment.

*Statistics in Medicine*,**32**, 5260–5277. Wiley Online Library.
Hejazi, N. S. (2021)

*Semiparametric statistical methods for causal inference with stochastic treatment regimes*. PhD thesis. University of California, Berkeley. Available at: https://www.stat.berkeley.edu/~nhejazi/publications/thesis-phd-biostat.pdf.
Hejazi, N. S., van der Laan, M. J., Janes, H. E., et al. (2020) Efficient nonparametric inference on the effects of stochastic interventions under two-phase sampling, with applications to vaccine efficacy trials.

*Biometrics*. Wiley Online Library. DOI: 10.1111/biom.13375.
Hejazi, N. S., Coyle, J. R. and van der Laan, M. J. (2020) hal9001: Scalable highly adaptive lasso regression in R.

*Journal of Open Source Software*. The Open Journal. DOI: 10.21105/joss.02526.
Hejazi, N. S., Benkeser, D. C. and van der Laan, M. J. (2022)

*haldensify: Highly Adaptive Lasso Conditional Density Estimation*. https://github.com/nhejazi/haldensify. DOI: 10.5281/zenodo.3698329.
Hejazi, N. S., Rudolph, K. E., van der Laan, M. J., et al. (2022) Nonparametric causal mediation analysis for stochastic interventional (in)direct effects.

*Biostatistics*,**(in press)**. Oxford University Press. DOI: 10.1093/biostatistics/kxac002.
Hernán, M. A. and Robins, J. M. (2022)

*Causal Inference: What If*. CRC Press.
Holland, P. W. (1986) Statistics and causal inference.

*Journal of the American Statistical Association*,**81**, 945–960. Taylor & Francis.
Imai, K., Keele, L. and Yamamoto, T. (2010) Identification, inference and sensitivity analysis for causal mediation effects.

*Statistical Science*, 51–71. JSTOR.
Imbens, G. W. and Rubin, D. B. (2015)

*Causal Inference in Statistics, Social, and Biomedical Sciences*. Cambridge University Press.
Kennedy, E. H. (2016) Semiparametric theory and empirical processes in causal inference. In

*Statistical Causal Inferences and Their Applications in Public Health Research*, pp. 141–167. Springer.
Kennedy, E. H. (2019) Nonparametric causal effects based on incremental propensity score interventions.

*Journal of the American Statistical Association*,**114**, 645–656. Taylor & Francis.
Lok, J. J. (2016) Defining and estimating causal direct and indirect effects when setting the mediator to specific values is not feasible.

*Statistics in Medicine*,**35**, 4008–4020. Wiley Online Library.
Luedtke, A. and van der Laan, M. J. (2016) Super-learning of an optimal dynamic treatment rule.

*International Journal of Biostatistics*,**12**, 305–332.
Luedtke, A. R. and van der Laan, M. J. (2016) Optimal individualized treatments in resource-limited settings.

*The International Journal of Biostatistics*,**12**, 283–303. De Gruyter. DOI: 10.1515/ijb-2015-0007.
Montoya, L. M., van der Laan, M. J., Skeem, J. L., et al. (2023) Estimators for the value of the optimal dynamic treatment rule with application to criminal justice interventions.

*The International Journal of Biostatistics*,**19**, 239–259. De Gruyter. DOI: 10.1515/ijb-2020-0128.
Montoya, L. M., van der Laan, M. J., Luedtke, A. R., et al. (2023) The optimal dynamic treatment rule superlearner: Considerations, performance, and application to criminal justice interventions.

*The International Journal of Biostatistics*,**19**, 217–238. De Gruyter. DOI: 10.1515/ijb-2020-0127.
Munafò, M. R., Nosek, B. A., Bishop, D. V., et al. (2017) A manifesto for reproducible science.

*Nature Human Behaviour*,**1**, 0021. Nature Publishing Group.
Murphy, S. A. (2003) Optimal dynamic treatment regimes.

*Journal of the Royal Statistical Society: Series B (Statistical Methodology)*,**65**, 331–355. Wiley Online Library.
Naimi, A. I. and Balzer, L. B. (2018) Stacked generalization: An introduction to super learning.

*European Journal of Epidemiology*,**33**, 459–464. Springer.
Nature Editorial (Anonymous) (2015a) How scientists fool themselves — and how they can stop.

*Nature*,**526**. Springer Nature.
Nature Editorial (Anonymous) (2015b) Let’s think about cognitive bias.

*Nature*,**526**. Springer Nature. DOI: 10.1038/526163a.
Neyman, J. (1938) Contribution to the theory of sampling human populations.

*Journal of the American Statistical Association*,**33**, 101–116. Taylor & Francis.
Nguyen, T. Q., Schmid, I. and Stuart, E. A. (2019) Clarifying causal mediation analysis for the applied researcher: Defining effects based on what we want to learn.

*arXiv preprint arXiv:1904.08515*.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., et al. (2018) The preregistration revolution.

*Proceedings of the National Academy of Sciences*,**115**, 2600–2606. National Academy of Sciences.
Pearl, J. (1995) Causal diagrams for empirical research.

*Biometrika*,**82**, 669–688. Oxford University Press.
Pearl, J. (2001) Direct and indirect effects.

*arXiv preprint arXiv:1301.2300*.
Pearl, J. (2009)

*Causality: Models, Reasoning, and Inference*. Cambridge University Press.
Pearl, J. (2010) Brief report: On the consistency rule in causal inference: ‘Axiom, definition, assumption, or theorem?’

*Epidemiology*, 872–875. JSTOR.
Peng, R. (2015) The reproducibility crisis in science: A statistical counterattack.

*Significance*,**12**, 30–32. Wiley Online Library.
Petersen, M. L., Sinisi, S. E. and van der Laan, M. J. (2006) Estimation of direct causal effects.

*Epidemiology*, 276–284. JSTOR.
Phillips, R. V., van der Laan, M. J., Lee, H., et al. (2023) Practical considerations for specifying a super learner.

*International Journal of Epidemiology*. Oxford University Press. DOI: 10.1093/ije/dyad023.
Polley, E. C. and van der Laan, M. J. (2010)

*Super learner in prediction*. Division of Biostatistics, University of California, Berkeley; bepress.
Popper, K. (1934)

*The Logic of Scientific Discovery*. Routledge.
Pullenayegum, E. M., Platt, R. W., Barwick, M., et al. (2016) Knowledge translation in biostatistics: A survey of current practices, preferences, and barriers to the dissemination and uptake of new statistical methods.

*Statistics in Medicine*,**35**, 805–818. Wiley Online Library.
R Core Team (2021) *R: A language and environment for statistical computing*. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org/.

Robins, J. (1986) A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect.

*Mathematical Modelling*,**7**, 1393–1512. DOI: 10.1016/0270-0255(86)90088-6.
Robins, J. and Rotnitzky, A. (2014) Discussion of ‘dynamic treatment regimes: Technical challenges and applications’.

*Electronic Journal of Statistics*,**8**, 1273–1289. The Institute of Mathematical Statistics; the Bernoulli Society. DOI: 10.1214/14-EJS908.
Robins, J. M. (1986) A new approach to causal inference in mortality studies with sustained exposure periods — application to control of the healthy worker survivor effect.

*Mathematical Modelling*,**7**, 1393–1512.
Robins, J. M. (2004) Optimal structural nested models for optimal sequential decisions. In:

*Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data*, 2004, pp. 189–326. Springer New York. DOI: 10.1007/978-1-4419-9076-1_11.
Robins, J. M. and Greenland, S. (1992) Identifiability and exchangeability for direct and indirect effects.

*Epidemiology*, 143–155. JSTOR.
Robins, J. M. and Richardson, T. S. (2010) Alternative graphical causal models and the identification of direct effects.

*Causality and psychopathology: Finding the determinants of disorders and their cures*, 103–158. Oxford: Oxford University Press.
Rubin, D. B. (1978) Bayesian inference for causal effects: The role of randomization.

*The Annals of Statistics*, 34–58. JSTOR.
Rubin, D. B. (1980) Randomization analysis of experimental data: The Fisher randomization test comment.

*Journal of the American Statistical Association*,**75**, 591–593. JSTOR.
Rubin, D. B. (2005) Causal inference using potential outcomes: Design, modeling, decisions.

*Journal of the American Statistical Association*,**100**, 322–331. Taylor & Francis.
Rudolph, K. E., Sofrygin, O., Zheng, W., et al. (2017) Robust and flexible estimation of stochastic mediation effects: A proposed method and example in a randomized trial setting.

*Epidemiologic Methods*,**7**. De Gruyter.
Spirtes, P., Glymour, C. N., Scheines, R., et al. (2000)

*Causation, Prediction, and Search*. MIT Press.
Stark, P. B. and Saltelli, A. (2018) Cargo-cult statistics and scientific crisis.

*Significance*,**15**, 40–43. Wiley Online Library.
Stock, J. H. (1989) Nonparametric policy analysis.

*Journal of the American Statistical Association*,**84**, 567–575. Taylor & Francis Group.
Stromberg, A. et al. (2004) Why write statistical software? The case of robust statistical methods.

*Journal of Statistical Software*,**10**, 1–8.
Sutton, R. S., Barto, A. G., et al. (1998)

*Introduction to Reinforcement Learning*. MIT Press, Cambridge.
Szucs, D. and Ioannidis, J. (2017) When null hypothesis significance testing is unsuitable for research: A reassessment.

*Frontiers in Human Neuroscience*,**11**, 390. Frontiers.
Tchetgen Tchetgen, E. J. (2013) Inverse odds ratio-weighted estimation for causal mediation analysis.

*Statistics in Medicine*,**32**, 4567–4580. Wiley Online Library.
Tchetgen Tchetgen, E. J. and Shpitser, I. (2012) Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness, and sensitivity analysis.

*Annals of Statistics*,**40**, 1816–1845. DOI: 10.1214/12-AOS990.
Tchetgen Tchetgen, E. J. and VanderWeele, T. J. (2014) On identification of natural direct effects when a confounder of the mediator is directly affected by exposure.

*Epidemiology*,**25**, 282. NIH Public Access.
Textor, J., Hardt, J. and Knüppel, S. (2011) DAGitty: A graphical tool for analyzing causal diagrams.

*Epidemiology*,**22**, 745. LWW.
Tofail, F., Fernald, L. C., Das, K. K., et al. (2018) Effect of water quality, sanitation, hand washing, and nutritional interventions on child development in rural Bangladesh (WASH Benefits Bangladesh): A cluster-randomised controlled trial.

*The Lancet Child & Adolescent Health*,**2**, 255–268. Elsevier.
Tukey, J. W. (1962) The future of data analysis.

*The Annals of Mathematical Statistics*,**33**, 1–67. JSTOR.
van der Laan, M. J. and Dudoit, S. (2003)

*Unified cross-validation methodology for selection among estimators and a general cross-validated adaptive epsilon-net estimator: Finite sample oracle inequalities and examples*. Division of Biostatistics, University of California, Berkeley; bepress.
van der Laan, M. J. and Luedtke, A. (2015) Targeted learning of the mean outcome under an optimal dynamic treatment rule.

*Journal of Causal Inference*,**3**, 61–95.
van der Laan, M. J. and Rose, S. (2011)

*Targeted Learning: Causal Inference for Observational and Experimental Data*. Springer Science & Business Media.
van der Laan, M. J. and Rose, S. (2018)

*Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies*. Springer Science & Business Media.
van der Laan, M. J. and Starmans, R. J. (2014) Entering the era of data science: Targeted learning and the integration of statistics and computational data analysis.

*Advances in Statistics*,**2014**. Hindawi.
van der Laan, M. J., Dudoit, S. and Keles, S. (2004) Asymptotic optimality of likelihood-based cross-validation.

*Statistical Applications in Genetics and Molecular Biology*,**3**, 1–23. De Gruyter.
van der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007) Super Learner.

*Statistical Applications in Genetics and Molecular Biology*,**6**. De Gruyter.
van der Vaart, A. W., Dudoit, S. and van der Laan, M. J. (2006) Oracle inequalities for multi-fold cross validation.

*Statistics & Decisions*,**24**, 351–371. Oldenbourg Wissenschaftsverlag.
VanderWeele, T. (2015)

*Explanation in Causal Inference: Methods for Mediation and Interaction*. Oxford University Press.
VanderWeele, T. J., Vansteelandt, S. and Robins, J. M. (2014) Effect decomposition in the presence of an exposure-induced mediator-outcome confounder.

*Epidemiology*,**25**, 300.
Vansteelandt, S. and Daniel, R. M. (2017) Interventional effects for mediation analysis with multiple mediators.

*Epidemiology*,**28**, 258.
Vansteelandt, S. and VanderWeele, T. J. (2012) Natural direct and indirect effects on the exposed: Effect decomposition under weaker assumptions.

*Biometrics*,**68**, 1019–1027. Wiley Online Library.
Vansteelandt, S., Bekaert, M. and Lange, T. (2012) Imputation strategies for the estimation of natural direct and indirect effects.

*Epidemiologic Methods*,**1**, 131–158. De Gruyter.
Wickham, H. (2014)

*Advanced R*. Chapman and Hall/CRC.
Wright, S. (1934) The method of path coefficients.

*The Annals of Mathematical Statistics*,**5**, 161–215. JSTOR.
Young, J. G., Hernán, M. A. and Robins, J. M. (2014) Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data.

*Epidemiologic Methods*,**3**, 1–19. De Gruyter.
Zhang, B., Tsiatis, A. A., Davidian, M., et al. (2016) Estimating optimal treatment regimes from a classification perspective.

*Stat*,**5**, 278–278. DOI: 10.1002/sta4.124.
Zhao, Y., Zeng, D., Rush, A. J., et al. (2012) Estimating individualized treatment rules using outcome weighted learning.

*Journal of the American Statistical Association*,**107**, 1106–1118. Taylor & Francis. DOI: 10.1080/01621459.2012.695674.
Zheng, W. and van der Laan, M. J. (2011) Cross-validated targeted minimum-loss-based estimation. In

*Targeted Learning: Causal Inference for Observational and Experimental Data*(eds M. J. van der Laan and S. Rose), pp. 459–474. Springer. DOI: 10.1007/978-1-4419-9782-1_27.
Zheng, W. and van der Laan, M. J. (2012) Targeted maximum likelihood estimation of natural direct effects.

*International Journal of Biostatistics*,**8**. DOI: 10.2202/1557-4679.1361.