\(\DeclareMathOperator{\expit}{expit}\) \(\DeclareMathOperator{\logit}{logit}\) \(\DeclareMathOperator*{\argmin}{\arg\!\min}\) \(\newcommand{\indep}{\perp\!\!\!\perp}\) \(\newcommand{\coloneqq}{\mathrel{=}}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\E}{\mathbb{E}}\) \(\newcommand{\M}{\mathcal{M}}\) \(\renewcommand{\P}{\mathbb{P}}\) \(\newcommand{\I}{\mathbb{I}}\) \(\newcommand{\1}{\mathbbm{1}}\)

References

Avin, C., Shpitser, I. and Pearl, J. (2005) Identifiability of path-specific effects. In: IJCAI international joint conference on artificial intelligence, 2005, pp. 357–363.
Baker, M. (2016) Is there a reproducibility crisis? A Nature survey lifts the lid on how researchers view the crisis rocking science and what they think will help. Nature, 533, 452–455. Nature Publishing Group.
Bembom, O. and van der Laan, M. J. (2007) A practical illustration of the importance of realistic individualized treatment rules in causal inference. Electronic Journal of Statistics, 1, 574–596.
Bengtsson, H. (2021) A unifying framework for parallel and distributed processing in R using futures. The R Journal. DOI: 10.32614/RJ-2021-048.
Benkeser, D. and Ran, J. (2021) Nonparametric inference for interventional effects with multiple mediators. Journal of Causal Inference. De Gruyter. DOI: 10.1515/jci-2020-0018.
Benkeser, D. and van der Laan, M. J. (2016) The highly adaptive lasso estimator. In: 2016 IEEE international conference on data science and advanced analytics (DSAA), 2016. IEEE. DOI: 10.1109/dsaa.2016.93.
Breiman, L. (1996) Stacked regressions. Machine Learning, 24, 49–64. Springer.
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32. Springer.
Buckheit, J. B. and Donoho, D. L. (1995) WaveLab and reproducible research. In Wavelets and Statistics, pp. 55–81. Springer.
Chakraborty, B. and Moodie, E. E. (2013) Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine (Statistics for Biology and Health). Springer.
Coyle, J. R. and Hejazi, N. S. (2018) Origami: A generalized framework for cross-validation in R. Journal of Open Source Software, 3. The Open Journal. DOI: 10.21105/joss.00512.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (2021) Targeting Learning: Robust statistics for reproducible research. arXiv. Available at: https://arxiv.org/abs/2006.07333.
Coyle, J. R., Hejazi, N. S., Phillips, R. V., et al. (2022) hal9001: The Scalable Highly Adaptive Lasso. DOI: 10.5281/zenodo.3558313.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (n.d.) origami: Generalized Framework for Cross-Validation. DOI: 10.5281/zenodo.835602.
Davison, A. C. and Hinkley, D. V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Dawid, A. P. (2000) Causal inference without counterfactuals. Journal of the American Statistical Association, 95, 407–424. Taylor & Francis.
Didelez, V., Dawid, P. and Geneletti, S. (2006) Direct and indirect effects of sequential treatments. In: Proceedings of the 22nd annual conference on uncertainty in artificial intelligence, 2006, pp. 138–146.
Díaz, I. and Hejazi, N. S. (2020) Causal mediation analysis for stochastic interventions. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 82, 661–683. Wiley Online Library. DOI: 10.1111/rssb.12362.
Díaz, I. and van der Laan, M. J. (2011) Super learner based conditional density estimation with application to marginal structural models. The International Journal of Biostatistics, 7, 1–20. De Gruyter.
Díaz, I. and van der Laan, M. J. (2012) Population intervention causal effects based on stochastic interventions. Biometrics, 68, 541–549. Wiley Online Library.
Díaz, I. and van der Laan, M. J. (2013) Sensitivity analysis for causal inference under unmeasured confounding and measurement error problems. The International Journal of Biostatistics, 9, 149–160. De Gruyter. DOI: 10.1515/ijb-2013-0004.
Díaz, I. and van der Laan, M. J. (2018) Stochastic treatment regimes. In Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, pp. 167–180. Springer Science & Business Media.
Díaz, I., Hejazi, N. S., Rudolph, K. E., et al. (2020) Non-parametric efficient causal mediation with intermediate confounders. Biometrika. Oxford University Press. DOI: 10.1093/biomet/asaa085.
Donoho, D. (2017) 50 years of data science. Journal of Computational and Graphical Statistics, 26, 745–766. Taylor & Francis.
Dudoit, S. and van der Laan, M. J. (2005) Asymptotics of cross-validated risk estimation in estimator selection and performance assessment. Statistical Methodology, 2, 131–154. Elsevier.
Fisher, R. A. (1946) Statistical Methods for Research Workers. 10th ed. Oliver & Boyd.
Gruber, S., Phillips, R. V., Lee, H., et al. (2022) Evaluating and improving real-world evidence with targeted learning. arXiv preprint arXiv:2208.07283.
Gruber, S., Phillips, R. V., Lee, H., et al. (2023) Targeted Learning: Toward a future informed by real-world evidence. Statistics in Biopharmaceutical Research. Taylor & Francis. DOI: 10.1080/19466315.2023.2182356.
Haneuse, S. and Rotnitzky, A. (2013) Estimation of the effect of interventions that modify the received treatment. Statistics in Medicine, 32, 5260–5277. Wiley Online Library.
Hejazi, N. S. (2021) Semiparametric statistical methods for causal inference with stochastic treatment regimes. PhD thesis. University of California, Berkeley. Available at: https://www.stat.berkeley.edu/~nhejazi/publications/thesis-phd-biostat.pdf.
Hejazi, N. S., van der Laan, M. J., Janes, H. E., et al. (2020) Efficient nonparametric inference on the effects of stochastic interventions under two-phase sampling, with applications to vaccine efficacy trials. Biometrics. Wiley Online Library. DOI: 10.1111/biom.13375.
Hejazi, N. S., Coyle, J. R. and van der Laan, M. J. (2020) hal9001: Scalable highly adaptive lasso regression in R. Journal of Open Source Software. The Open Journal. DOI: 10.21105/joss.02526.
Hejazi, N. S., Benkeser, D. C. and van der Laan, M. J. (2022) haldensify: Highly Adaptive Lasso Conditional Density Estimation. https://github.com/nhejazi/haldensify. DOI: 10.5281/zenodo.3698329.
Hejazi, N. S., Rudolph, K. E., van der Laan, M. J., et al. (2022) Nonparametric causal mediation analysis for stochastic interventional (in)direct effects. Biostatistics, (in press). Oxford University Press. DOI: 10.1093/biostatistics/kxac002.
Hernán, M. A. and Robins, J. M. (2022) Causal Inference: What If. CRC Press.
Holland, P. W. (1986) Statistics and causal inference. Journal of the American Statistical Association, 81, 945–960. Taylor & Francis.
Imai, K., Keele, L. and Yamamoto, T. (2010) Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science, 51–71. JSTOR.
Imbens, G. W. and Rubin, D. B. (2015) Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press.
Kennedy, E. H. (2016) Semiparametric theory and empirical processes in causal inference. In Statistical Causal Inferences and Their Applications in Public Health Research, pp. 141–167. Springer.
Kennedy, E. H. (2019) Nonparametric causal effects based on incremental propensity score interventions. Journal of the American Statistical Association, 114, 645–656. Taylor & Francis.
Lok, J. J. (2016) Defining and estimating causal direct and indirect effects when setting the mediator to specific values is not feasible. Statistics in Medicine, 35, 4008–4020. Wiley Online Library.
Luedtke, A. and van der Laan, M. J. (2016) Super-learning of an optimal dynamic treatment rule. International Journal of Biostatistics, 12, 305–332.
Luedtke, A. R. and van der Laan, M. J. (2016) Optimal individualized treatments in resource-limited settings. The International Journal of Biostatistics, 12, 283–303. De Gruyter. DOI: 10.1515/ijb-2015-0007.
Montoya, L. M., van der Laan, M. J., Skeem, J. L., et al. (2023) Estimators for the value of the optimal dynamic treatment rule with application to criminal justice interventions. The International Journal of Biostatistics, 19, 239–259. De Gruyter. DOI: 10.1515/ijb-2020-0128.
Montoya, L. M., van der Laan, M. J., Luedtke, A. R., et al. (2023) The optimal dynamic treatment rule superlearner: Considerations, performance, and application to criminal justice interventions. The International Journal of Biostatistics, 19, 217–238. De Gruyter. DOI: 10.1515/ijb-2020-0127.
Munafò, M. R., Nosek, B. A., Bishop, D. V., et al. (2017) A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. Nature Publishing Group.
Murphy, S. A. (2003) Optimal dynamic treatment regimes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 65, 331–355. Wiley Online Library.
Naimi, A. I. and Balzer, L. B. (2018) Stacked generalization: An introduction to super learning. European Journal of Epidemiology, 33, 459–464. Springer.
Nature Editorial (Anonymous) (2015a) How scientists fool themselves — and how they can stop. Nature, 526. Springer Nature.
Nature Editorial (Anonymous) (2015b) Let’s think about cognitive bias. Nature, 526. Springer Nature. DOI: 10.1038/526163a.
Neyman, J. (1938) Contribution to the theory of sampling human populations. Journal of the American Statistical Association, 33, 101–116. Taylor & Francis.
Nguyen, T. Q., Schmid, I. and Stuart, E. A. (2019) Clarifying causal mediation analysis for the applied researcher: Defining effects based on what we want to learn. arXiv preprint arXiv:1904.08515.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., et al. (2018) The preregistration revolution. Proceedings of the National Academy of Sciences, 115, 2600–2606. National Academy of Sciences.
Pearl, J. (1995) Causal diagrams for empirical research. Biometrika, 82, 669–688. Oxford University Press.
Pearl, J. (2001) Direct and indirect effects. arXiv preprint arXiv:1301.2300.
Pearl, J. (2009) Causality: Models, Reasoning, and Inference. Cambridge University Press.
Pearl, J. (2010) Brief report: On the consistency rule in causal inference: ‘Axiom, definition, assumption, or theorem?’ Epidemiology, 872–875. JSTOR.
Peng, R. (2015) The reproducibility crisis in science: A statistical counterattack. Significance, 12, 30–32. Wiley Online Library.
Petersen, M. L., Sinisi, S. E. and van der Laan, M. J. (2006) Estimation of direct causal effects. Epidemiology, 276–284. JSTOR.
Phillips, R. V., van der Laan, M. J., Lee, H., et al. (2023) Practical considerations for specifying a super learner. International Journal of Epidemiology. Oxford University Press. DOI: 10.1093/ije/dyad023.
Polley, E. C. and van der Laan, M. J. (2010) Super learner in prediction. Division of Biostatistics, University of California, Berkeley; bepress.
Popper, K. (1934) The Logic of Scientific Discovery. Routledge.
Pullenayegum, E. M., Platt, R. W., Barwick, M., et al. (2016) Knowledge translation in biostatistics: A survey of current practices, preferences, and barriers to the dissemination and uptake of new statistical methods. Statistics in Medicine, 35, 805–818. Wiley Online Library.
R Core Team (2021) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org/.
Robins, J. (1986) A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling, 7, 1393–1512. DOI: 10.1016/0270-0255(86)90088-6.
Robins, J. and Rotnitzky, A. (2014) Discussion of ‘dynamic treatment regimes: Technical challenges and applications’. Electron. J. Statist., 8, 1273–1289. The Institute of Mathematical Statistics; the Bernoulli Society. DOI: 10.1214/14-EJS908.
Robins, J. M. (2004) Optimal structural nested models for optimal sequential decisions. In: Proceedings of the second Seattle symposium in biostatistics: Analysis of correlated data, 2004, pp. 189–326. Springer New York. DOI: 10.1007/978-1-4419-9076-1_11.
Robins, J. M. and Greenland, S. (1992) Identifiability and exchangeability for direct and indirect effects. Epidemiology, 143–155. JSTOR.
Robins, J. M. and Richardson, T. S. (2010) Alternative graphical causal models and the identification of direct effects. Causality and psychopathology: Finding the determinants of disorders and their cures, 103–158. Oxford: Oxford University Press.
Rubin, D. B. (1978) Bayesian inference for causal effects: The role of randomization. The Annals of Statistics, 34–58. JSTOR.
Rubin, D. B. (1980) Randomization analysis of experimental data: The Fisher randomization test comment. Journal of the American Statistical Association, 75, 591–593. JSTOR.
Rubin, D. B. (2005) Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100, 322–331. Taylor & Francis.
Rudolph, K. E., Sofrygin, O., Zheng, W., et al. (2017) Robust and flexible estimation of stochastic mediation effects: A proposed method and example in a randomized trial setting. Epidemiologic Methods, 7. De Gruyter.
Spirtes, P., Glymour, C. N., Scheines, R., et al. (2000) Causation, Prediction, and Search. MIT Press.
Stark, P. B. and Saltelli, A. (2018) Cargo-cult statistics and scientific crisis. Significance, 15, 40–43. Wiley Online Library.
Stock, J. H. (1989) Nonparametric policy analysis. Journal of the American Statistical Association, 84, 567–575. Taylor & Francis Group.
Stromberg, A. et al. (2004) Why write statistical software? The case of robust statistical methods. Journal of Statistical Software, 10, 1–8.
Sutton, R. S., Barto, A. G., et al. (1998) Introduction to Reinforcement Learning. Cambridge: MIT Press.
Szucs, D. and Ioannidis, J. (2017) When null hypothesis significance testing is unsuitable for research: A reassessment. Frontiers in Human Neuroscience, 11, 390. Frontiers.
Tchetgen Tchetgen, E. J. (2013) Inverse odds ratio-weighted estimation for causal mediation analysis. Statistics in Medicine, 32, 4567–4580. Wiley Online Library.
Tchetgen Tchetgen, E. J. and Shpitser, I. (2012) Semiparametric theory for causal mediation analysis: Efficiency bounds, multiple robustness, and sensitivity analysis. Annals of Statistics, 40, 1816–1845. DOI: 10.1214/12-AOS990.
Tchetgen Tchetgen, E. J. and VanderWeele, T. J. (2014) On identification of natural direct effects when a confounder of the mediator is directly affected by exposure. Epidemiology, 25, 282. NIH Public Access.
Textor, J., Hardt, J. and Knüppel, S. (2011) DAGitty: A graphical tool for analyzing causal diagrams. Epidemiology, 22, 745. LWW.
Tofail, F., Fernald, L. C., Das, K. K., et al. (2018) Effect of water quality, sanitation, hand washing, and nutritional interventions on child development in rural Bangladesh (WASH Benefits Bangladesh): A cluster-randomised controlled trial. The Lancet Child & Adolescent Health, 2, 255–268. Elsevier.
Tukey, J. W. (1962) The future of data analysis. The Annals of Mathematical Statistics, 33, 1–67. JSTOR.
van der Laan, M. J. and Dudoit, S. (2003) Unified cross-validation methodology for selection among estimators and a general cross-validated adaptive epsilon-net estimator: Finite sample oracle inequalities and examples. Division of Biostatistics, University of California, Berkeley; bepress.
van der Laan, M. J. and Luedtke, A. (2015) Targeted learning of the mean outcome under an optimal dynamic treatment rule. Journal of Causal Inference, 3, 61–95.
van der Laan, M. J. and Rose, S. (2011) Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.
van der Laan, M. J. and Rose, S. (2018) Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media.
van der Laan, M. J. and Starmans, R. J. (2014) Entering the era of data science: Targeted learning and the integration of statistics and computational data analysis. Advances in Statistics, 2014. Hindawi.
van der Laan, M. J., Dudoit, S. and Keles, S. (2004) Asymptotic optimality of likelihood-based cross-validation. Statistical Applications in Genetics and Molecular Biology, 3, 1–23. De Gruyter.
van der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007) Super Learner. Statistical Applications in Genetics and Molecular Biology, 6. De Gruyter.
van der Vaart, A. W., Dudoit, S. and van der Laan, M. J. (2006) Oracle inequalities for multi-fold cross validation. Statistics & Decisions, 24, 351–371. Oldenbourg Wissenschaftsverlag.
VanderWeele, T. (2015) Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press.
VanderWeele, T. J., Vansteelandt, S. and Robins, J. M. (2014) Effect decomposition in the presence of an exposure-induced mediator-outcome confounder. Epidemiology, 25, 300.
Vansteelandt, S. and Daniel, R. M. (2017) Interventional effects for mediation analysis with multiple mediators. Epidemiology, 28, 258.
Vansteelandt, S. and VanderWeele, T. J. (2012) Natural direct and indirect effects on the exposed: Effect decomposition under weaker assumptions. Biometrics, 68, 1019–1027. Wiley Online Library.
Vansteelandt, S., Bekaert, M. and Lange, T. (2012) Imputation strategies for the estimation of natural direct and indirect effects. Epidemiologic Methods, 1, 131–158. De Gruyter.
Wickham, H. (2014) Advanced R. Chapman & Hall/CRC.
Wright, S. (1934) The method of path coefficients. The Annals of Mathematical Statistics, 5, 161–215. JSTOR.
Young, J. G., Hernán, M. A. and Robins, J. M. (2014) Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data. Epidemiologic Methods, 3, 1–19. De Gruyter.
Zhang, B., Tsiatis, A. A., Davidian, M., et al. (2016) Estimating optimal treatment regimes from a classification perspective. Stat, 5, 278–278. DOI: 10.1002/sta4.124.
Zhao, Y., Zeng, D., Rush, A. J., et al. (2012) Estimating individualized treatment rules using outcome weighted learning. Journal of the American Statistical Association, 107, 1106–1118. Taylor & Francis. DOI: 10.1080/01621459.2012.695674.
Zheng, W. and van der Laan, M. J. (2011) Cross-validated targeted minimum-loss-based estimation. In Targeted Learning: Causal Inference for Observational and Experimental Data (eds M. J. van der Laan and S. Rose), pp. 459–474. Springer. DOI: 10.1007/978-1-4419-9782-1_27.
Zheng, W. and van der Laan, M. J. (2012) Targeted maximum likelihood estimation of natural direct effects. International Journal of Biostatistics, 8. DOI: 10.2202/1557-4679.1361.