

Avin, Chen, Ilya Shpitser, and Judea Pearl. 2005. “Identifiability of Path-Specific Effects.” In IJCAI International Joint Conference on Artificial Intelligence, 357–63.

Baker, Monya. 2016. “Is There a Reproducibility Crisis? A Nature Survey Lifts the Lid on How Researchers View the Crisis Rocking Science and What They Think Will Help.” Nature 533 (7604): 452–55.

Bembom, Oliver, and Mark J van der Laan. 2007. “A Practical Illustration of the Importance of Realistic Individualized Treatment Rules in Causal Inference.” Electronic Journal of Statistics 1: 574–96.

Bengtsson, Henrik. 2021. “A Unifying Framework for Parallel and Distributed Processing in R Using Futures.” The R Journal. https://doi.org/10.32614/RJ-2021-048.

Benkeser, David, and Jialu Ran. 2021. “Nonparametric Inference for Interventional Effects with Multiple Mediators.” Journal of Causal Inference. https://doi.org/10.1515/jci-2020-0018.

Benkeser, David, and Mark J van der Laan. 2016. “The Highly Adaptive Lasso Estimator.” In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA). IEEE. https://doi.org/10.1109/dsaa.2016.93.

Breiman, Leo. 1996. “Stacked Regressions.” Machine Learning 24 (1): 49–64.

———. 2001. “Random Forests.” Machine Learning 45 (1): 5–32.

Buckheit, Jonathan B, and David L Donoho. 1995. “Wavelab and Reproducible Research.” In Wavelets and Statistics, 55–81. Springer.

Chakraborty, Bibhas, and Erica EM Moodie. 2013. Statistical Methods for Dynamic Treatment Regimes: Reinforcement Learning, Causal Inference, and Personalized Medicine (Statistics for Biology and Health). Springer.

Coyle, Jeremy R, and Nima S Hejazi. 2018. “Origami: A Generalized Framework for Cross-Validation in R.” Journal of Open Source Software 3 (21). https://doi.org/10.21105/joss.00512.

Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, and Rachael V Phillips. n.d. origami: Generalized Framework for Cross-Validation (version 1.0.5). https://doi.org/10.5281/zenodo.835602.

Coyle, Jeremy R, Nima S Hejazi, Ivana Malenica, Rachael V Phillips, Benjamin F Arnold, Andrew Mertens, Jade Benjamin-Chung, et al. 2021. “Targeting Learning: Robust Statistics for Reproducible Research.” arXiv. https://arxiv.org/abs/2006.07333.

Coyle, Jeremy R, Nima S Hejazi, Rachael V Phillips, Lars WP van der Laan, and Mark J van der Laan. 2022. hal9001: The Scalable Highly Adaptive Lasso. https://doi.org/10.5281/zenodo.3558313.

Davison, Anthony Christopher, and David Victor Hinkley. 1997. Bootstrap Methods and Their Application. Cambridge University Press.

Dawid, A Philip. 2000. “Causal Inference Without Counterfactuals.” Journal of the American Statistical Association 95 (450): 407–24.

Didelez, Vanessa, Philip Dawid, and Sara Geneletti. 2006. “Direct and Indirect Effects of Sequential Treatments.” In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence, 138–46.

Díaz, Iván, and Nima S Hejazi. 2020. “Causal Mediation Analysis for Stochastic Interventions.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 82 (3): 661–83. https://doi.org/10.1111/rssb.12362.

Díaz, Iván, Nima S Hejazi, Kara E Rudolph, and Mark J van der Laan. 2020. “Non-Parametric Efficient Causal Mediation with Intermediate Confounders.” Biometrika. https://doi.org/10.1093/biomet/asaa085.

Díaz, Iván, and Mark J van der Laan. 2013. “Sensitivity Analysis for Causal Inference Under Unmeasured Confounding and Measurement Error Problems.” The International Journal of Biostatistics 9 (2): 149–60.

Díaz, Iván, and Mark J van der Laan. 2011. “Super Learner Based Conditional Density Estimation with Application to Marginal Structural Models.” The International Journal of Biostatistics 7 (1): 1–20.

———. 2012. “Population Intervention Causal Effects Based on Stochastic Interventions.” Biometrics 68 (2): 541–49.

———. 2018. “Stochastic Treatment Regimes.” In Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, 167–80. Springer Science & Business Media.

Donoho, David. 2017. “50 Years of Data Science.” Journal of Computational and Graphical Statistics 26 (4): 745–66.

Dudoit, Sandrine, and Mark J van der Laan. 2005. “Asymptotics of Cross-Validated Risk Estimation in Estimator Selection and Performance Assessment.” Statistical Methodology 2 (2): 131–54.

Fisher, Ronald Aylmer. 1946. Statistical Methods for Research Workers. 10th ed. Oliver and Boyd.

Gruber, Susan, Rachael V Phillips, Hana Lee, John Concato, and Mark van der Laan. 2022. “Evaluating and Improving Real-World Evidence with Targeted Learning.” arXiv Preprint arXiv:2208.07283.

Gruber, Susan, Rachael V Phillips, Hana Lee, Martin Ho, John Concato, and Mark J van der Laan. 2022. “Targeted Learning: Towards a Future Informed by Real-World Evidence.” arXiv Preprint arXiv:2205.08643.

Haneuse, Sebastian, and Andrea Rotnitzky. 2013. “Estimation of the Effect of Interventions That Modify the Received Treatment.” Statistics in Medicine 32 (30): 5260–77.

Hejazi, Nima S. 2021. “Semiparametric Statistical Methods for Causal Inference with Stochastic Treatment Regimes.” PhD thesis, University of California, Berkeley. https://www.stat.berkeley.edu/~nhejazi/publications/thesis-phd-biostat.pdf.

Hejazi, Nima S, David C Benkeser, and Mark J van der Laan. 2022. haldensify: Highly Adaptive Lasso Conditional Density Estimation. https://github.com/nhejazi/haldensify. https://doi.org/10.5281/zenodo.3698329.

Hejazi, Nima S, Jeremy R Coyle, and Mark J van der Laan. 2020. “hal9001: Scalable Highly Adaptive Lasso Regression in R.” Journal of Open Source Software. https://doi.org/10.21105/joss.02526.

Hejazi, Nima S, Kara E Rudolph, Mark J van der Laan, and Iván Díaz. 2022. “Nonparametric Causal Mediation Analysis for Stochastic Interventional (in)direct Effects.” Biostatistics (in press). https://doi.org/10.1093/biostatistics/kxac002.

Hejazi, Nima S, Mark J van der Laan, Holly E Janes, Peter B Gilbert, and David C Benkeser. 2020. “Efficient Nonparametric Inference on the Effects of Stochastic Interventions Under Two-Phase Sampling, with Applications to Vaccine Efficacy Trials.” Biometrics. https://doi.org/10.1111/biom.13375.

Hernán, Miguel A, and James M Robins. 2022. Causal Inference: What If. CRC Press.

Holland, Paul W. 1986. “Statistics and Causal Inference.” Journal of the American Statistical Association 81 (396): 945–60.

Imai, Kosuke, Luke Keele, and Teppei Yamamoto. 2010. “Identification, Inference and Sensitivity Analysis for Causal Mediation Effects.” Statistical Science, 51–71.

Imbens, Guido W, and Donald B Rubin. 2015. Causal Inference in Statistics, Social, and Biomedical Sciences. Cambridge University Press.

Kennedy, Edward H. 2016. “Semiparametric Theory and Empirical Processes in Causal Inference.” In Statistical Causal Inferences and Their Applications in Public Health Research, 141–67. Springer.

———. 2019. “Nonparametric Causal Effects Based on Incremental Propensity Score Interventions.” Journal of the American Statistical Association 114 (526): 645–56.

Lok, Judith J. 2016. “Defining and Estimating Causal Direct and Indirect Effects When Setting the Mediator to Specific Values Is Not Feasible.” Statistics in Medicine 35 (22): 4008–20.

Luedtke, Alexander R, and Mark J van der Laan. 2016. “Optimal Individualized Treatments in Resource-Limited Settings.” International Journal of Biostatistics 12 (1): 283–303.

Luedtke, Alex, and Mark J van der Laan. 2016. “Super-Learning of an Optimal Dynamic Treatment Rule.” International Journal of Biostatistics 12 (1): 305–32.

Montoya, Lina, Jennifer Skeem, Mark van der Laan, and Maya Petersen. 2021. “Performance and Application of Estimators for the Value of an Optimal Dynamic Treatment Rule.” http://arxiv.org/abs/2101.12333.

Montoya, Lina, Mark J van der Laan, Alexander Luedtke, Jennifer Skeem, Jeremy Coyle, and Maya Petersen. 2021. “The Optimal Dynamic Treatment Rule SuperLearner: Considerations, Performance, and Application.” http://arxiv.org/abs/2101.12326.

Munafò, Marcus R, Brian A Nosek, Dorothy VM Bishop, Katherine S Button, Christopher D Chambers, Nathalie Percie Du Sert, Uri Simonsohn, Eric-Jan Wagenmakers, Jennifer J Ware, and John PA Ioannidis. 2017. “A Manifesto for Reproducible Science.” Nature Human Behaviour 1 (1): 0021.

Murphy, Susan A. 2003. “Optimal Dynamic Treatment Regimes.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65 (2): 331–55.

Naimi, Ashley I, and Laura B Balzer. 2018. “Stacked Generalization: An Introduction to Super Learning.” European Journal of Epidemiology 33 (5): 459–64.

Nature Editorial (Anonymous). 2015a. “How Scientists Fool Themselves — and How They Can Stop.” Nature 526 (7572).

———. 2015b. “Let’s Think About Cognitive Bias.” Nature 526 (7572). https://doi.org/10.1038/526163a.

Neyman, Jerzy. 1938. “Contribution to the Theory of Sampling Human Populations.” Journal of the American Statistical Association 33 (201): 101–16.

Nguyen, Trang Quynh, Ian Schmid, and Elizabeth A Stuart. 2019. “Clarifying Causal Mediation Analysis for the Applied Researcher: Defining Effects Based on What We Want to Learn.” arXiv Preprint arXiv:1904.08515.

Nosek, Brian A, Charles R Ebersole, Alexander C DeHaven, and David T Mellor. 2018. “The Preregistration Revolution.” Proceedings of the National Academy of Sciences 115 (11): 2600–2606.

Pearl, Judea. 1995. “Causal Diagrams for Empirical Research.” Biometrika 82 (4): 669–88.

———. 2001. “Direct and Indirect Effects.” arXiv Preprint arXiv:1301.2300.

———. 2009. Causality: Models, Reasoning, and Inference. Cambridge University Press.

———. 2010. “Brief Report: On the Consistency Rule in Causal Inference: ‘Axiom, Definition, Assumption, or Theorem?’.” Epidemiology, 872–75.

Peng, Roger. 2015. “The Reproducibility Crisis in Science: A Statistical Counterattack.” Significance 12 (3): 30–32.

Petersen, Maya L, Sandra E Sinisi, and Mark J van der Laan. 2006. “Estimation of Direct Causal Effects.” Epidemiology, 276–84.

Phillips, Rachael V, Mark J van der Laan, Hana Lee, and Susan Gruber. 2022. “Practical Considerations for Specifying a Super Learner.” https://doi.org/10.48550/ARXIV.2204.06139.

Polley, Eric C, and Mark J van der Laan. 2010. “Super Learner in Prediction.” Division of Biostatistics, University of California, Berkeley; bepress.

Popper, Karl. 1934. The Logic of Scientific Discovery. Routledge.

Pullenayegum, Eleanor M, Robert W Platt, Melanie Barwick, Brian M Feldman, Martin Offringa, and Lehana Thabane. 2016. “Knowledge Translation in Biostatistics: A Survey of Current Practices, Preferences, and Barriers to the Dissemination and Uptake of New Statistical Methods.” Statistics in Medicine 35 (6): 805–18.

R Core Team. 2021. “R: A Language and Environment for Statistical Computing.” Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Robins, James M. 1986. “A New Approach to Causal Inference in Mortality Studies with a Sustained Exposure Period—Application to Control of the Healthy Worker Survivor Effect.” Mathematical Modelling 7 (9): 1393–1512. https://doi.org/10.1016/0270-0255(86)90088-6.

———. 2004. “Optimal Structural Nested Models for Optimal Sequential Decisions.” In Proceedings of the Second Seattle Symposium in Biostatistics: Analysis of Correlated Data, 189–326. Springer New York. https://doi.org/10.1007/978-1-4419-9076-1_11.

Robins, James M, and Sander Greenland. 1992. “Identifiability and Exchangeability for Direct and Indirect Effects.” Epidemiology, 143–55.

Robins, James M, and Thomas S Richardson. 2010. “Alternative Graphical Causal Models and the Identification of Direct Effects.” Causality and Psychopathology: Finding the Determinants of Disorders and Their Cures, 103–58.

Robins, James, and Andrea Rotnitzky. 2014. “Discussion of ‘Dynamic Treatment Regimes: Technical Challenges and Applications’.” Electronic Journal of Statistics 8 (1): 1273–89. https://doi.org/10.1214/14-EJS908.

Rubin, Donald B. 1978. “Bayesian Inference for Causal Effects: The Role of Randomization.” The Annals of Statistics, 34–58.

———. 1980. “Randomization Analysis of Experimental Data: The Fisher Randomization Test Comment.” Journal of the American Statistical Association 75 (371): 591–93.

———. 2005. “Causal Inference Using Potential Outcomes: Design, Modeling, Decisions.” Journal of the American Statistical Association 100 (469): 322–31.

Rudolph, Kara E, Oleg Sofrygin, Wenjing Zheng, and Mark J van der Laan. 2017. “Robust and Flexible Estimation of Stochastic Mediation Effects: A Proposed Method and Example in a Randomized Trial Setting.” Epidemiologic Methods 7 (1).

Spirtes, Peter, Clark N Glymour, Richard Scheines, David Heckerman, Christopher Meek, Gregory Cooper, and Thomas Richardson. 2000. Causation, Prediction, and Search. MIT Press.

Stark, Philip B, and Andrea Saltelli. 2018. “Cargo-Cult Statistics and Scientific Crisis.” Significance 15 (4): 40–43.

Stock, James H. 1989. “Nonparametric Policy Analysis.” Journal of the American Statistical Association 84 (406): 567–75.

Stromberg, Arnold, and others. 2004. “Why Write Statistical Software? The Case of Robust Statistical Methods.” Journal of Statistical Software 10 (5): 1–8.

Sutton, Richard S, Andrew G Barto, and others. 1998. Introduction to Reinforcement Learning. Vol. 135. MIT Press.

Szucs, Denes, and John Ioannidis. 2017. “When Null Hypothesis Significance Testing Is Unsuitable for Research: A Reassessment.” Frontiers in Human Neuroscience 11: 390.

Tchetgen Tchetgen, Eric J. 2013. “Inverse Odds Ratio-Weighted Estimation for Causal Mediation Analysis.” Statistics in Medicine 32 (26): 4567–80.

Tchetgen Tchetgen, Eric J, and Ilya Shpitser. 2012. “Semiparametric Theory for Causal Mediation Analysis: Efficiency Bounds, Multiple Robustness, and Sensitivity Analysis.” Annals of Statistics 40 (3): 1816–45. https://doi.org/10.1214/12-AOS990.

Tchetgen Tchetgen, Eric J, and Tyler J VanderWeele. 2014. “On Identification of Natural Direct Effects When a Confounder of the Mediator Is Directly Affected by Exposure.” Epidemiology 25 (2): 282.

Textor, Johannes, Juliane Hardt, and Sven Knüppel. 2011. “DAGitty: A Graphical Tool for Analyzing Causal Diagrams.” Epidemiology 22 (5): 745.

Tofail, Fahmida, Lia CH Fernald, Kishor K Das, Mahbubur Rahman, Tahmeed Ahmed, Kaniz K Jannat, Leanne Unicomb, et al. 2018. “Effect of Water Quality, Sanitation, Hand Washing, and Nutritional Interventions on Child Development in Rural Bangladesh (WASH Benefits Bangladesh): A Cluster-Randomised Controlled Trial.” The Lancet Child & Adolescent Health 2 (4): 255–68.

Tukey, John W. 1962. “The Future of Data Analysis.” The Annals of Mathematical Statistics 33 (1): 1–67.

van der Laan, Mark J, and Sandrine Dudoit. 2003. “Unified Cross-Validation Methodology for Selection Among Estimators and a General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities and Examples.” Division of Biostatistics, University of California, Berkeley; bepress.

van der Laan, Mark J, Sandrine Dudoit, and Sunduz Keles. 2004. “Asymptotic Optimality of Likelihood-Based Cross-Validation.” Statistical Applications in Genetics and Molecular Biology 3 (1): 1–23.

van der Laan, Mark J, and Alex Luedtke. 2015. “Targeted Learning of the Mean Outcome Under an Optimal Dynamic Treatment Rule.” Journal of Causal Inference 3 (1): 61–95.

van der Laan, Mark J, Eric C Polley, and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1).

van der Laan, Mark J, and Sherri Rose. 2011. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.

———. 2018. Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Science & Business Media.

van der Laan, Mark J, and Richard JCM Starmans. 2014. “Entering the Era of Data Science: Targeted Learning and the Integration of Statistics and Computational Data Analysis.” Advances in Statistics 2014.

van der Vaart, Aad W, Sandrine Dudoit, and Mark J van der Laan. 2006. “Oracle Inequalities for Multi-Fold Cross Validation.” Statistics & Decisions 24 (3): 351–71.

VanderWeele, Tyler. 2015. Explanation in Causal Inference: Methods for Mediation and Interaction. Oxford University Press.

VanderWeele, Tyler J, Stijn Vansteelandt, and James M Robins. 2014. “Effect Decomposition in the Presence of an Exposure-Induced Mediator-Outcome Confounder.” Epidemiology 25 (2): 300.

Vansteelandt, Stijn, Maarten Bekaert, and Theis Lange. 2012. “Imputation Strategies for the Estimation of Natural Direct and Indirect Effects.” Epidemiologic Methods 1 (1): 131–58.

Vansteelandt, Stijn, and Rhian M Daniel. 2017. “Interventional Effects for Mediation Analysis with Multiple Mediators.” Epidemiology 28 (2): 258.

Vansteelandt, Stijn, and Tyler J VanderWeele. 2012. “Natural Direct and Indirect Effects on the Exposed: Effect Decomposition Under Weaker Assumptions.” Biometrics 68 (4): 1019–27.

Wickham, Hadley. 2014. Advanced R. Chapman and Hall/CRC.

Wright, Sewall. 1934. “The Method of Path Coefficients.” The Annals of Mathematical Statistics 5 (3): 161–215.

Young, Jessica G, Miguel A Hernán, and James M Robins. 2014. “Identification, Estimation and Approximation of Risk Under Interventions That Depend on the Natural Value of Treatment Using Observational Data.” Epidemiologic Methods 3 (1): 1–19.

Zhang, Baqun, Anastasios A Tsiatis, Marie Davidian, Min Zhang, and Eric Laber. 2016. “Estimating Optimal Treatment Regimes from a Classification Perspective.” Stat 5 (1): 278. https://doi.org/10.1002/sta4.124.

Zhao, Yingqi, Donglin Zeng, A John Rush, and Michael R Kosorok. 2012. “Estimating Individualized Treatment Rules Using Outcome Weighted Learning.” Journal of the American Statistical Association 107 (499): 1106–18. https://doi.org/10.1080/01621459.2012.695674.

Zheng, Wenjing, and Mark J van der Laan. 2010. “Asymptotic Theory for Cross-validated Targeted Maximum Likelihood Estimation.” U.C. Berkeley Division of Biostatistics Working Paper Series.

———. 2012. “Targeted Maximum Likelihood Estimation of Natural Direct Effects.” International Journal of Biostatistics 8 (1). https://doi.org/10.2202/1557-4679.1361.