References
Baker, M. (2016) Is there a reproducibility crisis? A Nature survey lifts the lid on how researchers view the crisis rocking science and what they think will help. Nature, 533, 452–455. Nature Publishing Group.
Bengtsson, H. (2021) A unifying framework for parallel and distributed processing in R using futures. The R Journal. DOI: 10.32614/RJ-2021-048.
Benkeser, D. and van der Laan, M. J. (2016) The highly adaptive lasso estimator. In 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2016. IEEE. DOI: 10.1109/dsaa.2016.93.
Breiman, L. (1996) Stacked regressions. Machine Learning, 24, 49–64. Springer.
Breiman, L. (2001) Random forests. Machine Learning, 45, 5–32. Springer.
Buckheit, J. B. and Donoho, D. L. (1995) Wavelab and reproducible research. In Wavelets and Statistics, pp. 55–81. Springer.
Coyle, J. R. and Hejazi, N. S. (2018) Origami: A generalized framework for cross-validation in R. Journal of Open Source Software, 3. The Open Journal. DOI: 10.21105/joss.00512.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (2021) Targeting Learning: Robust statistics for reproducible research. arXiv. Available at: https://arxiv.org/abs/2006.07333.
Coyle, J. R., Hejazi, N. S., Phillips, R. V., et al. (2022) hal9001: The Scalable Highly Adaptive Lasso. DOI: 10.5281/zenodo.3558313.
Coyle, J. R., Hejazi, N. S., Malenica, I., et al. (n.d.) origami: Generalized framework for cross-validation. DOI: 10.5281/zenodo.835602.
Davison, A. C. and Hinkley, D. V. (1997) Bootstrap Methods and Their Application. Cambridge University Press.
Díaz, I. and van der Laan, M. J. (2011) Super learner based conditional density estimation with application to marginal structural models. The International Journal of Biostatistics, 7, 1–20. De Gruyter.
Díaz, I. and van der Laan, M. J. (2012) Population intervention causal effects based on stochastic interventions. Biometrics, 68, 541–549. Wiley Online Library.
Díaz, I. and van der Laan, M. J. (2018) Stochastic treatment regimes. In Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies, pp. 167–180. Springer Science & Business Media.
Dudoit, S. and van der Laan, M. J. (2005) Asymptotics of cross-validated risk estimation in estimator selection and performance assessment. Statistical Methodology, 2, 131–154. Elsevier.
Haneuse, S. and Rotnitzky, A. (2013) Estimation of the effect of interventions that modify the received treatment. Statistics in Medicine, 32, 5260–5277. Wiley Online Library.
Hejazi, N. S., Coyle, J. R. and van der Laan, M. J. (2020) hal9001: Scalable highly adaptive lasso regression in R. Journal of Open Source Software. The Open Journal. DOI: 10.21105/joss.02526.
Hejazi, N. S., Benkeser, D. C. and van der Laan, M. J. (2022) haldensify: Highly Adaptive Lasso Conditional Density Estimation. https://github.com/nhejazi/haldensify. DOI: 10.5281/zenodo.3698329.
Munafò, M. R., Nosek, B. A., Bishop, D. V., et al. (2017) A manifesto for reproducible science. Nature Human Behaviour, 1, 0021. Nature Publishing Group.
Naimi, A. I. and Balzer, L. B. (2018) Stacked generalization: An introduction to super learning. European Journal of Epidemiology, 33, 459–464. Springer.
Nature Editorial (Anonymous) (2015a) How scientists fool themselves — and how they can stop. Nature, 526. Springer Nature.
Nature Editorial (Anonymous) (2015b) Let’s think about cognitive bias. Nature, 526. Springer Nature. DOI: 10.1038/526163a.
Nosek, B. A., Ebersole, C. R., DeHaven, A. C., et al. (2018) The preregistration revolution. Proceedings of the National Academy of Sciences, 115, 2600–2606. National Academy of Sciences.
Peng, R. (2015) The reproducibility crisis in science: A statistical counterattack. Significance, 12, 30–32. Wiley Online Library.
Phillips, R. V., van der Laan, M. J., Lee, H., et al. (2022) Practical considerations for specifying a super learner. arXiv. DOI: 10.48550/ARXIV.2204.06139.
Polley, E. C. and van der Laan, M. J. (2010) Super learner in prediction. Division of Biostatistics, University of California, Berkeley; bepress.
Pullenayegum, E. M., Platt, R. W., Barwick, M., et al. (2016) Knowledge translation in biostatistics: A survey of current practices, preferences, and barriers to the dissemination and uptake of new statistical methods. Statistics in Medicine, 35, 805–818. Wiley Online Library.
R Core Team (2021) R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Available at: https://www.R-project.org/.
Stark, P. B. and Saltelli, A. (2018) Cargo-cult statistics and scientific crisis. Significance, 15, 40–43. Wiley Online Library.
Stromberg, A. et al. (2004) Why write statistical software? The case of robust statistical methods. Journal of Statistical Software, 10, 1–8.
Szucs, D. and Ioannidis, J. (2017) When null hypothesis significance testing is unsuitable for research: A reassessment. Frontiers in Human Neuroscience, 11, 390. Frontiers.
Tofail, F., Fernald, L. C., Das, K. K., et al. (2018) Effect of water quality, sanitation, hand washing, and nutritional interventions on child development in rural Bangladesh (WASH Benefits Bangladesh): A cluster-randomised controlled trial. The Lancet Child & Adolescent Health, 2, 255–268. Elsevier.
van der Laan, M. J. and Dudoit, S. (2003) Unified cross-validation methodology for selection among estimators and a general cross-validated adaptive epsilon-net estimator: Finite sample oracle inequalities and examples. Division of Biostatistics, University of California, Berkeley; bepress.
van der Laan, M. J. and Rose, S. (2011) Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Science & Business Media.
van der Laan, M. J. and Starmans, R. J. (2014) Entering the era of data science: Targeted learning and the integration of statistics and computational data analysis. Advances in Statistics, 2014. Hindawi.
van der Laan, M. J., Dudoit, S. and Keles, S. (2004) Asymptotic optimality of likelihood-based cross-validation. Statistical Applications in Genetics and Molecular Biology, 3, 1–23.
van der Laan, M. J., Polley, E. C. and Hubbard, A. E. (2007) Super Learner. Statistical Applications in Genetics and Molecular Biology, 6.
van der Vaart, A. W., Dudoit, S. and van der Laan, M. J. (2006) Oracle inequalities for multi-fold cross validation. Statistics & Decisions, 24, 351–371. Oldenbourg Wissenschaftsverlag.
Wickham, H. (2014) Advanced R. Chapman and Hall/CRC.
Young, J. G., Hernán, M. A. and Robins, J. M. (2014) Identification, estimation and approximation of risk under interventions that depend on the natural value of treatment using observational data. Epidemiologic Methods, 3, 1–19. De Gruyter.