Not just "big" data: Importance of sample size, measurement error, and uninformative predictors for developing prognostic models for digital interventions.

Citation metadata

Publisher: Elsevier Science Publishers
Document Type: Report; Brief article
Length: 326 words

Abstract:

Keywords: Treatment prediction; Machine learning; Precision medicine; Clinical trials; Simulations

Highlights:
* Prediction of complex, non-linear treatment prognosis requires thousands of subjects.
* Moderate measurement error erases benefits of complex machine learning models.
* Digital interventions may yield an effective way to study prognostic treatment models.

There is strong interest in developing a more efficient mental health care system. Digital interventions and predictive models of treatment prognosis will likely play an important role in this endeavor. This article reviews the application of popular machine learning models to the prediction of treatment prognosis, with a particular focus on digital interventions. Assuming that the prediction of treatment prognosis will involve modeling a complex combination of interacting features with measurement error in both the predictors and outcomes, our simulations suggest that to optimize complex prediction models, sample sizes in the thousands will be required. Machine learning methods capable of discovering complex interactions and nonlinear effects (e.g., decision tree ensembles such as gradient boosted machines) perform particularly well in large samples when the predictors and outcomes have virtually no measurement error. However, in the presence of moderate measurement error, these methods provide little or no benefit over regularized linear regression, even with very large sample sizes (N = 100,000) and a non-linear ground truth. Given these sample size requirements, we argue that the scalability of digital interventions, especially when used in combination with optimal measurement practices, provides one of the most effective ways to study treatment prediction models. We conclude with suggestions about how to implement these algorithms into clinical practice.
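
The measurement-error point in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation design. It assumes a single predictor, a hypothetical quadratic ground truth, and a reliability of 0.6 as an arbitrary stand-in for "moderate" measurement error. A quantile-binned mean (a piecewise-constant fit, used here as a crude stand-in for tree-based learners) is compared with a simple linear fit on held-out data. All function names are illustrative.

```python
import random
from bisect import bisect_right
from statistics import fmean


def simulate(n, reliability, rng):
    """Return n noisy (x_obs, y_obs) pairs from a non-linear ground truth."""
    # Hypothetical latent truth: y = x + 0.5 * (x^2 - 1) with x ~ N(0, 1),
    # so var(x) = 1 and var(y) = 1.5. Error SDs are chosen so that
    # var(true) / var(observed) equals `reliability` for both variables.
    x_err_sd = ((1 - reliability) / reliability) ** 0.5
    y_err_sd = (1.5 * (1 - reliability) / reliability) ** 0.5
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        y = x + 0.5 * (x * x - 1)
        xs.append(x + rng.gauss(0, x_err_sd))
        ys.append(y + rng.gauss(0, y_err_sd))
    return xs, ys


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor; returns a predict function."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_binned(xs, ys, n_bins=40):
    """Piecewise-constant fit: mean outcome within quantile bins of x."""
    order = sorted(xs)
    # Interior cut points at empirical quantiles of the training predictor.
    cuts = [order[len(order) * k // n_bins] for k in range(1, n_bins)]
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        b = bisect_right(cuts, x)
        sums[b] += y
        counts[b] += 1
    overall = fmean(ys)  # fallback for an empty bin
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[bisect_right(cuts, x)]


def r_squared(predict, xs, ys):
    """Proportion of outcome variance explained on the given data."""
    my = fmean(ys)
    sse = sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))
    sst = sum((y - my) ** 2 for y in ys)
    return 1 - sse / sst


def compare(reliability, n_train=20000, n_test=20000, seed=0):
    """Fit both models on training data; return held-out (linear, binned) R^2."""
    rng = random.Random(seed)
    x_tr, y_tr = simulate(n_train, reliability, rng)
    x_te, y_te = simulate(n_test, reliability, rng)
    linear = fit_linear(x_tr, y_tr)
    binned = fit_binned(x_tr, y_tr)
    return r_squared(linear, x_te, y_te), r_squared(binned, x_te, y_te)


if __name__ == "__main__":
    for rel in (1.0, 0.6):
        r2_lin, r2_bin = compare(rel)
        print(f"reliability={rel}: linear R2={r2_lin:.3f}, binned R2={r2_bin:.3f}")
```

Under these assumptions, the advantage of the flexible fit over the linear fit should be clearly visible at reliability 1.0 and should narrow considerably at 0.6, even though the ground truth stays non-linear. This one-predictor toy only shows the direction of the effect; it does not reproduce the sample sizes, models, or results reported in the article.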
Author Affiliation: Department of Psychology and Institute for Mental Health Research, University of Texas at Austin, USA
* Corresponding author. University of Texas at Austin, Institute for Mental Health Research, 108 E Dean Keeton St, Austin, TX 78712, USA.
Article History: Received 18 June 2021; Revised 11 March 2022; Accepted 5 April 2022
Byline: Mary E. McNamara [molly.mcnamara@utexas.edu] (*), Mackenzie Zisser, Christopher G. Beevers, Jason Shumake [shumake@utexas.edu] (**)

Source Citation

Gale Document Number: GALE|A703074267