Speak and You Shall Predict: Evidence That Speech at Initial Cocaine Abstinence Is a Biomarker of Long-Term Drug Use Behavior

Abstract

Background: Valid, scalable biomarkers for predicting longitudinal clinical outcomes in psychiatric research are crucial for optimizing intervention and prevention efforts. Here, we recorded spontaneous speech from initially abstinent individuals with cocaine use disorder (iCUDs) for use in predicting drug use outcomes.

Methods: At baseline, 88 iCUDs provided 5-minute speech samples describing the positive consequences of quitting drug use and the negative consequences of using drugs. Outcomes, including withdrawal, craving, abstinence days, and recent cocaine use, were assessed at 3-month intervals for up to 1 year (57 iCUDs were included in the analyses). Predictive modeling compared natural language processing (NLP) techniques, specifically sentence embeddings with established inventories as targets, against models using standard demographic and baseline psychometric variables.

Results: At short time intervals, maximal predictive power was obtained with non-NLP models that also incorporated baseline values of the same drug use measures that served as outcomes, potentially reflecting the slow rate of change of these measures, which could be estimated by linear functions. However, for longer-term predictions, models using speech samples alone yielded statistically significant results, with Spearman r ≥ 0.46 and 80% accuracy for predicting abstinence. Speech samples may therefore capture nonlinear dynamics over extended intervals more effectively than traditional measures. These results need to be replicated in larger, independent samples.

Conclusions: Compared with the common outcome measures used in clinical trials, speech-based measures could be leveraged as better predictors of longitudinal drug use outcomes in initially abstinent iCUDs, with potential generalizability to other subgroups with cocaine addiction and to additional substance use disorders and related comorbidity.
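The Results are summarized by two evaluation metrics: the Spearman rank correlation between predicted and observed outcomes, and classification accuracy for abstinence. As a minimal sketch of how such metrics can be computed, and not the authors' analysis code, the following Python uses only the standard library. All function names and all example values are hypothetical and invented for illustration; none come from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def accuracy(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Fraction of cases where the predicted label equals the observed label."""
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Invented toy data (NOT from the study): predicted vs. observed outcome scores
# for five hypothetical participants, and binary abstinence labels (1 = abstinent).
predicted_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
observed_scores = [2.0, 1.0, 4.0, 3.0, 5.0]
predicted_abstinent = [1, 0, 1, 1, 0]
observed_abstinent = [1, 0, 0, 1, 0]

rho = spearman(predicted_scores, observed_scores)
acc = accuracy(predicted_abstinent, observed_abstinent)
print(f"Spearman rho = {rho:.2f}, accuracy = {acc:.0%}")
```

Because the Spearman coefficient depends only on rank order, it rewards predictions that order participants correctly even when the relationship between predicted and observed values is not linear, which makes it a natural choice when the underlying dynamics may be nonlinear.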