An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology


Abstract Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include:
• Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases.
• Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting.
• Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm.
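The prequential evaluation named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `prequential_splits` and `prequential_mae`, the expanding training window, and the carry-forward placeholder model are all assumptions made for illustration. The idea shown is only that each prediction is scored using a model that has seen strictly earlier observations, so no future data leaks into training.

```python
from typing import Iterator, List, Sequence, Tuple


def prequential_splits(
    n: int, initial_train: int, horizon: int = 1
) -> Iterator[Tuple[range, int]]:
    """Yield (train_indices, test_index) pairs with an expanding window.

    Hypothetical helper: the training window always ends before the test
    point, and the test point lies `horizon` steps after the last
    training observation.
    """
    if initial_train < 1 or horizon < 1:
        raise ValueError("initial_train and horizon must be at least 1")
    for end in range(initial_train, n - horizon + 1):
        yield range(end), end + horizon - 1


def prequential_mae(
    series: Sequence[float], initial_train: int, horizon: int = 1
) -> float:
    """Mean absolute error of a carry-forward forecast, scored prequentially.

    The carry-forward rule is a stand-in for a real model (an assumption,
    not the method from the paper).
    """
    errors: List[float] = []
    for train, test in prequential_splits(len(series), initial_train, horizon):
        prediction = series[train[-1]]  # last value seen in the training window
        errors.append(abs(series[test] - prediction))
    if not errors:
        raise ValueError("series is too short for the requested split")
    return sum(errors) / len(errors)
```

Under these assumptions, calling `prequential_mae(cases, initial_train=3, horizon=2)` on a hypothetical list `cases` scores two-step-ahead forecasts; changing `horizon` is one simple way to mimic different forecast lengths.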
Authors M. L. Lai (University of Wyoming), Shaun S. Wulff (University of Wyoming), Yongtao Cao, Timothy J. Robinson (University of Wyoming), R.R.L.U.I. Rajapaksha
Journal Info Elsevier BV | MethodsX, vol: 11, pages: 102382
Publication Date 12/1/2023
ISSN 2215-0161
Type article
Open Access gold
DOI https://doi.org/10.1016/j.mex.2023.102382
Keywords Transfer Learning (Score: 0.520224)