Detailed Record



An interpretable time series machine learning method for varying forecast and nowcast lengths in wastewater-based epidemiology


Abstract Wastewater-based epidemiology has emerged as a viable tool for monitoring disease prevalence in a population. This paper details a time series machine learning (TSML) method for predicting COVID-19 cases from wastewater and environmental variables. The TSML method utilizes a number of techniques to create an interpretable, hypothesis-driven framework for machine learning that can handle different nowcast and forecast lengths. Some of the techniques employed include:
• Feature engineering to construct interpretable features, like site-specific lead times, hypothesized to be potential predictors of COVID-19 cases.
• Feature selection to identify features with the best predictive performance for the tasks of nowcasting and forecasting.
• Prequential evaluation to prevent data leakage while evaluating the performance of the machine learning algorithm.
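The three techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`fit_line`, `prequential_mae`), the synthetic data, and the one-variable least-squares model are all assumptions chosen to keep the example short. It builds a lead-time feature (wastewater signal `lead` steps before the case count), scores each candidate lead with an expanding-window prequential evaluation in which the model is fitted only on targets observed up to the forecast origin, and selects the lead with the lowest error.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in model, an assumption)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # constant feature: fall back to the mean of y
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def prequential_mae(wastewater, cases, lead, horizon, min_train=5):
    """Expanding-window (prequential) mean absolute error.

    The feature for target time s is wastewater[s - lead]. Standing at
    origin t, the model is fitted on targets lead..t only, then predicts
    cases[t + horizon]. horizon = 0 is a nowcast.
    """
    if lead < horizon:
        # Otherwise the feature for the target would not be known at time t.
        raise ValueError("lead must be >= horizon to avoid data leakage")
    errors = []
    for t in range(lead + min_train - 1, len(cases) - horizon):
        train_targets = range(lead, t + 1)
        xs = [wastewater[u - lead] for u in train_targets]
        ys = [cases[u] for u in train_targets]
        a, b = fit_line(xs, ys)
        s = t + horizon
        errors.append(abs(a + b * wastewater[s - lead] - cases[s]))
    return mean(errors)


# Synthetic toy data (not from the paper): cases follow wastewater by 3 steps.
wastewater = [(7 * i * i + 3 * i) % 11 for i in range(30)]
cases = [0.0, 0.0, 0.0] + [2 * wastewater[t - 3] + 1 for t in range(3, 30)]

# Feature selection: keep the candidate lead with the lowest prequential error.
horizon = 2
scores = {lead: prequential_mae(wastewater, cases, lead, horizon)
          for lead in range(2, 7)}
best_lead = min(scores, key=scores.get)
print("selected lead:", best_lead)
```

Because every prediction uses only data available at the forecast origin, the error estimate is not inflated by future information, which is the leakage concern the abstract raises.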
Authors M. L. Lai (University of Wyoming), Shaun S. Wulff (University of Wyoming), Yongtao Cao, Timothy J. Robinson (University of Wyoming), R.R.L.U.I. Rajapaksha
Journal Info Elsevier BV | MethodsX , vol: 11 , pages: 102382 - 102382
Publication Date 12/1/2023
ISSN 2215-0161
Type article
Open Access gold
DOI https://doi.org/10.1016/j.mex.2023.102382
Keywords Transfer Learning (Score: 0.520224)