Predicting the risk of GenX contamination in private well water using a machine-learned Bayesian network model

By Javad Roostaei, Sarah Colley, Riley Mulhern, Andrew A. May, and Jacqueline MacDonald Gibson
Journal of Hazardous Materials
January 19, 2021
DOI: 10.1016/j.jhazmat.2021.125075

Per- and polyfluoroalkyl substances (PFAS) are emerging contaminants that pose significant challenges in mechanistic fate and transport modeling due to their diverse and complex chemical characteristics. Machine learning provides a novel approach for predicting the spatial distribution of PFAS in the environment. We used spatial location information to link PFAS measurements from 1,207 private drinking water wells around a fluorochemical manufacturing facility to a mechanistic model of PFAS air deposition and to publicly available data on soil, land use, topography, weather, and proximity to multiple PFAS sources. We used the resulting linked data set to train a Bayesian network model to predict the risk that GenX, a member of the PFAS class, would exceed a state provisional health goal (140 ng/L) in private well water. The model had high accuracy (ROC curve index for five-fold cross-validation of 0.85, 90% CI 0.84-0.87). Among factors significantly associated with GenX risk in private wells, the most important was the historic rate of atmospheric deposition of GenX from the fluorochemical manufacturing facility. The model output was used to generate spatial risk predictions for the study area to aid in risk assessment, environmental investigations, and targeted public health interventions.
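
As a rough illustration of the evaluation metric mentioned above, the sketch below labels each well by whether its GenX concentration exceeds the 140 ng/L health goal and then computes the area under the ROC curve for a set of predicted risks. It is not the authors' code: the concentrations and predicted risks are made-up values, the function names are hypothetical, and the AUC is computed with a simple pairwise rank formulation rather than whatever software the study used. In a cross-validated setting such as the five-fold scheme described above, the same calculation would be applied to predictions for held-out wells.

```python
# Illustrative only: synthetic values, not data from the study.
HEALTH_GOAL_NG_L = 140.0  # state provisional health goal cited in the abstract


def exceeds_goal(concentration_ng_l):
    """Binary label: 1 if the measured GenX concentration exceeds the goal."""
    return 1 if concentration_ng_l > HEALTH_GOAL_NG_L else 0


def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise rank formulation.

    Equals the probability that a randomly chosen exceeding well receives a
    higher predicted risk than a randomly chosen non-exceeding well; ties
    count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical measured concentrations (ng/L) and model-predicted risks.
measured = [12.0, 300.0, 95.0, 150.0, 40.0, 500.0]
predicted_risk = [0.10, 0.80, 0.40, 0.35, 0.20, 0.90]

labels = [exceeds_goal(c) for c in measured]
auc = roc_auc(labels, predicted_risk)
print(f"ROC AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to ranking wells no better than chance and 1.0 to perfect ranking, which gives a sense of scale for the 0.85 reported above.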
