Automating wastewater characteristic parameter quantitation using neural architecture search in AutoML systems on spectral reflectance data

Document Type

Article

Publication Title

Scientific Reports

Abstract

Wastewater (WW) analyses are required to establish optimal operation of treatment facilities, and rapid treatment assessment can greatly improve operational efficiency. Wastewater treatment plants (WWTPs) function through complex biochemical processes that exhibit high variability and are difficult to predict. This work presents the first in-depth study of an original application of Automated Machine Learning (AutoML) models to accurately predict wastewater quality parameters. It aims to demonstrate the efficiency of neural network (NN) regression models built with Neural Architecture Search (NAS) using different search algorithms (random search, grid search, Bayesian Optimization, and Hyperband search) to predict the concentrations of six WW parameters: biochemical oxygen demand (BOD), chemical oxygen demand (COD), ammonia (NH3-N), total dissolved solids (TDS), total alkalinity (TA), and total hardness (TH). The models are trained on WW spectral reflectance data in the visible-near infrared range (400–2000 nm). The input variables are the spectral reflectance intensities of wastewater, and the output variables are the levels of the six WW parameters. The search algorithms were used to identify NN architectures for single-target and multi-target (two to six target variables) prediction. Experimental results indicate that the NN architecture optimized via Bayesian Optimization outperforms the others, achieving the lowest mean absolute error (MAE) and a high coefficient of determination (R²). For single-target predictions on the validation set, the model attained R² values of 0.9770 for BOD, 0.9860 for COD, 0.9800 for NH3-N, 0.9776 for TDS, 0.9847 for TA, and 0.9639 for TH. For two-target predictions on the validation set, the model obtained R² values of 0.988 for BOD and 0.993 for COD. Similarly, the R² values for three-target predictions were 0.961 for BOD, 0.962 for COD, and 0.946 for NH3-N; for four-target predictions, 0.972 for BOD, 0.970 for COD, 0.955 for NH3-N, and 0.972 for TDS; for five-target predictions, 0.966 for BOD, 0.966 for COD, 0.951 for NH3-N, 0.966 for TDS, and 0.966 for TA; and for six-target predictions, 0.987 for BOD, 0.99 for COD, 0.982 for NH3-N, 0.99 for TDS, 0.99 for TA, and 0.718 for TH. In the six-target case, the Hyperband search NN predicted TH better than the Bayesian Optimization NN (R² of 0.899 versus 0.718), with R² values of 0.9772 for BOD, 0.976 for COD, 0.975 for NH3-N, 0.976 for TDS, 0.976 for TA, and 0.899 for TH. This work automates the neural network architecture optimization process using NAS techniques. These methods systematically explore the hyperparameter space, removing the need for manual trial-and-error tuning across different datasets and prediction tasks (both single-target and multi-target). The resulting neural network models are further interpreted using LIME (Local Interpretable Model-agnostic Explanations), an Explainable Artificial Intelligence (XAI) method.
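
The abstract reports a mean absolute error and a coefficient of determination for each target variable. As a minimal sketch of how such per-target scores can be computed for a multi-target regressor, assuming the standard textbook definitions of both metrics, the Python below evaluates each output column separately. The function names and the sample values are hypothetical and invented for illustration; they are not the study's code or data.

```python
from typing import Dict, List, Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the measured values are constant")
    return 1.0 - ss_res / ss_tot


def per_target_scores(
    y_true: List[Sequence[float]],
    y_pred: List[Sequence[float]],
    names: Sequence[str],
) -> Dict[str, Dict[str, float]]:
    """Score each target column of a multi-target prediction separately.

    Rows are samples; column j holds the values for names[j].
    """
    scores = {}
    for j, name in enumerate(names):
        col_true = [row[j] for row in y_true]
        col_pred = [row[j] for row in y_pred]
        scores[name] = {
            "mae": mean_absolute_error(col_true, col_pred),
            "r2": r2_score(col_true, col_pred),
        }
    return scores


# Invented two-target example (BOD, COD); not values from the study.
measured = [[10.0, 20.0], [20.0, 40.0], [30.0, 60.0]]
predicted = [[12.0, 20.0], [20.0, 40.0], [28.0, 60.0]]
scores = per_target_scores(measured, predicted, ["BOD", "COD"])
for target, metrics in scores.items():
    print(target, metrics)
```

Scoring each column separately is what allows one target (such as TH in the six-target case) to score much lower than the others even though a single network produces all outputs.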

DOI

10.1038/s41598-025-21069-4

Publication Date

12-1-2025
