Journal of Agricultural Sciences, vol.30, no.1, pp.118-130, 2024 (SCI-Expanded)
The presence of Salmonella in agricultural waters may be a source of produce contamination. Recently, various algorithms have been tested for predicting indicator bacteria populations and pathogen occurrence in agricultural water sources. The purpose of this study was to evaluate whether feature selection with meta-heuristic optimization algorithms improves the success of commonly used algorithms in predicting Salmonella occurrence in agricultural waters. Previously collected datasets from six agricultural ponds in Central Florida included the populations of indicator microorganisms, physicochemical water attributes, and weather station measurements. PCR-confirmed Salmonella presence was also reported in the datasets. Features were selected using binary meta-heuristic optimization methods: differential evolution optimization (DEO), grey wolf optimization (GWO), Harris hawks optimization (HHO) and particle swarm optimization (PSO). Each meta-heuristic method was run 100 times to select features before classification analysis. The features selected by optimization were then used in the K-nearest neighbor (kNN), support vector machine (SVM) and decision tree (DT) classification methods. Microbiological indicators were ranked as the first or second feature by all optimization algorithms. Generic Escherichia coli was selected as the first feature 81 and 91 times out of 100 using GWO and DEO, respectively. Feature selection with the meta-heuristic optimization algorithms followed by machine learning classification yielded prediction accuracies between 93.57% and 95.55%. Meta-heuristic optimization algorithms improved Salmonella prediction success in agricultural waters despite spatio-temporal variations.
This study indicates that computer-based tools built on improved meta-heuristic optimization algorithms can help growers assess the risk of Salmonella occurrence in specific agricultural water sources with greater prediction success.
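The workflow summarized in the abstract (binary meta-heuristic feature selection, repeated over many runs, followed by classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the study's implementation: the data are synthetic, only a basic binary PSO with a sigmoid transfer function is shown, the fitness is leave-one-out kNN accuracy minus a small penalty on the number of selected features, and the hyperparameters (swarm size, iterations, inertia and acceleration weights, penalty) are arbitrary. The function names are hypothetical.

```python
import numpy as np

def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of a plain k-nearest-neighbour classifier."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)              # a sample may not vote for itself
    nearest = np.argsort(d, axis=1)[:, :k]   # indices of the k closest samples
    votes = y[nearest].mean(axis=1)          # fraction of positive neighbours
    return float(((votes > 0.5).astype(int) == y).mean())

def fitness(mask, X, y, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected feature."""
    if not mask.any():
        return 0.0                           # an empty feature set is useless
    return knn_loo_accuracy(X[:, mask], y) - penalty * mask.mean()

def binary_pso(X, y, n_particles=10, n_iter=20, seed=0):
    """Basic binary PSO; returns (boolean feature mask, its fitness)."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]
    pos = (rng.random((n_particles, n_feat)) < 0.5).astype(int)
    vel = rng.normal(0.0, 1.0, (n_particles, n_feat))
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p.astype(bool), X, y) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        vel = np.clip(vel, -6.0, 6.0)
        prob = 1.0 / (1.0 + np.exp(-vel))    # sigmoid transfer function
        pos = (rng.random(vel.shape) < prob).astype(int)
        fit = np.array([fitness(p.astype(bool), X, y) for p in pos])
        better = fit > pbest_fit
        pbest[better] = pos[better]
        pbest_fit[better] = fit[better]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)

def make_synthetic(n=60, n_feat=6, seed=42):
    """Synthetic stand-in data: only feature 0 carries signal."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, n)
    X = rng.normal(size=(n, n_feat))
    X[:, 0] += 3.0 * y
    return X, y

def selection_counts(X, y, n_runs=10):
    """Count how often each feature is selected across independent runs."""
    counts = np.zeros(X.shape[1], dtype=int)
    for run in range(n_runs):
        mask, _ = binary_pso(X, y, seed=run)
        counts += mask
    return counts

if __name__ == "__main__":
    X, y = make_synthetic()
    print("selection counts per feature:", selection_counts(X, y))
```

Counting selections across repeated runs mirrors the idea of ranking features by how often an optimizer picks them. In practice the selected columns would be passed to a library classifier (kNN, SVM or DT) and evaluated on held-out data rather than with the leave-one-out shortcut used here.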