Food and Bioprocess Technology, vol. 19, no. 4, 2026 (SCI-Expanded, Scopus)
Abstract: Food spoilage prediction is a critical challenge in food safety and quality management, particularly for meat products, which exhibit complex microbiological and biochemical dynamics. This study presents an explainable machine learning framework for predicting sausage spoilage intensity from volatile organic compound (VOC) profiles and physicochemical parameters, enhanced through Generative Adversarial Network (GAN)-based data augmentation. The proposed framework integrates interpretable machine learning models (random forest, gradient boosting, logistic regression, multi-layer perceptron, and a voting classifier) with the TVAESynthesizer generative model to address data scarcity and imbalance in experimental food datasets. SHapley Additive exPlanations (SHAP) were employed to quantify the contribution of individual VOCs and physicochemical variables to spoilage classification, thereby enhancing model transparency and biological interpretability. Models trained on GAN-augmented datasets substantially outperformed models trained on the original data alone. For poultry sausages, the gradient boosting and random forest models achieved an accuracy of 0.92, while for pork sausages both models reached an accuracy of 0.89. In addition, regenerating the synthetic data within each fold during cross-validation yielded highly stable model performance: random forest and gradient boosting achieved accuracies and F1-scores above 0.90 for poultry sausages and consistently robust peak accuracies of around 0.89 for pork sausages, confirming the reliability of the GAN-augmented training strategy. SHAP analysis revealed that sampling time and pH are the dominant predictors of spoilage for both poultry and pork sausages, with alcohol-related volatile compounds such as 1-propanol, 2-butanone, and 2-butanol driving predictions in poultry, and ethyl acetate, methanethiol, dimethyl sulfide, and hexanal playing a major role in pork spoilage classification. 
Overall, integrating generative modeling with explainable AI significantly improves both predictive accuracy and interpretability. The proposed framework offers a sustainable, data-efficient, and interpretable solution for real-time, non-destructive monitoring of meat freshness and quality. Graphic Abstract: (Figure presented.)
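
The fold-wise regeneration of synthetic data mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the per-class Gaussian sampler (a stand-in for a fitted generative model such as TVAESynthesizer), the nearest-centroid classifier (a stand-in for random forest or gradient boosting), and all function names are assumptions made for illustration. The point it shows is only the protocol: in each fold the generator is fitted on the training split alone, and accuracy is measured on real held-out samples.

```python
import random
from statistics import mean, stdev


def make_toy_data(n_per_class=12, seed=0):
    """Hypothetical two-feature samples (say, pH and one VOC signal), labels 0/1."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, (5.8, 1.0)), (1, (6.6, 3.0))):
        for _ in range(n_per_class):
            data.append(([rng.gauss(c, 0.3) for c in centre], label))
    rng.shuffle(data)
    return data


def group_by_class(rows):
    by_class = {}
    for x, y in rows:
        by_class.setdefault(y, []).append(x)
    return by_class


def synthesize(train, n_new, rng):
    """Stand-in generator: per-class, per-feature Gaussians fitted to `train` only.

    A real pipeline would fit a generative model at this step instead.
    """
    synthetic = []
    for y, xs in group_by_class(train).items():
        cols = list(zip(*xs))
        mus = [mean(c) for c in cols]
        sds = [stdev(c) if len(c) > 1 else 0.0 for c in cols]
        for _ in range(n_new):
            synthetic.append(([rng.gauss(m, s) for m, s in zip(mus, sds)], y))
    return synthetic


def fit_centroids(rows):
    """Nearest-centroid classifier, used here only to keep the sketch short."""
    return {y: [mean(c) for c in zip(*xs)] for y, xs in group_by_class(rows).items()}


def predict(centroids, x):
    def sq_dist(y):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[y]))
    return min(centroids, key=sq_dist)


def foldwise_cv(data, k=4, n_new=20, seed=1):
    """k-fold CV in which synthetic rows are regenerated inside every fold."""
    rng = random.Random(seed)
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i, test in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        # Fit the generator on the training split only, so that no information
        # from the held-out fold reaches the synthetic rows.
        augmented = train + synthesize(train, n_new, rng)
        model = fit_centroids(augmented)
        # Evaluate on real held-out samples, never on synthetic ones.
        correct = sum(predict(model, x) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


if __name__ == "__main__":
    print(foldwise_cv(make_toy_data()))
```

Fitting the generator once on the full dataset before splitting would let held-out samples shape the synthetic training rows and could inflate the reported scores; regenerating inside each fold avoids that.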