Hybrid Intelligence For Precision Milk Quality Assessment Integrating Naïve Bayes And K-Means Clustering
Abstract
Milk quality is a fundamental factor in the food industry, directly affecting consumer health and product market value. Conventional manual assessment is often too slow for large-scale operations, which calls for rapid and accurate automated solutions. This study proposes a hybrid machine learning approach that integrates the Naïve Bayes algorithm for quality classification (Low, Medium, High) with K-Means clustering for diagnostic analysis based on physicochemical similarities. A dataset of 1,059 samples with seven sensory and physicochemical attributes (pH, temperature, taste, odor, fat, turbidity, and color) was cleaned, normalized, and modeled. In the evaluation, Naïve Bayes achieved an accuracy of 85.38%. Notably, the model achieved 100% precision for the low-quality class, meaning that every sample it flagged as low quality was in fact low quality, which makes it a reliable food safety filter. Concurrently, K-Means with k = 5 (selected as the optimal value after testing k = 2 to 7) segmented the data into five distinct diagnostic clusters. Centroid analysis revealed that temperature and odor are the primary factors distinguishing the root causes of quality degradation. The study concludes that combining classification and clustering enhances quality control efficiency by providing both quality labels and diagnostic insight into the factors driving milk spoilage.
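The pipeline summarized in the abstract (normalize, classify with Naïve Bayes, report per-class precision, then compare K-Means runs for k = 2 to 7) can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the Gaussian variant of Naïve Bayes, min-max scaling as the normalization step, inertia as the quantity compared across k, all function names, and the synthetic two-feature data (pH and temperature with invented class profiles), which stands in for the real seven-attribute dataset. Only the Python standard library is used.

```python
import math
import random
from collections import defaultdict

def make_toy_data(n_per_class=60, seed=42):
    """Synthetic [pH, temperature] rows; the class profiles are invented."""
    rng = random.Random(seed)
    profiles = {"high": (6.7, 38.0), "medium": (6.4, 45.0), "low": (5.0, 60.0)}
    rows = [([rng.gauss(ph, 0.1), rng.gauss(temp, 2.0)], label)
            for label, (ph, temp) in profiles.items() for _ in range(n_per_class)]
    rng.shuffle(rows)
    return [r for r, _ in rows], [label for _, label in rows]

def min_max_scale(X):
    """Rescale every feature column to the [0, 1] range."""
    lows, highs = [min(c) for c in zip(*X)], [max(c) for c in zip(*X)]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)] for row in X]

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        means = [sum(col) / len(rows) for col in zip(*rows)]
        variances = [sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (len(rows) / len(X), means, variances)
    return model

def predict_nb(model, row):
    """Pick the class with the largest log-posterior under feature independence."""
    def log_posterior(label):
        prior, means, variances = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            for v, m, s2 in zip(row, means, variances))
    return max(model, key=log_posterior)

def precision(y_true, y_pred, label):
    """Share of samples predicted as `label` that truly carry that label."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == label]
    return sum(t == label for t in flagged) / len(flagged) if flagged else 0.0

def assign_clusters(X, centroids):
    """Index of the nearest centroid (Euclidean distance) for each row."""
    return [min(range(len(centroids)), key=lambda j: math.dist(row, centroids[j]))
            for row in X]

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm; returns centroids, cluster labels, and inertia."""
    centroids = [list(r) for r in random.Random(seed).sample(X, k)]
    for _ in range(iters):
        labels = assign_clusters(X, centroids)
        updated = []
        for j in range(k):
            members = [row for row, a in zip(X, labels) if a == j]
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:  # assignments are stable, so stop early
            break
        centroids = updated
    labels = assign_clusters(X, centroids)
    inertia = sum(math.dist(r, centroids[a]) ** 2 for r, a in zip(X, labels))
    return centroids, labels, inertia

def run_demo():
    """Scale, train on 70% of the rows, score on the rest, then scan k = 2..7."""
    X, y = make_toy_data()
    X = min_max_scale(X)
    split = int(0.7 * len(X))
    model = fit_gaussian_nb(X[:split], y[:split])
    y_test = y[split:]
    y_pred = [predict_nb(model, row) for row in X[split:]]
    accuracy = sum(t == p for t, p in zip(y_test, y_pred)) / len(y_test)
    inertias = {k: kmeans(X, k)[2] for k in range(2, 8)}
    return accuracy, precision(y_test, y_pred, "low"), inertias

if __name__ == "__main__":
    accuracy, low_precision, inertias = run_demo()
    print(f"accuracy={accuracy:.3f}  precision(low)={low_precision:.3f}")
    for k, inertia in inertias.items():
        print(f"k={k}: inertia={inertia:.3f}")
```

In a sketch like this, the per-class precision is what supports a "safety filter" reading: it counts how many of the samples flagged as low quality truly are low quality, independently of overall accuracy. The inertia values printed for each k are one possible basis for choosing the number of clusters; the abstract does not state which criterion the study used.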

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



