Modeling Red Wine Quality Based on Physicochemical Tests: A Data Mining Approach
DOI:
https://doi.org/10.55927/fjmr.v1i1.414
Keywords:
wine, random forest, naive bayes, generalized linear model, data mining, descriptive, predictive analytic, prescriptive analytic
Abstract
Classifying red wine quality from physicochemical tests can make quality assessment easier. This study uses the wine quality data set, with 4898 instances, obtained from the UCI Machine Learning Repository. It compares three data mining algorithms for classifying red wine quality: random forest, naive Bayes, and a generalized linear model. The generalized linear model achieved the highest accuracy at 68.75%, followed by random forest at 67.81% and naive Bayes at 61.25%. Of the three algorithms compared, the generalized linear model is therefore the best suited to classifying red wine quality based on its composition.
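The comparison in the abstract rests on a single metric, classification accuracy. The short Python sketch below shows how that metric is computed and how the three reported figures rank. It is illustrative only: the `accuracy` helper, the variable names, and the toy labels are assumptions introduced here, not the authors' code or data. Only the three percentages come from the abstract.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true quality labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and equal in length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Accuracies (in percent) exactly as reported in the abstract.
reported = {
    "generalized linear model": 68.75,
    "random forest": 67.81,
    "naive Bayes": 61.25,
}

# Rank the models from most to least accurate.
ranking = sorted(reported.items(), key=lambda item: item[1], reverse=True)
best_model, best_accuracy = ranking[0]

# Toy quality labels (hypothetical, NOT from the study) to show the metric.
toy_true = [5, 6, 5, 7]
toy_pred = [5, 6, 6, 7]
toy_accuracy = accuracy(toy_true, toy_pred)  # 3 of 4 predictions match

if __name__ == "__main__":
    for name, acc in ranking:
        print(f"{name}: {acc:.2f}%")
    print(f"best model: {best_model}")
    print(f"toy accuracy: {toy_accuracy:.2f}")
```

Accuracy alone says nothing about how errors are spread across quality classes, so a fuller comparison would usually report per-class results as well.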
Copyright (c) 2022 Maeve Eunicia, Richie Skyszygfrid, Tiara Vitri, Vicky Caren

This work is licensed under a Creative Commons Attribution 4.0 International License.