Modeling Red Wine Quality Based on Physicochemical Tests: A Data Mining Approach

Authors

  • Maeve Eunicia, Universitas Pelita Harapan, Kampus Medan
  • Richie Skyszygfrid, Universitas Pelita Harapan, Kampus Medan
  • Tiara Vitri, Universitas Pelita Harapan, Kampus Medan
  • Vicky Caren, Universitas Pelita Harapan, Kampus Medan

DOI:

https://doi.org/10.55927/fjmr.v1i1.414

Keywords:

wine, random forest, naive bayes, generalized linear model, data mining, descriptive, predictive analytic, prescriptive analytic

Abstract

This study classifies red wine quality from its physicochemical composition, with the aim of making quality assessment easier. The data come from the wine quality data set in the UCI Machine Learning Repository, which contains 4898 instances. Three data mining algorithms are compared: random forest, naive Bayes, and a generalized linear model. The generalized linear model achieved the highest accuracy at 68.75%, followed by random forest at 67.81% and naive Bayes at 61.25%. Based on these results, the study identifies the generalized linear model as the most suitable of the three algorithms for classifying red wine quality by composition.
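As an illustration of one of the three algorithms named above, the following is a minimal sketch of a naive Bayes classifier scored by accuracy. It is not the authors' implementation: the Gaussian variant, the two feature names, the "low"/"high" labels, and all data values are assumptions made up for the example.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9
            for c, m in zip(cols, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log-posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            # Log of the Gaussian density, summed over features
            # (the "naive" conditional-independence assumption).
            score += -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, row) == label for row, label in zip(X, y))
    return correct / len(y)


# Hypothetical toy data: [alcohol, volatile acidity] -> quality label.
X_train = [[9.4, 0.70], [9.8, 0.88], [9.5, 0.66],
           [11.2, 0.28], [12.0, 0.32], [11.5, 0.30]]
y_train = ["low", "low", "low", "high", "high", "high"]
X_test = [[9.6, 0.75], [11.8, 0.31]]
y_test = ["low", "high"]

model = fit_gaussian_nb(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
print(f"accuracy: {test_accuracy:.2%}")
```

Running the same `accuracy` function on a held-out split for each candidate model is one way the three reported accuracy figures could be put on a common footing for comparison.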

Published

2022-05-21

How to Cite

Maeve Eunicia, Richie Skyszygfrid, Tiara Vitri, & Vicky Caren. (2022). Modeling Red Wine Quality Based on Physicochemical Tests: A Data Mining Approach. Formosa Journal of Multidisciplinary Research, 1(1), 89–110. https://doi.org/10.55927/fjmr.v1i1.414

Section

Articles