Application of Machine Learning Models in Water Quality Classification in Lake Maninjau: Random Forest as the Optimal Solution
(1) Indonesia
(2) Universitas Negeri Padang, Indonesia
(3) Universitas Islam Negeri Mahmud Yunus Batusangkar, Indonesia
DOI : https://doi.org/10.24036/et.v12i1.128941
Full Text Language: Indonesian (id)
Abstract
This study develops a machine learning model to classify water quality in Lake Maninjau using data from the Ministry of Environment and Forestry's Onlimo application. The dataset includes parameters such as temperature, pH, dissolved oxygen (DO), conductivity, total dissolved solids (TDS), salinity, turbidity, nitrate, and ammonium. Four machine learning algorithms were compared: Logistic Regression, Support Vector Machine (SVM), Gradient Boosting, and Random Forest. Random Forest performed best, with an average accuracy of 87.33% (standard deviation 6.97%) and a test accuracy of 90.63%. The model can support water quality monitoring and management and help authorities make water resource management decisions. The study also illustrates how integrating machine learning with IoT can provide practical solutions for environmental monitoring.
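The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data in place of the Onlimo readings, and assumes three quality classes, an 80/20 train/test split, and 5-fold cross-validation, none of which the abstract specifies. The feature names mirror the listed parameters, but the values are random, so the accuracies will not match those reported.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Feature names follow the abstract; the data below is synthetic (assumption).
FEATURES = ["temperature", "pH", "DO", "conductivity", "TDS",
            "salinity", "turbidity", "nitrate", "ammonium"]

# Synthetic stand-in for the sensor readings: 3 hypothetical quality classes.
X, y = make_classification(n_samples=300, n_features=len(FEATURES),
                           n_informative=6, n_redundant=2,
                           n_classes=3, random_state=0)

# Hold out a test set (80/20 split is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# The four algorithms named in the abstract; scaling is added for the
# two models that are sensitive to feature magnitude.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold stratified cross-validation (fold count is an assumption).
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Average accuracy and standard deviation across the training folds.
    scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    # Refit on the full training set, then score once on the held-out test set.
    model.fit(X_train, y_train)
    results[name] = {
        "cv_mean": scores.mean(),
        "cv_std": scores.std(),
        "test_accuracy": model.score(X_test, y_test),
    }
    print(f"{name}: mean={scores.mean():.4f}, std={scores.std():.4f}, "
          f"test={results[name]['test_accuracy']:.4f}")
```

Reporting both the fold-wise mean with its standard deviation and a separate held-out test score, as the abstract does, shows how stable each model is across resamples as well as how it performs on unseen data.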