Forecasting PM2.5 Concentrations in Nakhon Sawan Province Using Machine Learning, Deep Learning, and Prophet-Based Hybrid Models

Authors

  • Thiraphat Meesumrarn, Department of Computer Science and Information Technology, Faculty of Science and Technology, Nakhon Sawan Rajabhat University
  • Chom Panta, Department of Mathematics and Statistics, Faculty of Science and Technology, Nakhon Sawan Rajabhat University
  • Phassakorn Worra-arj, Department of Computer Science and Information Technology, Faculty of Science and Technology, Nakhon Sawan Rajabhat University
  • Withoon Sonthipak, Department of Computer Science and Information Technology, Faculty of Science and Technology, Nakhon Sawan Rajabhat University

Keywords:

PM2.5, Time-Series Forecasting, Machine Learning, Deep Learning, Prophet-Based Hybrid Models

Abstract

Particulate matter with an aerodynamic diameter of less than 2.5 micrometers (PM2.5) in Nakhon Sawan Province is an important environmental and public health issue. PM2.5 concentrations vary seasonally and are influenced by several air pollution and meteorological factors. This study aimed to compare the performance of baseline models, machine learning models, deep learning models, and Prophet-based hybrid models for daily PM2.5 forecasting in Nakhon Sawan Province. The study used secondary data collected from January 2020 to December 2025, consisting of PM2.5 concentrations and ten air pollution and meteorological variables. The data were divided chronologically into a training set (2020 to 2023), a validation set (2024), and a test set (2025). Forecasting performance was evaluated at 1-, 3-, and 7-day-ahead horizons by comparing the Persistence, Moving Average, Prophet, LSTM, GRU, XGBoost, Prophet-LSTM, Prophet-GRU, and Prophet-XGBoost models. The models were assessed using RMSE, MAE, MAPE, and R², together with the classification of days on which PM2.5 concentrations exceeded 50 µg/m³. The results showed that GRU achieved the best performance for 1- and 3-day-ahead forecasting, with RMSE values of 12.2476 and 18.4001, respectively. In contrast, Prophet achieved the best performance for 7-day-ahead forecasting, with an RMSE of 20.5216. These findings indicate that the most suitable forecasting model may vary with the forecasting horizon. Comparing multiple groups of models provides a clearer understanding of the strengths and limitations of each approach. The findings can support the development of forecasting tools and local PM2.5 monitoring systems.
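The evaluation protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the synthetic seasonal series, and the decision to skip zero-valued observations in MAPE are all assumptions made for illustration. Only the year-based split, the Persistence baseline, the 1-, 3-, and 7-day horizons, and the 50 µg/m³ exceedance threshold come from the abstract.

```python
import math
from datetime import date, timedelta

THRESHOLD = 50.0  # µg/m³, the exceedance level named in the abstract


def chronological_split(records):
    """Split (date, value) pairs by year: 2020-2023 / 2024 / 2025."""
    train = [r for r in records if r[0].year <= 2023]
    valid = [r for r in records if r[0].year == 2024]
    test = [r for r in records if r[0].year == 2025]
    return train, valid, test


def persistence_forecast(values, horizon):
    """Predict y[t + horizon] as y[t]; horizon must be >= 1.

    Returns aligned (actual, predicted) lists.
    """
    return values[horizon:], values[:-horizon]


def regression_metrics(actual, predicted):
    """RMSE, MAE, and MAPE (in percent) for aligned, non-empty lists."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # Assumption: days with a zero observation are skipped, since the
    # percentage error is undefined there.
    ratios = [abs(e / a) for a, e in zip(actual, errors) if a != 0]
    mape = 100.0 * sum(ratios) / len(ratios)
    return {"rmse": rmse, "mae": mae, "mape": mape}


def exceedance_counts(actual, predicted, threshold=THRESHOLD):
    """Confusion counts for days strictly above the threshold."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a > threshold:
            counts["tp" if p > threshold else "fn"] += 1
        else:
            counts["fp" if p > threshold else "tn"] += 1
    return counts


def synthetic_series(start, end):
    """Hypothetical seasonal daily series; NOT the study's data."""
    records = []
    day = start
    while day <= end:
        doy = day.timetuple().tm_yday
        value = 30.0 + 25.0 * math.cos(2.0 * math.pi * doy / 365.0)
        records.append((day, value))
        day += timedelta(days=1)
    return records


if __name__ == "__main__":
    records = synthetic_series(date(2020, 1, 1), date(2025, 12, 31))
    train, valid, test = chronological_split(records)
    test_values = [v for _, v in test]
    for horizon in (1, 3, 7):
        actual, predicted = persistence_forecast(test_values, horizon)
        print(horizon, regression_metrics(actual, predicted),
              exceedance_counts(actual, predicted))
```

For simplicity the sketch applies the Persistence baseline within the test year only, so the first few days of 2025 are not forecast from late-2024 values. Any learned model would be scored with the same metric functions on the same aligned pairs, which is what makes the comparison across horizons meaningful.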

Published

2026-06-30

Section

Research Article