The use of artificial intelligence methods to predict income and expense trends from semi-unstructured data for accounting planning for pineapple farmers in Phitsanulok province
Abstract
This study explores the use of artificial intelligence (AI) techniques, particularly Natural Language Processing (NLP), to predict income and expense trends for pineapple farmers in Ban Yang Subdistrict, Nakhon Thai District, Phitsanulok Province. The research focuses on extracting and analyzing semi-unstructured accounting data, primarily from PDF files, provided by 30 farmers with prior accounting experience. A custom NLP algorithm classified the financial records into payment and income categories, and three time series forecasting models (Prophet, LSTM, and ARIMA) were then applied to predict future trends. Model performance differed markedly. ARIMA performed best, with the lowest Mean Absolute Error (MAE) of 34.1084 and the highest R-squared (R²) of 0.9901, indicating the best prediction accuracy and explanatory power of the three models. LSTM followed with an MAE of 71.0920 and an R² of 0.9511: good accuracy, but weaker than ARIMA on both metrics. Prophet performed poorly, with the highest MAE of 603.8044 and the lowest R² of -3.2273; a negative R² means its forecasts fit the data worse than simply predicting the mean of the observed values. Based on these findings, ARIMA is identified as the most suitable model for this dataset, and further optimization and additional features may improve its predictive accuracy.
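The two metrics reported in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the monthly figures below are hypothetical and chosen only to show how MAE and R² are computed, and how R² becomes negative when a forecast is worse than predicting the mean.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between observed and forecast values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot; negative when the forecast is worse than the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical monthly income figures (illustrative, not taken from the study).
actual = [100.0, 120.0, 140.0, 160.0]
close_forecast = [102.0, 118.0, 141.0, 159.0]  # tracks the series closely
poor_forecast = [160.0, 100.0, 180.0, 90.0]    # misses the trend entirely

print(mean_absolute_error(actual, close_forecast), r_squared(actual, close_forecast))
print(mean_absolute_error(actual, poor_forecast), r_squared(actual, poor_forecast))
```

With these toy numbers the close forecast has an MAE of 1.5 and an R² of about 0.995, while the poor forecast has an MAE of 47.5 and an R² of -4.25, the same qualitative pattern the abstract reports for ARIMA versus Prophet.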
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.