Development of Generative Chatbot towards Thai Rice Knowledge Using Retrieval-Augmented Generation
Keywords:
Generative Chatbot, Retrieval-Augmented Generation, Thai Rice Knowledge, Advisory Conversational System
Abstract
This paper presents the development of a generative chatbot that supports Thai rice farmers by drawing on rice-related knowledge from reliable external sources. The chatbot is built on a lightweight pretrained large language model that is fine-tuned on collected documents related to Thai rice farming. In addition, approved Thai rice information serves as an external knowledge base within a retrieval-augmented generation (RAG) framework, enabling responses to domain-specific and technical questions about Thai rice cultivation. The experimental evaluation compares chatbot systems with and without RAG. The results indicate that the chatbot incorporating RAG outperforms the baseline system in response relevance and factual reliability. In a user-driven evaluation, the RAG-based chatbot produced 91.43% acceptable responses and 8.57% partially acceptable responses, with no unacceptable responses observed. To further assess system reliability, a grounding-based evaluation was conducted using a controlled query set with document-level reference attribution. The proposed system achieves a document-level grounding recall of 0.896 (micro-average) and 0.983 (macro-average), indicating effective use of the external knowledge base. These findings suggest that RAG is an effective approach for developing generative chatbots in domain-specific and localized agricultural advisory contexts, where accuracy, grounding, and trustworthiness of information are essential.
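The abstract reports grounding recall as both a micro-average and a macro-average without defining either. The following is a minimal sketch of how the two averages are commonly computed, assuming each query has a set of reference document IDs and a set of document IDs attributed to the chatbot's response. The function name `grounding_recall`, the data layout, and the toy document IDs are hypothetical illustrations and are not taken from the paper.

```python
from typing import Dict, Set, Tuple


def grounding_recall(
    reference: Dict[str, Set[str]],
    attributed: Dict[str, Set[str]],
) -> Tuple[float, float]:
    """Return (micro, macro) document-level recall.

    reference:  query id -> documents that should ground the answer (assumed layout)
    attributed: query id -> documents the response was actually attributed to
    """
    hits_total = 0          # reference documents recovered, summed over queries
    refs_total = 0          # reference documents expected, summed over queries
    per_query = []          # recall of each individual query

    for query_id, ref_docs in reference.items():
        if not ref_docs:
            continue  # a query with no reference documents has undefined recall
        hits = len(ref_docs & attributed.get(query_id, set()))
        hits_total += hits
        refs_total += len(ref_docs)
        per_query.append(hits / len(ref_docs))

    if not per_query:
        raise ValueError("no query has reference documents")

    micro = hits_total / refs_total            # pool all documents, then divide
    macro = sum(per_query) / len(per_query)    # average the per-query recalls
    return micro, macro


# Toy data (hypothetical): two single-document queries and one four-document query.
reference = {
    "q1": {"doc_a"},
    "q2": {"doc_b"},
    "q3": {"doc_c", "doc_d", "doc_e", "doc_f"},
}
attributed = {
    "q1": {"doc_a"},
    "q2": {"doc_b", "doc_x"},            # extra documents do not affect recall
    "q3": {"doc_c", "doc_d", "doc_e"},   # one reference document is missed
}

micro, macro = grounding_recall(reference, attributed)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

Under these definitions the macro-average can exceed the micro-average when the missed documents are concentrated in a few queries that have many reference documents, which is one possible reading of the gap between the reported 0.983 and 0.896.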
License

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.





