EFFICACY ASSESSMENT OF A RAG-BASED CHATBOT FOR EDUCATIONAL COMPUTER PROGRAMMING
Abstract
This research had three objectives: 1) to develop an educational Python chatbot that combines the open-source large language model Gemma2 9B with Retrieval-Augmented Generation (RAG), giving novice programmers personalized, non-intimidating support with complex abstract concepts; 2) to implement a secure, locally hosted AI system using Ollama and Docker, managed via OpenWebUI, with a context-aware knowledge base tailored to Python programming instruction; and 3) to evaluate the chatbot's performance and efficacy through a multi-dimensional assessment of semantic consistency, response accuracy (BERTScore), and linguistic fluency (BLEU score).
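The retrieve-then-prompt idea behind RAG can be illustrated with a minimal sketch. It is not the study's pipeline: the three-sentence knowledge base, the stop-word list, the bag-of-words cosine ranking, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration, and the step that would send the prompt to a locally hosted model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; the study's actual documents are not
# described in the abstract.
KNOWLEDGE_BASE = [
    "A for loop iterates over the items of a sequence such as a list or a string.",
    "A function is defined with the def keyword and may return a value with return.",
    "A dictionary maps unique keys to values and is written with curly braces.",
]

# Small illustrative stop-word list so that common words do not dominate ranking.
STOP_WORDS = {"a", "an", "the", "is", "of", "to", "in", "do", "i", "how",
              "and", "or", "with", "such", "as"}


def tokenize(text):
    """Lowercase, split on non-alphanumerics, and drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, docs, k=1):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(question, docs, k=1):
    """Prepend the retrieved context to the learner's question."""
    context = "\n".join(retrieve(question, docs, k))
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("How do I define a function in Python?", KNOWLEDGE_BASE)
print(prompt)
```

In a No-RAG baseline the model would receive only the question; here the retrieved passage is added to the prompt before generation, which is the difference the evaluation compares.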
The RAG-based chatbot outperformed the baseline (No-RAG) model on every evaluation dimension. Its Consistency score was 0.96 against the baseline's 0.82, indicating stable and predictable responses. Its BERTScore (Precision: 0.90, Recall: 0.88, F1-score: 0.89) exceeded the baseline's F1-score of 0.76, showing that it captured the relevant educational content more reliably. Its BLEU score of 0.29 improved on the baseline's 0.21 but remained low, pointing to limitations in generating nuanced, human-like responses. Overall, integrating external knowledge via RAG offers a reliable and accurate approach for decentralized educational tools, although further prompt-engineering optimization is recommended to improve natural language flow. The evaluation used a benchmark dataset of 50 Python programming questions, so the findings should be interpreted within the scope of that dataset.
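The reported F1-score can be checked as the harmonic mean of the reported precision and recall. The sketch below also shows the clipped n-gram precision that underlies BLEU. The sentence pair and the function names `f1_score` and `ngram_precision` are hypothetical and not taken from the study's benchmark, and the full BLEU computation (geometric mean over n-gram orders plus a brevity penalty) is omitted.

```python
from collections import Counter


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision: the share of candidate n-grams that also
    occur in the reference, counting each reference n-gram at most as
    often as it appears there."""
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Values reported in the abstract for the RAG-based chatbot.
print(round(f1_score(0.90, 0.88), 2))

# Hypothetical sentence pair: a paraphrase with similar meaning.
reference = "a list stores items in order"
candidate = "a list keeps elements in order"
print(ngram_precision(candidate, reference, 1))
print(ngram_precision(candidate, reference, 2))
```

Because this kind of metric counts exact word sequences, a paraphrase loses overlap as n grows even when its meaning is close to the reference, which is one property of n-gram metrics to keep in mind when reading a BLEU score alongside an embedding-based score such as BERTScore.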

Licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Published articles are the copyright of the Journal of Science and Technology, Southeast Bangkok University.