Predictive Analysis of Academic Achievement in Information Studies: A Comparative Study Using Educational Data Mining Techniques
Abstract
Predicting students’ academic achievement at an early stage helps in designing effective training programs that raise success rates. Extracting knowledge from student data is a fundamental aspect of Educational Data Mining (EDM). This study analyzes the factors that predict the outcomes of graduates from the Information Studies program. The results can help improve student performance and inform the construction of a better curriculum. Five classification models are applied to a dataset to categorize students into four target classes. The features are grouped into three types: demographic information, course grades, and early-stage GPA. Class imbalance in the dataset is addressed by applying the Synthetic Minority Over-sampling Technique (SMOTE). The findings indicate that early-stage GPA (90.5%) is the most significant predictor, particularly when the Naive Bayes classifier is applied to a balanced dataset. In contrast, demographic information (58.0%) and core course grades (87.5%) show lower predictive influence. The findings support the development of learning strategies and improvements to curriculum design aimed at better final academic outcomes.
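The oversampling step named in the abstract can be illustrated with a minimal sketch. SMOTE creates synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The code below is a simplified, standard-library-only variant that picks base samples at random; it is not the authors' implementation, and the function name `smote`, its parameters, and the toy student records are hypothetical and chosen only for illustration.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (Euclidean distance).

    Simplified sketch: base samples are drawn at random rather than
    oversampling every minority sample by a fixed rate.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: [early-stage GPA, core course grade] per student.
majority = [[3.1, 3.0], [3.4, 3.2], [2.9, 3.1], [3.3, 3.5], [3.0, 2.8], [3.6, 3.4]]
minority = [[1.8, 2.0], [2.0, 1.7], [1.6, 1.9]]

# Add synthetic minority samples until both classes have the same size.
balanced_minority = minority + smote(minority, len(majority) - len(minority))
```

In this sketch the balanced minority set has the same number of records as the majority set, and every synthetic record lies on a line segment between two real minority records. When a classifier is evaluated, oversampling is usually applied to the training split only, so that synthetic records do not leak into the test data.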
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.