Dealing with Imbalanced Data for Forecasting Model of Higher Education Selection of Grade 9 Students in Muang District Nakhonsawan Province

Chidchanok Sookwongtanon
Satriwit Chaleekorn
Rungruttikarn Moungmai

Abstract

Big data is increasingly used because most organisations need to review their past performance and to identify influential factors, both strengths and weaknesses, for problem solving, operational planning, organisational development, and future prediction. One problem with big data is class imbalance, which reduces predictive accuracy. This study therefore aims to build a suitable multinomial logistic regression model and to compare predictive accuracy between models fitted to imbalanced data and models fitted to data balanced by random over-sampling and random under-sampling. A real dataset on the higher education selection of 2,692 grade 9 students in Nakhonsawan was used. The main factor influencing students' decisions was found to be demographic. Moreover, the forecasting models achieved accuracy above 60%, and the models fitted to balanced data predicted more accurately than the model fitted to imbalanced data.
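
As an illustration of the two balancing methods named in the abstract, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names, the toy data, and the class labels ("general", "vocational", "other") are hypothetical and chosen only to show how random over-sampling and random under-sampling change class counts before a classifier is fitted.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(rows, labels):
    """Group feature rows by their class label."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[label].append(row)
    return groups


def random_over_sample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_under_sample(rows, labels, seed=0):
    """Randomly keep rows so that every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Sample without replacement, discarding the remaining rows.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical toy data: three study choices with unequal counts (6, 3, 1).
labels = ["general"] * 6 + ["vocational"] * 3 + ["other"] * 1
rows = [[i] for i in range(len(labels))]

over_rows, over_labels = random_over_sample(rows, labels)
under_rows, under_labels = random_under_sample(rows, labels)
print(Counter(over_labels))
print(Counter(under_labels))
```

With this toy data, over-sampling yields six rows per class and under-sampling yields one row per class. Over-sampling keeps every original row but repeats minority rows, while under-sampling discards majority rows, so the two methods trade off duplicated information against lost information.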

How to Cite
Sookwongtanon, C., Chaleekorn, S., & Moungmai, R. (2021). Dealing with Imbalanced Data for Forecasting Model of Higher Education Selection of Grade 9 Students in Muang District Nakhonsawan Province. Rattanakosin Journal of Science and Technology, 2(3), 73–88. Retrieved from https://ph02.tci-thaijo.org/index.php/RJST/article/view/243479
Section
Research Articles
