Multi-class imbalanced data problem solving for predicting internet service cancellation
Abstract
Churn is a common problem across many industries, especially in the telecommunications sector. To reduce customer churn rates and compete effectively, many telecom companies recognize the need to improve customer experience and service, so understanding the reasons behind customer churn is essential. In this research, we apply a multi-class machine learning classification technique to predict customer churn in the internet service provider industry. We propose a two-fold approach to the class imbalance problem: 1) handling imbalanced data at the dataset level by using the SMOTE technique to generate additional synthetic data, and 2) addressing imbalanced data at the learning-process level by adjusting the data weights using the XGBoost technique. The resulting XGBoost-based classifier accurately predicts both the majority classes and the minority classes. The combination of SMOTE and XGBoost is effective in solving the imbalance problem for multi-class datasets.
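The two levels described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the dataset, the SMOTE settings, or the exact weighting scheme, so the toy data, the function names `smote_like_oversample` and `balanced_sample_weights`, the neighbour count `k`, and the inverse-frequency weighting are all assumptions made for illustration. The sketch uses only the Python standard library; in practice one would more likely use the SMOTE implementation in imbalanced-learn and pass per-sample weights to an XGBoost classifier.

```python
import math
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Oversample every non-majority class up to the majority class size.

    Each synthetic point lies on the segment between a real sample and one
    of its k nearest same-class neighbours (the core idea behind SMOTE).
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)
    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # no same-class neighbour to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest same-class neighbours of the chosen sample
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: math.dist(base, m),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def balanced_sample_weights(y):
    """Weight each sample by n_samples / (n_classes * class_count).

    Assumed weighting scheme; rarer classes receive larger weights.
    """
    counts = Counter(y)
    n, c = len(y), len(counts)
    return [n / (c * counts[label]) for label in y]


# Toy three-class data (invented): 6, 3, and 2 samples per class.
X = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.1], [0.2, 0.3],
     [5.0, 5.0], [5.2, 5.1], [5.1, 4.9],
     [9.0, 0.0], [9.1, 0.3]]
y = [0] * 6 + [1] * 3 + [2] * 2

# Dataset level: synthetic minority samples until all classes match.
X_res, y_res = smote_like_oversample(X, y)

# Learning-process level: per-sample weights from the original labels.
weights = balanced_sample_weights(y)
```

After oversampling, each of the three toy classes has 6 samples. Weights of this kind could be supplied through the `sample_weight` argument of `XGBClassifier.fit`; whether the weights should be computed before or after resampling is a design choice the abstract leaves open.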
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.