Developing Multi-Label Classification Model for Improving Text Categorizing Problems: A Case of Traffy* Fondue Platform
Abstract
This paper develops an integrated deep-learning-based model for multi-label text classification to improve the efficiency of Traffy* Fondue, a comprehensive platform for handling diverse citizen complaints across the Bangkok metropolitan area. The Traffy* Fondue dataset contains inaccurately categorized labels, especially in the 'Others' category, which delays coordination and prevents complaints from being resolved on time. To address this problem, five main methods are applied to model text classification: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), bidirectional LSTM (Bi-LSTM), fastText, and WangchanBERTa. The developed models are tested against traditional algorithms. On the 'Others' dataset, the Bi-LSTM-, CNN+Bi-LSTM-, and WangchanBERTa-based approaches are the three best models, outperforming the others with higher precision, recall, and F1 score. The approach offers a promising way to expedite issue resolution and improve coordination within Bangkok's civic management framework.
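To make the evaluation criteria concrete, the following is a minimal sketch of how precision, recall, and F1 score can be computed for multi-label predictions. It assumes micro-averaging over all (complaint, label) pairs; the abstract does not state which averaging scheme the study uses. The function name `micro_prf` and the category names are hypothetical and are not taken from the Traffy* Fondue dataset.

```python
from typing import Sequence, Set, Tuple


def micro_prf(
    true_sets: Sequence[Set[str]], pred_sets: Sequence[Set[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 for multi-label output.

    Each element of true_sets / pred_sets is the set of labels attached
    to one complaint. Counts are pooled over all complaints before the
    ratios are taken (micro-averaging, an assumption of this sketch).
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but wrong
        fn += len(truth - pred)   # labels missed by the model
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical complaints with illustrative category names.
y_true = [{"road", "flooding"}, {"lighting"}, {"sidewalk", "road"}]
y_pred = [{"road"}, {"lighting", "safety"}, {"sidewalk", "road"}]

# Pooled counts here: 4 true positives, 1 false positive, 1 false negative.
precision, recall, f1 = micro_prf(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Micro-averaging weights frequent categories more heavily; a macro average (computing the scores per label, then averaging) would instead give rare categories equal weight, which may matter for an imbalanced complaint dataset.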
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.