Developing Multi-Label Classification Model for Improving Text Categorizing Problems a Case of Traffy* Fondue Platform

Authors

  • Phopthorn Kaewvichit, Data Analytics and Data Science, National Institute of Development Administration (NIDA), Bangkok 10240, Thailand
  • Akkaranan Pongsathornwiwat, Logistics Management, National Institute of Development Administration (NIDA), Bangkok 10240, Thailand

Keywords:

Deep learning, Multi-label classification, Natural language processing, Public complaint data, Text categorization

Abstract

This paper develops an integrated deep-learning-based model for multi-label text classification to improve the efficiency of Traffy* Fondue, a platform that collects citizens' complaints on a wide range of issues across the Bangkok metropolitan area. The Traffy* Fondue dataset contains inaccurately categorized labels, especially in the 'Others' category, which delays the coordination and problem-solving needed to address complaints on time. To address this problem, five main methods are applied to text classification: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bi-LSTM, fastText, and WangchanBERTa. The developed models are tested against traditional algorithms. On the 'Others' dataset, the Bi-LSTM-, CNN+Bi-LSTM-, and WangchanBERTa-based approaches are the three best models, outperforming the others with higher precision, recall, and F1 score. The approach offers a promising way to expedite issue resolution and improve coordination within Bangkok's civic management framework.
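As a rough illustration of how multi-label predictions can be scored with precision, recall, and F1, the sketch below thresholds per-label scores and computes micro-averaged metrics. It is not the authors' implementation: the label names, the scores, the 0.5 threshold, and the choice of micro averaging are all assumptions made for the example, since the abstract does not specify them.

```python
from typing import Dict, List, Set, Tuple


def decide_labels(scores: Dict[str, float], threshold: float = 0.5) -> Set[str]:
    """Keep every label whose score reaches the threshold.

    In a multi-label setting each label is decided independently,
    so one complaint can receive zero, one, or several labels.
    """
    return {label for label, score in scores.items() if score >= threshold}


def micro_prf(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over all (document, label) pairs."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # predicted and correct
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but wrong
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # missed gold labels
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up per-label scores for three complaints; the category names
# ("road", "flooding", "lighting") are hypothetical, not the platform's label set.
model_scores = [
    {"road": 0.9, "flooding": 0.7, "lighting": 0.1},
    {"road": 0.2, "flooding": 0.4, "lighting": 0.8},
    {"road": 0.6, "flooding": 0.1, "lighting": 0.3},
]
gold_labels = [{"road", "flooding"}, {"lighting"}, {"flooding"}]

predicted = [decide_labels(s) for s in model_scores]
precision, recall, f1 = micro_prf(gold_labels, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Micro averaging pools every (complaint, label) decision before computing the ratios, so frequent categories weigh more; macro averaging, which averages per-label scores, would be the usual alternative when rare categories matter equally.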

Published

2024-06-25

How to Cite

Phopthorn Kaewvichit, & Akkaranan Pongsathornwiwat. (2024). Developing Multi-Label Classification Model for Improving Text Categorizing Problems a Case of Traffy* Fondue Platform. Science & Technology Asia, 29(2), 138–147. Retrieved from https://ph02.tci-thaijo.org/index.php/SciTechAsia/article/view/254682
