Developing Multi-Label Classification Model for Improving Text Categorizing Problems a Case of Traffy* Fondue Platform

Phopthorn Kaewvichit; Akkaranan Pongsathornwiwat

Published: Jun 25, 2024

Keywords:

Deep learning; Multi-label classification; Natural language processing; Public complaint data; Text categorization

Phopthorn Kaewvichit

Data Analytics and Data Science, National Institute of Development Administration (NIDA), Bangkok 10240, Thailand

Akkaranan Pongsathornwiwat

Logistics Management, National Institute of Development Administration (NIDA), Bangkok 10240, Thailand

Abstract

This paper develops an integrated deep-learning-based model for multi-label text classification to improve the efficiency of Traffy* Fondue, a platform that collects citizens' complaints on diverse issues across the Bangkok metropolitan area. The Traffy* Fondue dataset suffers from inaccurate category labels, especially in the 'Others' category, which delays coordination and keeps complaints from being resolved on time. To overcome this problem, five main methods are applied to the text classification task: Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bi-LSTM, fastText, and WangchanBERTa. The developed models are tested against traditional algorithms. On the 'Others' dataset, the Bi-LSTM-, CNN+Bi-LSTM-, and WangchanBERTa-based approaches are the three best models, outperforming the others with higher precision, recall, and F1 score. Our approach offers a promising way to expedite issue resolution and improve coordination within Bangkok's civic management framework.
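The abstract compares models by precision, recall, and F1 score on a multi-label task, where each complaint may carry several categories at once. The following is a minimal sketch of how such scores can be computed for multi-label predictions. It assumes micro-averaging, which the abstract does not specify, and the function name `micro_prf` and the category names are hypothetical and are not taken from the Traffy* Fondue dataset.

```python
from typing import Iterable, Set, Tuple


def micro_prf(true_labels: Iterable[Set[str]],
              pred_labels: Iterable[Set[str]]) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 over multi-label predictions.

    Assumption: micro-averaging (counts pooled over all samples and labels);
    the paper's abstract does not say which averaging scheme it uses.
    """
    tp = fp = fn = 0
    for truth, pred in zip(true_labels, pred_labels):
        tp += len(truth & pred)   # labels predicted and correct
        fp += len(pred - truth)   # labels predicted but not in the truth
        fn += len(truth - pred)   # true labels the model missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical complaint categories, for illustration only.
y_true = [{"road", "flooding"}, {"lighting"}, {"sidewalk", "road"}]
y_pred = [{"road"}, {"lighting", "road"}, {"sidewalk", "road"}]

precision, recall, f1 = micro_prf(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Representing each complaint's labels as a set keeps the comparison independent of label order; a macro-averaged variant would instead compute the three scores per category and then average them.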

How to Cite

Phopthorn Kaewvichit, & Akkaranan Pongsathornwiwat. (2024). Developing Multi-Label Classification Model for Improving Text Categorizing Problems a Case of Traffy* Fondue Platform. Science & Technology Asia, 29(2), 138–147. Retrieved from https://ph02.tci-thaijo.org/index.php/SciTechAsia/article/view/254682

Issue

Vol.29 No.2 (April-June 2024)

Section

Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

