COMPARISON OF THAI WORD SEGMENTATION ALGORITHMS FOR SENTIMENT ANALYSIS OF TIKTOK APPLICATION USERS
Research Article
Abstract
This research had two purposes: 1) to compare the performance of three Thai word segmentation algorithms in the PyThaiNLP library, namely Maximum Matching (MM), New Maximum Matching (NewMM), and Longest Matching (Longest), each combined with stopword removal using either the PyThaiNLP stopword list or a custom stopword list defined by the researcher; and 2) to evaluate the performance of four models, Naïve Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), and Long Short-Term Memory (LSTM), in classifying the sentiment of TikTok user reviews. Accuracy, precision, recall, F1-score, and training time served as the performance indicators. The dataset consisted of 5,519 user reviews from the Google Play Store. The results showed that 1) the NewMM segmentation algorithm achieved the best results, with an accuracy of 83.52% and an F1-score of 83.06%; 2) Naïve Bayes was the best-performing model, with an accuracy of 83.73% and an F1-score of 83.43%; and 3) leaving stopwords in place always gave the best performance. These findings demonstrate the importance of choosing an appropriate word segmentation algorithm and stopword treatment to maximize the performance of sentiment analysis on Thai text.
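To make the preprocessing idea concrete, the following is a minimal sketch of dictionary-based longest matching followed by stopword removal. It is not PyThaiNLP's implementation and is not taken from the study: the function names, the five-word dictionary, and the one-word stopword list are all hypothetical toy values chosen for illustration, and the dictionary is assumed to be non-empty.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching against a word list.

    Illustrative only; assumes `dictionary` is a non-empty set of strings.
    """
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                break
        else:
            # No dictionary word starts here: fall back to one character.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


def remove_stopwords(tokens, stopwords):
    """Drop every token that appears in the stopword set."""
    return [token for token in tokens if token not in stopwords]


# Hypothetical toy resources, not the lists used in the study.
TOY_DICTIONARY = {"แอป", "นี้", "ดี", "มาก", "ดีมาก"}
TOY_STOPWORDS = {"นี้"}

tokens = longest_match_segment("แอปนี้ดีมาก", TOY_DICTIONARY)
filtered = remove_stopwords(tokens, TOY_STOPWORDS)
print(tokens)    # ['แอป', 'นี้', 'ดีมาก']
print(filtered)  # ['แอป', 'ดีมาก']
```

Because the matcher is greedy, it prefers "ดีมาก" over the shorter "ดี" at the same position. This shows in miniature how the choice of segmentation strategy changes the tokens a classifier sees, and how stopword removal then shrinks that token list further.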
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This article is published and copyrighted by the Science and Technology Journal, South East Bangkok College.