Thai Celebrity Information Extraction Based on Association Rule Measures

Chinorot Wangtragulsang; Nattakarn Phaphoom; Phannachet Na Lamphun; Pisit Charnkeitkong; Jian Qu

Published: Dec 26, 2019

Keywords:

Information extraction, Personal information extraction, unstructured data, weight estimation

Chinorot Wangtragulsang

Panyapiwat Institute of Management

Nattakarn Phaphoom

Panyapiwat Institute of Management

Phannachet Na Lamphun

Panyapiwat Institute of Management

Pisit Charnkeitkong

Panyapiwat Institute of Management

Jian Qu

Panyapiwat Institute of Management

Abstract

This study develops a system that automatically extracts and selects celebrity information from websites. Traditionally, such information is gathered and selected by hand, which is time-consuming and often fails to stay up to date because of the large number of celebrities in Thailand and the potential ambiguity and conflict between information sources on the Internet. The study proposes a novel method that uses pattern matching and association rules to extract the date of birth, height, and weight of celebrities from websites. In addition, a weight estimation system based on height and body mass index (BMI) is developed. The results show that the system obtains more celebrity information than many Thai websites, such as MThai.com, and that the weight estimation system can estimate a celebrity's weight from height and BMI.
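The abstract does not state how the weight estimate is computed. As a rough illustration only, the sketch below inverts the standard BMI definition (BMI = weight in kg divided by the square of height in metres). The function names, the default BMI of 22.0, and the 18.5 to 24.9 range are assumptions made for this example and are not taken from the paper.

```python
def estimate_weight_kg(height_cm: float, bmi: float = 22.0) -> float:
    """Estimate weight by inverting BMI = weight_kg / height_m**2.

    The default bmi=22.0 is an illustrative assumption, not a value
    reported by the authors.
    """
    if height_cm <= 0 or bmi <= 0:
        raise ValueError("height_cm and bmi must be positive")
    height_m = height_cm / 100.0  # convert centimetres to metres
    return bmi * height_m ** 2


def estimate_weight_range_kg(
    height_cm: float, bmi_low: float = 18.5, bmi_high: float = 24.9
) -> tuple[float, float]:
    """Return a (low, high) weight range for an assumed BMI interval.

    The 18.5 to 24.9 interval is the commonly used "normal" BMI band;
    whether the paper uses it is unknown.
    """
    return (
        estimate_weight_kg(height_cm, bmi_low),
        estimate_weight_kg(height_cm, bmi_high),
    )


if __name__ == "__main__":
    print(round(estimate_weight_kg(170), 2))
    low, high = estimate_weight_range_kg(160)
    print(round(low, 2), round(high, 2))
```

Returning a range rather than a single value may be more honest when only height is known, since any single BMI choice is a guess about the individual.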

How to Cite

Wangtragulsang, C., Phaphoom, N., Na Lamphun, P., Charnkeitkong, P., & Qu, J. (2019). Thai Celebrity Information Extraction Based on Association Rule Measures. INTERNATIONAL SCIENTIFIC JOURNAL OF ENGINEERING AND TECHNOLOGY (ISJET), 3(2), 42–50. Retrieved from https://ph02.tci-thaijo.org/index.php/isjet/article/view/179415

Issue

Vol. 3 No. 2 (2019): July-December

Section

Research Article
