Comparison of Keywords Extraction Techniques in Kickstarter and Indiegogo Projects

Woottikarn Hongwiengchan
Jian Qu

Abstract

Many fake projects appear on Kickstarter and Indiegogo, and they are usually hard to distinguish from genuine ones. This pioneering study looks for a way to help people identify potentially fake projects. We propose extracting keywords from each project so that the extracted keywords give the user a better understanding of it. We compared keyword extraction on crowdfunding projects using the RAKE, NLTK, LIAAD/YAKE, BERT, and Gensim models, and measured each model's extraction performance with precision, recall, and F1 score. According to the results, the NLTK model is the most effective, with a precision of 54.40% and an F1 score of 70.47%.
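The three metrics named above can be illustrated with a minimal Python sketch. It assumes exact, case-insensitive matching between extracted keywords and a reference keyword list, which may differ from the matching rule used in the study. The function names and the sample keywords are hypothetical and are not taken from the paper.

```python
def normalize(keywords):
    """Lowercase and strip keywords so matching ignores case and padding."""
    return {k.strip().lower() for k in keywords if k.strip()}


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def score_keywords(extracted, reference):
    """Return (precision, recall, f1) for exact-match keyword sets."""
    extracted_set = normalize(extracted)
    reference_set = normalize(reference)
    hits = len(extracted_set & reference_set)
    # Precision: share of extracted keywords that are in the reference list.
    precision = hits / len(extracted_set) if extracted_set else 0.0
    # Recall: share of reference keywords that the model recovered.
    recall = hits / len(reference_set) if reference_set else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical example data, not taken from the study.
extracted = ["smart watch", "battery", "waterproof", "free shipping"]
reference = ["Smart Watch", "battery"]
precision, recall, f1 = score_keywords(extracted, reference)
print(f"precision={precision:.2%} recall={recall:.2%} f1={f1:.2%}")
```

Under this definition, a precision of 54.40% combined with a recall close to 100% gives an F1 score of about 70.47%, which agrees with the figures reported above. The abstract does not state the recall, so this is only an observation about the arithmetic.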

How to Cite
Hongwiengchan, W., & Qu, J. (2023). Comparison of Keywords Extraction Techniques in Kickstarter and Indiegogo Projects. INTERNATIONAL SCIENTIFIC JOURNAL OF ENGINEERING AND TECHNOLOGY (ISJET), 7(1), 41–47. Retrieved from https://ph02.tci-thaijo.org/index.php/isjet/article/view/245512
Section
Research Article
