Semantic Search for Information System Domain Bibliographic Data
Abstract
Keyword-based search of digital information often returns irrelevant results that do not reflect the user's requirements. Researchers have developed ontology-based semantic search to improve the efficiency of information retrieval. However, one problem with semantic search is that ontology development depends on expert knowledge. To address this problem, this research proposes creating subclasses with a semi-automatic technique and finding relations between classes with association rules. The proposed method consists of four steps: 1) feature selection using Chi-Square and a Support Vector Machine with a radial basis function kernel (SVMR), which reduces the number of phrases and passes the resulting statistics to subclass creation to support semi-automatic ontology construction; 2) semantic term mapping using association rules; 3) development of the Semantic Search for Information System Domain Bibliographic Data system; and 4) evaluation of the developed information retrieval system. The proposed method achieved a precision of 97.07%.
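Steps 1 and 2 of the method can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus, the class labels "IR" and "DB", the function names, and the thresholds are all assumptions made for illustration, and the SVM classification stage is left out. The sketch scores each term against a class with the Chi-Square statistic, keeps the high-scoring terms, and then derives association rules (support and confidence) between the retained terms.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: (terms found in an abstract, class label).
DOCS = [
    ({"ontology", "semantic", "search"}, "IR"),
    ({"ontology", "semantic", "query"}, "IR"),
    ({"semantic", "search", "retrieval"}, "IR"),
    ({"database", "transaction", "query"}, "DB"),
    ({"database", "index", "transaction"}, "DB"),
    ({"database", "query", "index"}, "DB"),
]


def chi_square(term, label, docs):
    """Chi-Square statistic of the 2x2 term/class contingency table."""
    n = len(docs)
    both = sum(1 for terms, c in docs if term in terms and c == label)
    term_only = sum(1 for terms, c in docs if term in terms and c != label)
    class_only = sum(1 for terms, c in docs if term not in terms and c == label)
    neither = n - both - term_only - class_only
    denom = ((both + term_only) * (class_only + neither)
             * (both + class_only) * (term_only + neither))
    if denom == 0:
        return 0.0
    return n * (both * neither - term_only * class_only) ** 2 / denom


def select_terms(docs, label, threshold):
    """Keep the terms whose Chi-Square score reaches the threshold."""
    vocabulary = set().union(*(terms for terms, _ in docs))
    return {t for t in vocabulary if chi_square(t, label, docs) >= threshold}


def association_rules(docs, terms, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) rules between single terms."""
    n = len(docs)
    baskets = [doc_terms & terms for doc_terms, _ in docs]
    single = Counter(t for basket in baskets for t in basket)
    pairs = Counter(p for basket in baskets
                    for p in combinations(sorted(basket), 2))
    rules = []
    for (x, y), count in pairs.items():
        support = count / n
        if support < min_support:
            continue
        for lhs, rhs in ((x, y), (y, x)):
            confidence = count / single[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


# Thresholds below are illustrative assumptions, not values from the paper.
selected = select_terms(DOCS, "IR", threshold=2.0)
rules = association_rules(DOCS, selected, min_support=0.3, min_confidence=1.0)
for lhs, rhs, support, confidence in rules:
    print(f"{lhs} -> {rhs} (support={support:.2f}, confidence={confidence:.2f})")
```

In this sketch a rule such as "ontology -> semantic" could be read as a candidate relation between two ontology classes, which a person would then confirm or reject. That review step is what makes the construction semi-automatic rather than fully automatic.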
Article Details
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.