Web Translation of English Medical OOV Terms to Chinese with Data Mining Approach

Jian Qu
Thanaruk Theeramunkong
Cholwich Nattee
Pakinee Aimmanee

Abstract

Translating out-of-vocabulary (OOV) terms from one language to another has always been difficult, especially when the target language, such as Chinese, has no clear word boundaries. Some English medical OOV terms are translated into Chinese imperfectly, and most of these cases involve translations that contain non-Chinese characters. Because all existing approaches for translating English OOV terms into Chinese process the snippets retrieved from the Internet by extracting only Chinese characters, the non-Chinese part of a Chinese translation is lost. We propose a rule-based method that examines each input English OOV term and automatically adjusts the candidate generation system accordingly. Unlike most existing OOV term translation approaches, we use a data mining approach, rather than the commonly used statistics-based approach, to select the final translation candidate. Tested on two of the most difficult English medical OOV term sets, ICD9-CM and ICD9, our approach outperforms the existing approach with a precision of 89.98% and is able to handle Chinese translations that include non-Chinese characters.
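
The loss of non-Chinese characters described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regular expressions, and the sample snippet are assumptions made up for illustration, and the character range covers only the basic CJK Unified Ideographs block.

```python
import re

# Assumption: "Chinese character" is approximated by the basic
# CJK Unified Ideographs block (U+4E00 to U+9FFF).
CJK_RUN = re.compile(r"[\u4e00-\u9fff]+")
# Runs that may mix Chinese characters with ASCII letters and digits.
MIXED_RUN = re.compile(r"[\u4e00-\u9fffA-Za-z0-9]+")


def chinese_only_candidates(snippet):
    """Extract runs made purely of Chinese characters (hypothetical baseline)."""
    return CJK_RUN.findall(snippet)


def mixed_candidates(snippet):
    """Extract runs that contain at least one Chinese character,
    keeping any embedded letters or digits (hypothetical alternative)."""
    return [run for run in MIXED_RUN.findall(snippet) if CJK_RUN.search(run)]


# Hypothetical web snippet for the term "vitamin B12 deficiency".
snippet = "维生素B12缺乏症 (vitamin B12 deficiency) 是一种常见疾病"

print(chinese_only_candidates(snippet))  # "B12" is dropped and the term is split
print(mixed_candidates(snippet))         # the term keeps its "B12" part
```

In this sketch, Chinese-only extraction splits the translation into two fragments and discards "B12", while the mixed extraction keeps the whole candidate. How the proposed method actually generates and ranks candidates is not specified in the abstract.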

Keywords: out of vocabulary (OOV), association measures, symmetric conditional probability (SCP), decision tree.

Article Details

How to Cite
Qu, J., Theeramunkong, T., Nattee, C., & Aimmanee, P. (2015). Web Translation of English Medical OOV Terms to Chinese with Data Mining Approach. Science & Technology Asia, 16(2), 26–40. Retrieved from https://ph02.tci-thaijo.org/index.php/SciTechAsia/article/view/41217