A Performance Comparison of Convolutional Neural Networks for Classifying Khom Thai (Khmer-Thai) Characters in Ancient Buddhist Scriptures

Sornchai Laksanapeeti; Sukumal Kitisin

doi:10.37936/ectiard.2024-4-1.253100

Published: Apr 30, 2024

Keywords: Deep Learning; Convolutional Neural Network; Optical Character Recognition; Thai-Khmer Character; Buddhist scriptures

Sornchai Laksanapeeti

Kasetsart University

Sukumal Kitisin

Kasetsart University

Abstract

In Thailand, many significant Buddhist scriptures, such as the Tripitaka, are preserved on paper or palm leaves inscribed with Khmer-Thai characters and are at risk of deterioration. Accurate transcriptions of these scriptures are crucial for studying the historical evidence of Buddhism in Thailand, yet experts still face challenges in deciphering and transcribing Khmer-Thai characters into Thai script. This research aims to support that process by developing and comparing deep learning models based on Convolutional Neural Networks (CNNs) for Khmer-Thai character recognition. The study uses a learning framework comprising 10 distinct CNN models, including ResNet50 and ResNet50V2, with datasets drawn from handwritten samples from the Dhammakayadi scriptures and Khmer-Thai alphabet textbooks. The datasets cover 96 Khmer-Thai characters and are divided into training, validation, and test sets. Training is conducted over 10 epochs. Among the 10 models, the most accurate reaches 79.25% accuracy with a training time of 250.72 seconds. In comparison, the model using the ResNet50 architecture achieves 77.88% accuracy and requires 1,514.13 seconds of training. These findings suggest that the CNN structure developed in this study is a promising candidate for further refinement as a model for recognizing Khmer-Thai characters.
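The accuracy and training-time figures reported in the abstract can be put side by side with a short calculation. The following is a minimal Python sketch, not the authors' evaluation code: the `RunResult` class, the `compare` function, and the label "best model in the study" are hypothetical names introduced here for illustration, and only the four numeric values come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """One model's reported result (hypothetical container, not from the paper)."""
    name: str             # illustrative label; the abstract does not name the best model
    accuracy_pct: float   # accuracy in percent, as reported in the abstract
    train_seconds: float  # training time in seconds, as reported in the abstract


def compare(a: RunResult, b: RunResult) -> dict:
    """Return the accuracy gap (a minus b, in percentage points) and how many
    times longer b took to train than a."""
    return {
        "accuracy_gap_points": round(a.accuracy_pct - b.accuracy_pct, 2),
        "training_time_ratio": round(b.train_seconds / a.train_seconds, 2),
    }


# Values taken from the abstract; labels are assumptions for readability.
best_model = RunResult("best model in the study", 79.25, 250.72)
resnet50 = RunResult("ResNet50", 77.88, 1514.13)

summary = compare(best_model, resnet50)
print(summary)
```

Under these figures, the most accurate model leads ResNet50 by 1.37 percentage points while ResNet50 takes roughly six times as long to train. The abstract does not state the hardware or the dataset split sizes, so the ratio describes only the reported runs.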

Issue

Vol. 4 No. 1 (2024): Jan - April 2024

Section

Research Articles

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
