The data classification using Genetic Algorithm with k-length chromosome

Siraphong Chokthanahirun
Supoj Hengpraprohm

Abstract

This research develops a classifier for two-class data based on a Genetic Algorithm (GA) with a k-length chromosome and compares its accuracy with that of well-known baseline methods. First, the data are normalized using the z-score technique. Next, the GA selects k informative features. Finally, the values of the k selected features are summed for each sample and the sum is compared with 0: if the sum is greater than 0, the sample is assigned to class 1; otherwise, it is assigned to class 2. Nine benchmark datasets are used to test the proposed method, grouped by dimensionality: three with a small number of features (fewer than 100), three with a medium number (100 to 1,000), and three with a large number (more than 1,000). Using 10-fold cross-validation, the proposed GA-based method is compared with K-Nearest Neighbor (KNN), Decision Tree, and Naïve Bayes classifiers. The experimental results show that the proposed method achieves accuracy comparable to that of KNN, Decision Tree, and Naïve Bayes, which supports the aim of this research.
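The pipeline described in the abstract (z-score normalization, GA-based selection of k features, and a sum-versus-zero decision rule) can be illustrated with a minimal Python sketch. Only the three steps themselves come from the abstract. Everything else is an assumption made for illustration: the function names, the chromosome encoding as a list of k distinct feature indices, training accuracy as the fitness function, truncation selection, the union-based crossover, the mutation rate, and the population settings. The sketch also normalizes the whole dataset at once, whereas a cross-validated experiment would normally fit the normalization on the training folds only.

```python
import random
import statistics


def zscore_columns(rows):
    """Standardize each feature column to mean 0 and standard deviation 1."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # Assumption: constant columns keep a divisor of 1.0 to avoid dividing by zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


def predict(row, chromosome):
    """Sum the selected features; a sum above 0 gives class 1, otherwise class 2."""
    return 1 if sum(row[i] for i in chromosome) > 0 else 2


def accuracy(rows, labels, chromosome):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(r, chromosome) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def evolve(rows, labels, k, pop_size=30, generations=40, mutation_rate=0.2, seed=0):
    """Search for k feature indices that maximize training accuracy.

    Hypothetical GA settings: the abstract does not specify the operators.
    """
    rng = random.Random(seed)
    n_features = len(rows[0])
    # Each chromosome is a list of k distinct feature indices.
    population = [rng.sample(range(n_features), k) for _ in range(pop_size)]

    def fitness(chromosome):
        return accuracy(rows, labels, chromosome)

    for _ in range(generations):
        # Truncation selection: the fitter half survives unchanged (elitism).
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Crossover: draw k distinct indices from the union of both parents.
            pool = list(dict.fromkeys(a + b))
            child = rng.sample(pool, k)
            # Mutation: swap one selected index for a currently unused one.
            if rng.random() < mutation_rate:
                unused = [f for f in range(n_features) if f not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        population = survivors + children
    return max(population, key=fitness)
```

A call such as `evolve(zscore_columns(rows), labels, k=5)` would return one candidate set of five feature indices, which `predict` can then apply to new, identically normalized samples. Because the decision rule is a plain sum with a fixed threshold of 0, the GA only has to search over which features to include, which keeps the chromosome short even for datasets with thousands of features.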

Article Details

How to Cite
Chokthanahirun, S., & Hengpraprohm, S. (2016). The data classification using Genetic Algorithm with k-length chromosome. Interdisciplinary Research Review, 11(1), 43–49. Retrieved from https://ph02.tci-thaijo.org/index.php/jtir/article/view/52339
Section
Research Articles