The data classification using Genetic Algorithm with k-length chromosome
Abstract
This research develops a classifier based on a Genetic Algorithm (GA) with a k-length chromosome for two-class data classification and compares its accuracy with that of well-known baseline methods. First, the data are normalized using the z-score technique. Next, the GA selects k informative features. Finally, the values of the k selected features are summed for each sample and the sum is compared with 0: if the sum is greater than 0, the sample is assigned to class 1; otherwise, it is assigned to class 2. Nine benchmark datasets are used to test the proposed method: three with a small number of features (fewer than 100), three with a medium number of features (100 to 1,000), and three with a large number of features (more than 1,000). A 10-fold cross-validation procedure is used to compare the proposed GA-based method with the K-Nearest Neighbor (KNN), Decision Tree, and Naïve Bayes classifiers. The experimental results show that the proposed GA-based method achieves accuracy comparable to that of KNN, Decision Tree, and Naïve Bayes, which supports the aim of this research.
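The three steps described in the abstract (z-score normalization, GA-based selection of k features, and the sign-of-sum decision rule) can be illustrated with a minimal sketch. The abstract does not specify the GA operators or the fitness function, so everything inside `evolve` is an assumption made for illustration: fitness is taken to be training accuracy, selection is simple truncation, crossover samples k genes from the union of two parents, and the mutation rate is arbitrary. The use of the population standard deviation in the z-score is also an assumption. All function names are hypothetical.

```python
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Assumption: population std; constant columns fall back to 1.0.
    sds = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]


def predict(row, chromosome):
    """Sum the selected features: class 1 if the sum is > 0, else class 2."""
    return 1 if sum(row[i] for i in chromosome) > 0 else 2


def accuracy(rows, labels, chromosome):
    """Fraction of samples whose predicted class matches the label."""
    hits = sum(predict(r, chromosome) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def evolve(rows, labels, k, pop_size=20, generations=30, seed=0):
    """Toy GA (assumed operators): a chromosome is k distinct feature indices."""
    rng = random.Random(seed)
    n = len(rows[0])
    pop = [rng.sample(range(n), k) for _ in range(pop_size)]

    def fitness(chromosome):
        return accuracy(rows, labels, chromosome)

    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection (assumed)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = list(dict.fromkeys(a + b))  # union of both parents' genes
            child = rng.sample(pool, k)        # crossover keeps length k
            if rng.random() < 0.2:             # mutation rate (assumed)
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(k)] = rng.choice(unused)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

With a normalized dataset `z` and labels in `{1, 2}`, `evolve(z, labels, k=5)` would return a list of five feature indices, and `predict(z[0], chromosome)` would apply the decision rule to the first sample. In a 10-fold cross-validation setting, the normalization statistics and the chromosome would normally be fitted on the training folds only.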