摘要

Feature selection is a process of selecting optimal features that produce the most prognostic outcome. It is one of the essential steps in knowledge discovery. The crisis is that not all features are important. Most of the features may be redundant, and the rest may be irrelevant and noisy. This paper presents a novel feature selection approach to deal with issues of high dimensionality in the medical dataset. Medical datasets are habitually classified by a large number of measurements and a comparatively small number of patient records. Most of these measurements are irrelevant or noisy. This paper proposes a supervised feature selection method based on Rough Set Quick Reduct hybridized with Improved Harmony Search algorithm. Rough set theory is one of the most thriving methods used for feature selection. The Rough Set Improved Harmony Search Quick Reduct (RS-IHS-QR) algorithm is a relatively new population-based meta-heuristic optimization algorithm. This approach imitates the music improvisation process, where each musician improvises their instrument's pitch by searching for a perfect state of harmony. The quality of the reduced data is measured by the classification performance. The proposed algorithm is experimentally compared with the existing algorithms Rough Set Quick Reduct (RS-QR) and Rough Set Particle Swarm Optimization Quick Reduct (RS-PSO-QR). The number of features selected by the proposed method is comparatively low. The proposed algorithm reveals more than 90 % classification accuracy in most of the cases and the time taken to reduct the dataset also decreased than the existing methods. The experimental result demonstrates the efficiency and effectiveness of the proposed algorithm.

  • 出版日期2015-11