A Cosine-Similarity Mutual-Information Approach for Feature Selection on High Dimensional Datasets

Authors: Dubey Vimal Kumar; Saxena Amit Kumar
Source: Journal of Information Technology Research, 2017, 10(1): 15-28.
DOI: 10.4018/JITR.2017010102

Abstract

A novel hybrid method based on cosine similarity and mutual information is presented for finding a relevant feature subset. First, the supervised cosine similarity of each feature is calculated with respect to the class vector, and the features are then grouped by the resulting cosine similarity values. From each group, the feature with the highest mutual information is selected. The selected feature subset is evaluated for classification accuracy using three classifiers: Naïve Bayes (NB), K-Nearest Neighbor (KNN), and Classification and Regression Trees (CART). The proposed method is applied to various high-dimensional datasets. The results show that the proposed method is capable of eliminating redundant and irrelevant features.
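
The three steps in the abstract (score each feature by cosine similarity with the class vector, group features by those scores, keep the most informative feature per group) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the abstract does not define "supervised" cosine similarity or the grouping rule, so the sketch assumes plain cosine similarity against the label vector, equal-width binning of the similarity values into a hypothetical `n_groups` parameter, and discrete (or pre-discretized) feature values so that mutual information with the class can be estimated from counts. All function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Plain cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mutual_information(x, y):
    """Count-based mutual information (in nats) between two discrete sequences."""
    n = len(x)
    count_x, count_y, count_xy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log(c * n / (count_x[a] * count_y[b]))
        for (a, b), c in count_xy.items()
    )


def select_features(columns, labels, n_groups=3):
    """Return indices of one feature per cosine-similarity group.

    Assumption: groups are equal-width bins over the observed similarity
    range; the abstract does not specify the grouping rule.
    """
    sims = [cosine_similarity(col, labels) for col in columns]
    lo, hi = min(sims), max(sims)
    width = (hi - lo) / n_groups or 1.0  # avoid division by zero if all equal

    groups = {}
    for j, s in enumerate(sims):
        g = min(int((s - lo) / width), n_groups - 1)
        groups.setdefault(g, []).append(j)

    # From each group, keep the feature with the highest MI with the class.
    selected = [
        max(members, key=lambda j: mutual_information(columns[j], labels))
        for members in groups.values()
    ]
    return sorted(selected)


# Toy data: each inner list is one feature column over four samples.
labels = [1, 1, 2, 2]
columns = [
    [1, 1, 2, 2],  # matches the class vector
    [2, 2, 4, 4],  # scaled copy of feature 0 (redundant)
    [3, 1, 1, 1],  # weakly informative
    [2, 2, 1, 1],  # informative, but lower cosine similarity
]
selected = select_features(columns, labels, n_groups=2)
print(selected)  # → [0, 3]
```

In this toy run, features 0 and 1 fall into the same similarity group, so only one of them survives, which illustrates how grouping can remove redundant features while the per-group mutual-information choice discards the weakly informative feature 2.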

  • Publication date: 2017-3