Abstract

Feature selection methods play a significant role in the classification of data with high-dimensional feature spaces: they select the most relevant subset of features that describes the data appropriately. Mutual information (MI), grounded in information theory, is one of the metrics used for measuring the relevance of features. This paper analyses various feature selection methods with respect to (1) the reduction in the number of features and (2) the performance of a Naive Bayes classification model trained on the reduced feature set. Two research gaps are identified: (1) MI is computed over the whole sample space rather than over the subspace of still-unclassified samples; (2) existing methods consider only the relevance of features, or a tradeoff between relevance and redundancy, while ignoring class-conditional interaction among features. In this paper, we propose a general MI-based evaluation function for feature selection and implement it using MI values computed dynamically from the unclassified instances. The effectiveness of the proposed feature selection method is evaluated empirically by comparing classification results on the KDD 1999 benchmark intrusion detection dataset. The results indicate the practicability and effectiveness of the proposed method for applications that demand high accuracy and stability of predictions.
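To illustrate the idea the abstract describes, the Python sketch below (not from the paper) computes empirical MI between discrete features and the class, and greedily selects features while recomputing MI only over the instances that the already-selected features do not yet separate by class. This is one plausible reading of the "unclassified sample subspace" notion; the function names, the separability criterion, and the synthetic data are all assumptions made for illustration, not the paper's actual evaluation function.

    import numpy as np
    from collections import Counter

    def mutual_information(x, y):
        # Empirical MI I(X; Y) in nats for two discrete 1-D arrays.
        n = len(x)
        px, py = Counter(x), Counter(y)
        pxy = Counter(zip(x, y))
        return sum((c / n) * np.log(c * n / (px[xv] * py[yv]))
                   for (xv, yv), c in pxy.items())

    def dynamic_mi_selection(X, y, k):
        # Greedy selection: at each step pick the feature with maximal MI
        # measured only on the still-"unclassified" instances, then drop
        # instances that the chosen feature already separates by class.
        # (Hypothetical reconstruction; the paper's criterion may differ.)
        remaining = np.arange(len(y))
        selected, candidates = [], list(range(X.shape[1]))
        while len(selected) < k and remaining.size > 0 and candidates:
            scores = {j: mutual_information(X[remaining, j], y[remaining])
                      for j in candidates}
            best = max(scores, key=scores.get)
            selected.append(best)
            candidates.remove(best)
            vals = X[remaining, best]
            keep = []
            for v in np.unique(vals):
                idx = remaining[vals == v]
                if len(set(y[idx])) > 1:  # still ambiguous -> keep for next round
                    keep.extend(idx)
            remaining = np.array(keep, dtype=int)
        return selected

    # Toy usage: the class depends on features 0 and 2 only,
    # so those should be ranked early.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 3, size=(200, 6))
    y = (X[:, 0] + X[:, 2]) % 2
    print(dynamic_mi_selection(X, y, k=3))

Recomputing MI on the shrinking set of ambiguous instances, rather than once on the whole sample space, is what distinguishes this dynamic scheme from a static one-shot MI ranking, which matches the first research gap noted above.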

  • Publication date: 2012-2