Abstract

In this research, a hybrid model, the case-based fuzzy decision tree (CBFDT), is developed by integrating a case-based data clustering method with a fuzzy decision tree (FDT) for medical data classification. Two datasets from the UCI Machine Learning Repository, the Liver Disorders dataset and Breast Cancer Wisconsin (Diagnosis), are used as benchmarks. First, a case-based clustering method is applied to preprocess the dataset so that the data within each cluster are more homogeneous. A fuzzy decision tree is then applied to the data in each cluster, and genetic algorithms (GAs) are further applied to construct a decision-making system based on the selected features and the diseases identified. Finally, a set of fuzzy decision rules is generated for each cluster. As a result, the FDT model can respond accurately to test data using the rules induced by the case-based fuzzy decision tree. The average forecasting accuracy of the CBFDT model is 98.4% for breast cancer and 81.6% for liver disorders, the highest among the models compared. The hybrid model produces decision rules that are both accurate and comprehensible, which could help medical doctors draw effective conclusions in medical diagnosis.
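
The cluster-then-classify structure described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: each case is assigned to the nearest of a few hypothetical prototype cases (a stand-in for the case-based clustering), a crisp one-feature threshold rule is fitted per cluster (a stand-in for the fuzzy decision tree), the genetic algorithm step is omitted, and the toy data are invented for illustration rather than taken from the UCI datasets. All function names are hypothetical.

```python
from math import dist

def assign_cluster(x, prototypes):
    """Return the index of the prototype case closest to x (Euclidean distance)."""
    return min(range(len(prototypes)), key=lambda i: dist(x, prototypes[i]))

def fit_stump(rows, labels):
    """Fit a crisp rule (f, t, lo, hi): predict hi if x[f] > t, else lo.

    This is a simplified stand-in for a fuzzy decision tree.
    """
    classes = sorted(set(labels))
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            for lo in classes:
                for hi in classes:
                    correct = sum(
                        (hi if r[f] > t else lo) == y
                        for r, y in zip(rows, labels)
                    )
                    if best is None or correct > best[0]:
                        best = (correct, f, t, lo, hi)
    return best[1:]

def fit_hybrid(rows, labels, prototypes):
    """Group the training cases by cluster, then fit one rule per cluster."""
    groups = {}
    for r, y in zip(rows, labels):
        groups.setdefault(assign_cluster(r, prototypes), []).append((r, y))
    return {
        c: fit_stump([r for r, _ in g], [y for _, y in g])
        for c, g in groups.items()
    }

def predict(x, prototypes, rules):
    """Route x to its cluster and apply that cluster's rule.

    Assumes the cluster received at least one training case.
    """
    f, t, lo, hi = rules[assign_cluster(x, prototypes)]
    return hi if x[f] > t else lo

# Invented toy data: two clusters in which the same feature points
# toward opposite classes, so no single global threshold rule fits both.
prototypes = [(1.0, 1.0), (8.0, 8.0)]
rows = [(0.5, 1.0), (1.5, 1.2), (1.0, 0.2), (1.2, 2.0),
        (7.5, 8.0), (8.5, 9.0), (8.0, 7.0), (9.0, 8.5)]
labels = [0, 1, 0, 1, 1, 0, 1, 0]

rules = fit_hybrid(rows, labels, prototypes)
print(predict((1.4, 1.0), prototypes, rules))  # 1
print(predict((8.8, 8.0), prototypes, rules))  # 0
```

In this toy setting a high value of the first feature indicates class 1 in one cluster and class 0 in the other, which is the kind of within-cluster homogeneity that a per-cluster rule set can exploit.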

  • Publication date: 2011-1