Abstract

The purpose of this study is to apply non-medical methods to classify two types of diffuse large B-cell lymphoma (DLBCL): the germinal-center B-cell type (GCB) and the activated B-cell type (ABC). The study materials are microRNAs (miRNAs) acquired from DLBCL patients. To achieve this goal, a statistical method (linguistic analysis) and an engineering method (ensembled artificial neural networks, EANN) were used independently to perform qualitative and quantitative analysis, respectively. On this basis, a novel noise-elimination enhanced algorithm, namely ensembled linguistic analysis, is proposed to improve the efficiency of linguistic analysis. According to the results, the resulting phylogenetic tree performs better than the initial linguistic analysis. In addition, an EANN model was established to perform the classification quantitatively, and sensitivity analysis (SA) of the EANN was carried out to rank the miRNAs by significance and select the five most important ones. Classical linear and logistic regression models were also developed for comparison with the EANN classification results; the regression results were evidently worse than those of the EANN model. This study shows that each lymphoma type has a distinctive miRNA expression pattern and that the miRNA expression pattern of ABC is closer to white noise than that of GCB. Both linguistic analysis and the EANN model achieved accurate results; however, the EANN model performed much better at classification. The five selected miRNAs will be helpful for further study.