A comparative study of cell classifiers for image-based high-throughput screening

Authors: Abbas Syed Saiden*; Dijkstra Tjeerd M H; Heskes Tom
Source: BMC Bioinformatics, 2014, 15(1): 342.
DOI: 10.1186/1471-2105-15-342

Abstract

Background: Millions of cells are present in thousands of images created in high-throughput screening (HTS). Biologists could classify each of these cells into a phenotype by visual inspection. But in the presence of millions of cells this visual classification task becomes infeasible. Biologists train classification models on a few thousand visually classified example cells and iteratively improve the training data by visual inspection of the important misclassified phenotypes. Classification methods differ in performance and performance evaluation time. We present a comparative study of computational performance of gentle boosting, joint boosting CellProfiler Analyst (CPA), support vector machines (linear and radial basis function) and linear discriminant analysis (LDA) on two data sets of HT29 and HeLa cancer cells.

Results: For the HT29 data set we find that gentle boosting, SVM (linear) and SVM (RBF) are close in performance but SVM (linear) is faster than gentle boosting and SVM (RBF). For the HT29 data set the average performance difference between SVM (RBF) and SVM (linear) is 0.42%. For the HeLa data set we find that SVM (RBF) outperforms other classification methods and is on average 1.41% better in performance than SVM (linear).

Conclusions: Our study proposes SVM (linear) for iterative improvement of the training data and SVM (RBF) for the final classifier to classify all unlabeled cells in the whole data set.

  • Publication date: 2014-10-21