Modeling and Benchmark Data Set for the inhibition of c-Jun N-terminal Kinase-3

Authors: Schattel Verena; Hinselmann Georg; Jahn Andreas; Zell Andreas; Laufer Stefan*
Source: Journal of Chemical Information and Modeling, 2011, 51(3): 670-679.
DOI: 10.1021/ci100410h

Abstract

The goal of this paper is to present and describe a novel 2D- and 3D-QSAR (quantitative structure-activity relationship) binary classification data set for the inhibition of c-Jun N-terminal kinase-3 (JNK3), with previously unpublished activities for a diverse set of compounds. JNK3 is an important pharmaceutical target because it is involved in many neurological disorders. Accordingly, the development of JNK3 inhibitors has gained increasing interest. 2D and 3D versions of the data set were used, consisting of 313 (70 actives) and 249 (60 actives) compounds, respectively. All compounds for which activity was determined only for the racemate were removed from the 3D data set. We investigated the diversity of the data sets by agglomerative clustering with feature trees and show that the data set contains several different scaffolds. Furthermore, we show that the benchmarks can be tackled with standard supervised learning algorithms with convincing performance. For the 2D problem, a random decision forest classifier achieves a Matthews correlation coefficient of 0.744; the 3D problem could be modeled with a Matthews correlation coefficient of 0.524 using 3D pharmacophores and a support vector machine. Performance on both data sets was evaluated within a nested 10-fold cross-validation. We therefore suggest that the data set is a reasonable basis for generating QSAR models for JNK3 because of its diverse composition and the performance of the classifiers presented in this study.
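The Matthews correlation coefficient reported above is a useful metric for imbalanced binary data such as this set (70 actives among 313 compounds in the 2D version), because a classifier that ignores the minority class scores zero no matter how high its accuracy is. The following is a minimal Python sketch of the standard formula, computed from confusion-matrix counts. The function name and all counts are hypothetical illustrations; they are not results or code from the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when the denominator is zero (a common convention for the
    case where one row or column of the confusion matrix is empty).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a set of 313 compounds with 70 actives.
# A classifier that labels every compound inactive is about 78% accurate
# (243 of 313 correct), yet its MCC is 0.
always_inactive = matthews_corrcoef(tp=0, tn=243, fp=0, fn=70)

# A made-up classifier that recovers 50 of the 70 actives with 10 false alarms.
example = matthews_corrcoef(tp=50, tn=233, fp=10, fn=20)

print(f"always inactive: MCC = {always_inactive:.3f}")
print(f"example classifier: MCC = {example:.3f}")
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is the scale on which the reported values of 0.744 and 0.524 should be read.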

  • Publication date: 2011-3
  • Affiliation: Shanghai Center for Bioinformation Technology