Abstract

Principal component regression (PCR) and principal component-artificial neural network (PC-ANN) models were applied to predict the acidity constants of 242 benzoic acids and phenols in water at 25 °C. A large number of theoretical descriptors were calculated for each molecule. The first fifty principal components (PCs) explained more than 95% of the variance in the original data matrix. From this pool of PCs, the eigenvalue ranking method was used to select the best set of PCs for the PCR and PC-ANN models. The PC-ANN model, with a 47-20-1 architecture, takes 47 principal components as inputs and gives pKa as its output. To evaluate the predictive power of the two models, the pKa values of the 37 compounds in the prediction set were calculated. The mean percentage deviation (MPD) was 18.45 for the PCR model and 0.6448 for the PC-ANN model. This improvement is attributed to the non-linear correlation between the pKa values of the compounds and the principal components. Comparison of the results shows that the PC-ANN model is superior to the PCR model.
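
The PCR workflow summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The data are synthetic stand-ins for the descriptor matrix, and the function names (`fit_pcr`, `predict_pcr`, `mean_percentage_deviation`), the 60-descriptor width, and the 205/37 split are all assumptions made for illustration. The abstract does not give the MPD formula, so it is assumed here to be the mean absolute relative error in percent. Taking the leading components in order of decreasing eigenvalue is one reading of "eigenvalue ranking".

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Principal component regression: PCA via SVD, then least squares on the scores."""
    x_mean = X.mean(axis=0)
    Xc = X - x_mean
    # Rows of Vt are the principal axes, already ordered by decreasing singular value,
    # which is equivalent to ordering by decreasing eigenvalue of the covariance matrix.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)            # fraction of variance per component
    axes = Vt[:n_components].T                 # keep the leading components
    scores = Xc @ axes
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"mean": x_mean, "axes": axes, "coef": coef, "explained": explained}

def predict_pcr(model, X):
    """Project new samples onto the stored axes and apply the linear model."""
    scores = (X - model["mean"]) @ model["axes"]
    return model["coef"][0] + scores @ model["coef"][1:]

def mean_percentage_deviation(y_true, y_pred):
    """Assumed definition: mean absolute relative error, in percent."""
    return 100.0 * np.mean(np.abs((y_pred - y_true) / y_true))

# Synthetic stand-in data (NOT the paper's descriptors or pKa values).
rng = np.random.default_rng(0)
X = rng.normal(size=(242, 60))
y = 7.0 + X[:, :5] @ np.array([0.5, -0.3, 0.2, 0.1, -0.4]) + rng.normal(scale=0.05, size=242)

X_train, y_train = X[:205], y[:205]            # calibration compounds
X_test, y_test = X[205:], y[205:]              # 37 held-out compounds

model = fit_pcr(X_train, y_train, n_components=47)
# Number of leading components needed to explain at least 95% of the variance.
n95 = int(np.searchsorted(np.cumsum(model["explained"]), 0.95) + 1)
mpd = mean_percentage_deviation(y_test, predict_pcr(model, X_test))
print(f"components for 95% variance: {n95}, MPD on held-out set: {mpd:.3f}")
```

In a PC-ANN variant, the least-squares step would be replaced by a small feed-forward network trained on the same scores, which lets the model capture a non-linear dependence of pKa on the components.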

  • Publication date: 2008-5