Abstract

To predict the activity of HIV protease inhibitors, 462 constitutional and topological descriptors were calculated to characterize the structural and physicochemical properties of each molecule studied. The training and test sets were designed using the Kennard-Stone method and a random method. Variables were selected with a Monte Carlo simulated annealing method. Four machine learning methods, namely support vector machine (SVM), artificial neural network, logistic regression, and k-nearest neighbor, were used to develop the inhibitor models. The SVM method outperformed the other methods, and the final SVM model gave a prediction accuracy of 98.24%.
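
For readers unfamiliar with the Kennard-Stone method, the following is a minimal sketch of the standard algorithm, not the implementation used in this work. It assumes Euclidean distance on the descriptor vectors; the function name `kennard_stone` and the toy one-dimensional data are hypothetical and chosen only for illustration. The algorithm starts from the two most distant samples and then repeatedly adds the sample that lies farthest from its nearest already-selected neighbor, so that the training set covers the descriptor space evenly.

```python
import math


def kennard_stone(X, n_train):
    """Return indices of n_train rows of X chosen by Kennard-Stone.

    Illustrative sketch: assumes Euclidean distance between rows.
    """
    n = len(X)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(X)")

    # Pairwise Euclidean distances between descriptor vectors.
    dist = [[math.dist(a, b) for b in X] for a in X]

    # Seed the selection with the two most distant samples.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda p: dist[p[0]][p[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # min_d[k]: distance from sample k to its nearest selected sample.
    min_d = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}

    while len(selected) < n_train:
        # Add the sample farthest from everything selected so far.
        nxt = max(remaining, key=lambda k: min_d[k])
        selected.append(nxt)
        remaining.remove(nxt)
        del min_d[nxt]
        for k in remaining:
            min_d[k] = min(min_d[k], dist[k][nxt])
    return selected


# Toy descriptor matrix (hypothetical values, one molecule per row).
X = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [5.0, 0.0], [4.9, 0.0]]
train_idx = kennard_stone(X, 3)
test_idx = [k for k in range(len(X)) if k not in train_idx]
```

In this toy example the two extreme points are chosen first and the mid-range point next, while the near-duplicate and the point close to the origin fall into the test set. A random split, by contrast, gives no such coverage guarantee.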