Abstract

A multivariate study of flavonoids with activity against human colon carcinoma Caco-2 cells is reported. The data set comprises 26 flavonoids that can be separated, according to their EC50 value (the half-maximal effective concentration), into active and less active compounds. Our purpose was to transform the chemical structure of each compound into a set of numerical descriptors and to correlate these with the biological activity, establishing a qualitative/quantitative relationship between calculated molecular descriptors and antiproliferative activity. The geometries of the studied flavonoids were fully optimized using Density Functional Theory (DFT) with the B3LYP hybrid functional and the 6-31G(d) basis set. For each optimized structure we computed a set of 1,356 descriptors that characterize the molecule from structural, topological, steric, electronic, and hydrophobic points of view, among others. Using Fisher's weight and the correlation matrix, we reduced the number of descriptors from 1,356 to 15.
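The descriptor reduction step can be illustrated with a minimal sketch. It assumes a common two-class form of Fisher's weight (squared difference of class means divided by the sum of class variances) and a greedy filter that drops any descriptor too strongly correlated with one already kept. The abstract does not give the exact formula or the correlation cutoff, so the function names, the `r_max` threshold, and the toy values below are all hypothetical.

```python
from statistics import mean, variance


def fisher_weight(active, inactive):
    """Assumed two-class Fisher weight: squared mean gap over summed variances."""
    gap = mean(active) - mean(inactive)
    spread = variance(active) + variance(inactive)
    return gap * gap / spread if spread > 0 else 0.0


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_descriptors(columns, labels, r_max=0.9, keep=15):
    """Rank descriptors by Fisher weight, then skip near-duplicates.

    columns: dict mapping descriptor name -> list of values (one per compound)
    labels:  list of bool, True for active compounds
    r_max:   hypothetical correlation cutoff, not taken from the abstract
    """
    weights = {}
    for name, values in columns.items():
        act = [v for v, is_active in zip(values, labels) if is_active]
        ina = [v for v, is_active in zip(values, labels) if not is_active]
        weights[name] = fisher_weight(act, ina)

    chosen = []
    for name in sorted(weights, key=weights.get, reverse=True):
        # Keep a descriptor only if it is not redundant with those already kept.
        if all(abs(pearson(columns[name], columns[c])) < r_max for c in chosen):
            chosen.append(name)
        if len(chosen) == keep:
            break
    return chosen, weights


# Toy data (invented for illustration; not values from the study).
columns = {
    "d1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # separates the classes well
    "d2": [2.0, 2.6, 1.7, 6.0, 6.3, 5.6],  # noisier near-copy of d1
    "d3": [0.5, 0.1, 0.9, 0.4, 0.2, 0.8],  # carries almost no class signal
}
labels = [True, True, True, False, False, False]
chosen, weights = select_descriptors(columns, labels, r_max=0.9, keep=2)
print(chosen)
```

In this toy case `d2` has a high Fisher weight but is dropped because it is almost perfectly correlated with `d1`. Ranking alone is therefore not enough, which is why a correlation check is paired with it.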
To investigate which molecular descriptors classify flavonoid compounds most efficiently according to their degree of anticancer activity, we applied two unsupervised learning methods, Principal Component Analysis (PCA) and Cluster Analysis (CA), and one supervised learning method, Stepwise Discriminant Analysis (SDA). With the 15 selected descriptors as input, PCA and SDA identified the following common relevant descriptors: C-16, EN, Mor08e, and HATS8m. The reliability of the structure-activity relationship model was verified by cross-validation, which yielded 92.31% correct classifications.
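The cross-validation check can be sketched as follows. The abstract does not say which cross-validation scheme was used, so leave-one-out is an assumption here. A simple nearest-centroid classifier stands in for SDA and is not the authors' method. All names and data values are hypothetical.

```python
from statistics import mean


def centroid(rows):
    """Column-wise mean of a list of descriptor vectors."""
    return [mean(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train_rows, train_labels, row):
    """Assign the class whose training centroid is closest (stand-in for SDA)."""
    classes = sorted(set(train_labels))
    centroids = {
        c: centroid([r for r, lab in zip(train_rows, train_labels) if lab == c])
        for c in classes
    }
    return min(classes, key=lambda c: sq_dist(centroids[c], row))


def leave_one_out_accuracy(rows, labels):
    """Hold out each compound once, train on the rest, return percent correct."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if predict(train_rows, train_labels, rows[i]) == labels[i]:
            hits += 1
    return 100.0 * hits / len(rows)


# Toy data (invented): two descriptors per compound; the last compound is
# labelled "less active" but sits among the active ones, so it is misclassified.
rows = [[1.0, 0.2], [1.1, 0.1], [0.9, 0.3], [3.0, 1.0], [3.2, 1.1], [1.2, 0.2]]
labels = ["active", "active", "active",
          "less active", "less active", "less active"]
accuracy = leave_one_out_accuracy(rows, labels)
print(f"{accuracy:.2f}% correct")
```

As a point of arithmetic, 24 correct classifications out of 26 compounds would give 92.31%, although the abstract does not state this breakdown. In practice descriptors on different scales would normally be standardized before any distance-based classification.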

  • Publication date: 2014