Abstract

Chemical reactions always involve several molecules of two types: reactants and products. Existing data mining techniques, e.g. Quantitative Structure-Activity Relationship (QSAR) methods, deal with individual molecules only. In this article, we propose to use a Condensed Graph of Reaction (CGR) to merge all molecules involved in a reaction into one molecular graph. This allows one to treat reactions as pseudo-molecules and to develop QSAR models based on fragment descriptors. ISIDA (In SIlico Design and Analysis) fragment descriptors built from CGRs are then used to generate models for the rate constant of SN2 reactions in water, using three common attribute-value regression algorithms (linear regression, support vector machine, and regression trees). This approach compares favorably with two state-of-the-art relational data mining techniques.
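
As a rough illustration of the CGR idea described above, the following minimal sketch merges the bonds of reactants and products into one edge table, assuming every atom already carries an atom-map number shared by both sides of the reaction. The function names (condensed_graph, classify) and the toy SN2 example are hypothetical and are not taken from the ISIDA software; each edge simply records the bond order before and after the reaction.

```python
# Minimal sketch of a condensed graph of reaction (CGR).
# Assumption: bonds are given as {(atom_map_i, atom_map_j): bond_order},
# with the same atom-map numbers used for reactants and products.

def condensed_graph(reactant_bonds, product_bonds):
    """Merge two bond tables into one table of (order_before, order_after)."""
    def normalize(bonds):
        # Store each atom pair in sorted order so (2, 1) and (1, 2) match.
        return {tuple(sorted(pair)): order for pair, order in bonds.items()}

    before = normalize(reactant_bonds)
    after = normalize(product_bonds)
    cgr = {}
    for pair in sorted(set(before) | set(after)):
        # A missing bond is recorded as order 0.
        cgr[pair] = (before.get(pair, 0), after.get(pair, 0))
    return cgr


def classify(edge):
    """Label an edge by how its bond order changes during the reaction."""
    order_before, order_after = edge
    if order_before == order_after:
        return "unchanged"
    if order_before == 0:
        return "formed"
    if order_after == 0:
        return "broken"
    return "order changed"


# Toy SN2 example: CH3Cl + OH- -> CH3OH + Cl-
# Atom maps (hypothetical): 1 = C, 2 = Cl, 3 = O, 4 = H of the hydroxide.
# Hydrogens on carbon are left implicit.
reactants = {(1, 2): 1, (3, 4): 1}   # C-Cl and O-H
products = {(1, 3): 1, (3, 4): 1}    # C-O and O-H

cgr = condensed_graph(reactants, products)
for pair, edge in cgr.items():
    print(pair, edge, classify(edge))
```

In this representation the whole reaction is a single graph whose edges carry a "before/after" label, which is what lets fragment descriptors be counted on it in the same way as on an ordinary molecule.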

  • Publication date: 2011-4