摘要

Fingerprints, bit representations of compound chemical structure, have been widely used in cheminformatics for many years. Although fingerprints with the highest resolution display satisfactory performance in virtual screening campaigns, the presence of a relatively high number of irrelevant bits introduces noise into data and makes their application more time-consuming. In this study, we present a new method of hybrid reduced fingerprint construction, the Average Information Content Maximization algorithm (AIC-MAX ALGORITHM), which selects the most informative bits from a collection of fingerprints. This methodology, applied to the ligands of five cognate serotonin receptors (5-HT2A, 5-HT2B, 5-HT2C, 5-HT5A, 5-HT6), proved that 100 bits selected from four non-hashed fingerprints reflect almost all structural information required for a successful in silico discrimination test. A classification experiment indicated that a reduced representation is able to achieve even slightly better performance than the state-of-the-art 10-times-longer fingerprints and in a significantly shorter time.

  • 出版日期2016-1-19