Abstract

Base sequence autocorrelation (BSA) descriptors were used to represent the structures of oligonucleotides and to develop accurate quantitative structure-retention relationship (QSRR) models for oligonucleotides in ion-pair reversed-phase high-performance liquid chromatography. By combining multiple linear regression (MLR) with a genetic algorithm (GA), QSRR models were developed at 30 °C, 40 °C, 50 °C, 60 °C and 80 °C. The single-temperature models (STMs) gave satisfactory results. A multi-temperature model (MTM) was also developed that can be used to predict the retention time at any temperature. For the test set, the correlation coefficients of the MTM retention time predictions at 30 °C, 40 °C, 50 °C, 60 °C and 80 °C were 0.978, 0.982, 0.989, 0.988 and 0.996, respectively. The corresponding absolute average relative deviations (AARD) for the test set were below 1% at every temperature. This new strategy of feature representation and multi-temperature modeling is a promising tool for QSRR modeling, with good ability to predict the retention times of oligonucleotides at multiple temperatures under the studied conditions.
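
The two evaluation metrics quoted above, the correlation coefficient and the AARD, can be illustrated with a minimal sketch. It assumes the usual definitions: the Pearson correlation between observed and predicted retention times, and AARD as the mean of |predicted - observed| / |observed| expressed in percent. The function names and the sample retention times are hypothetical and are not taken from the study.

```python
import math


def aard_percent(observed, predicted):
    """Absolute average relative deviation, in percent (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


# Hypothetical retention times in minutes, for illustration only.
observed_rt = [10.0, 20.0]
predicted_rt = [10.1, 19.8]
aard = aard_percent(observed_rt, predicted_rt)
r = pearson_r(observed_rt, predicted_rt)
```

In practice the two lists would hold the measured and model-predicted retention times of the test-set oligonucleotides at one temperature, and the metrics would be computed separately for each temperature.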