Abstract

A new representation technique for peptide sequences, SZOTT (scores vector of zero dimension, one dimension, two dimension, and three dimension), was derived from 1369 parameters of the 20 coded amino acids using principal component analysis (PCA). It was then used to represent 71 peptide sequences of different lengths. Quantitative structure-retention models (QSRMs) were constructed using support vector machine (SVM) and partial least squares (PLS) methods. The results indicated that SZOTT represented the 71 peptide sequences well, with advantages such as rich structural information and ease of use. In addition, SVM outperformed PLS both in fitting the internal samples and in predicting the external samples. SZOTT combined with SVM can therefore be applied to develop QSRMs.
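
As an illustration of the descriptor construction summarized above, the following is a minimal sketch, not the authors' implementation. It assumes that PCA is run separately on each block of amino-acid parameters (0D, 1D, 2D, 3D) and that a peptide is encoded by concatenating the per-residue score vectors. The block sizes, the number of retained components, the random placeholder data, and the names `pca_scores`, `build_score_table`, and `encode_peptide` are all hypothetical. The abstract specifies neither the block composition nor how peptides of different lengths are brought to a common descriptor length.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 coded amino acids


def pca_scores(X, n_components):
    """PCA scores of the rows of X (samples x descriptors), via SVD."""
    Xc = X - X.mean(axis=0)            # center each descriptor column
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0                # avoid division by zero for constant columns
    Z = Xc / std                       # autoscale to unit variance
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


def build_score_table(blocks, n_components):
    """Run PCA per parameter block and join the scores for each amino acid."""
    per_block = [pca_scores(X, n_components) for X in blocks.values()]
    scores = np.hstack(per_block)      # shape: (20, n_blocks * n_components)
    return dict(zip(AMINO_ACIDS, scores))


def encode_peptide(peptide, table):
    """Concatenate per-residue score vectors (assumed encoding scheme)."""
    return np.concatenate([table[aa] for aa in peptide])


# Placeholder data: random numbers stand in for the real amino-acid parameters.
# The block sizes here are made up and do not reflect the 1369 parameters.
rng = np.random.default_rng(0)
blocks = {
    "0D": rng.normal(size=(20, 30)),
    "1D": rng.normal(size=(20, 40)),
    "2D": rng.normal(size=(20, 50)),
    "3D": rng.normal(size=(20, 60)),
}
table = build_score_table(blocks, n_components=2)
vector = encode_peptide("GAV", table)  # 3 residues x 8 scores each
```

With real parameter blocks in place of the random data, the resulting vectors could serve as inputs to an SVM or PLS regression against measured retention values, once peptides of different lengths have been mapped to a common descriptor length.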