Abstract

Many biological problems are so complicated that their data bear the features of fuzzy sets: the information is vague, imprecise, noisy, ambiguous, or missing some inputs. For instance, the data currently used for classifying protein structural classes typically form a fuzzy set. To deal with this kind of problem, the AAPCA (Amino Acid Principal Component Analysis) approach was introduced. In the AAPCA approach, the 20-dimensional amino acid composition space is reduced to an orthogonal space with fewer dimensions, and the original base functions are converted into a set of orthogonal and normalized base functions. The advantage of this approach is that, through principal component selection, it can minimize the random errors and redundant information in a protein dataset, remarkably improving the success rates in predicting protein structural classes. It is anticipated that the AAPCA approach can be used to deal with many other classification problems in proteins as well.
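The dimension reduction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that AAPCA follows standard principal component analysis on amino acid composition vectors, and it swaps in a simple power iteration with deflation where a library eigen-solver would normally be used. The sequences in `SEQS`, the choice of `k = 2`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """20-dimensional amino acid composition (fractions that sum to 1)."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts)
    return [c / total for c in counts]

def covariance(rows):
    """Sample covariance matrix of the row vectors, plus the column means."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, mean

def top_components(cov, k, iters=500, seed=0):
    """Approximate the k leading eigenvectors of a symmetric matrix.

    Assumption: power iteration with deflation stands in for a proper
    eigen-solver; the returned vectors are orthonormal by construction.
    """
    rng = random.Random(seed)
    d = len(cov)
    mat = [row[:] for row in cov]
    comps = []
    for _ in range(k):
        v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove any drift toward components that were already found.
            for c in comps:
                dot = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - dot * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient: eigenvalue estimate for the converged vector.
        lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d))
                  for i in range(d))
        comps.append(v)
        # Deflate so the next pass finds the next component.
        for i in range(d):
            for j in range(d):
                mat[i][j] -= lam * v[i] * v[j]
    return comps

def project(vec, mean, comps):
    """Coordinates of one composition vector in the reduced orthogonal space."""
    centered = [x - m for x, m in zip(vec, mean)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in comps]

# Hypothetical toy sequences, not real protein data.
SEQS = [
    "AAEELLKKAAEELLKMAR",
    "VTVTVYVSVTIYVKVT",
    "MKALEELLKKAGVTVY",
    "GSPNGDPSGNGTPG",
    "ACDEFGHIKLMNPQRSTVWY",
]

vectors = [composition(s) for s in SEQS]
cov, mean = covariance(vectors)
comps = top_components(cov, k=2)  # k = 2 is an arbitrary illustrative choice
reduced = [project(v, mean, comps) for v in vectors]
```

Each entry of `reduced` is a 2-dimensional representation of a 20-dimensional composition vector. In a real study the number of retained components would be chosen from the data, and the reduced vectors would then be passed to a classifier; both steps are outside the scope of this sketch.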