Abstract

Host-origin classification and host-specific signatures of influenza A viruses were investigated using the hemagglutinin (HA) protein, with the aim of tracking the host of origin of HA. Hidden Markov models (HMMs), decision trees, and associative classification models were generated for each influenza A virus subtype and its major hosts (human, avian, swine). HA protein signature features that were host- and subtype-specific were sought, and host-associated signatures that occurred across different subtypes of the virus were identified. The classification models were evaluated using ROC curves, and the amino acid class-association rules were evaluated by their support and confidence ratings. Host classification based on the HA subtype achieved accuracies between 91.2% and 100% using decision trees after feature selection. Host-specific class-association rules for avian-host origin gave the best support and confidence ratings, followed by human and finally swine origin. This finding indicates a lower specificity for the swine host, perhaps reflecting its ability to mix different strains.
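As a rough illustration of the support and confidence ratings mentioned above, the following minimal sketch computes both for a single class-association rule of the form "residues at given positions imply host". The records, alignment positions, and the function name `rule_support_confidence` are hypothetical and chosen only for illustration; they are not taken from the study, and the standard textbook definitions are assumed (support = fraction of all records matching both the residues and the host; confidence = fraction of residue-matching records that have that host).

```python
from typing import Dict, List, Tuple

# A record pairs residues at (hypothetical) alignment positions with a host label.
Record = Tuple[Dict[int, str], str]


def rule_support_confidence(
    records: List[Record], antecedent: Dict[int, str], host: str
) -> Tuple[float, float]:
    """Return (support, confidence) for the rule: antecedent residues => host."""
    n = len(records)
    # Host labels of all records whose residues match every antecedent position.
    matched_hosts = [
        label
        for residues, label in records
        if all(residues.get(pos) == aa for pos, aa in antecedent.items())
    ]
    both = sum(1 for label in matched_hosts if label == host)
    support = both / n if n else 0.0
    confidence = both / len(matched_hosts) if matched_hosts else 0.0
    return support, confidence


# Made-up toy data, not real HA sequences.
records: List[Record] = [
    ({10: "A", 20: "K"}, "avian"),
    ({10: "A", 20: "K"}, "avian"),
    ({10: "A", 20: "R"}, "human"),
    ({10: "S", 20: "K"}, "swine"),
    ({10: "A", 20: "K"}, "swine"),
]

support, confidence = rule_support_confidence(records, {10: "A", 20: "K"}, "avian")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data, three records match the residue pattern and two of them are avian, so the rule covers 2 of 5 records overall and holds for 2 of the 3 pattern matches.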

  • Publication date: 2014-1-20
