Abstract

Existing anti-virus methods rely on signatures to detect malicious code. They are ineffective against many forms of computer viruses, especially new variants and unknown viruses. Inspired by the biological immune system, we propose a novel signature extraction method based on artificial immune principles. The method automatically identifies bit patterns that correlate with viruses using instruction frequency and file frequency, then identifies higher-level genes associated with viruses, and generates a detecting virus gene library using the negative selection algorithm, which yields a fairly low false positive rate compared with traditional signature-based methods. The proposed method has two main advantages. In the feature extraction phase, the detecting virus gene library stores each virus sample at the individual level as a variable number of variable-length genes, and uses the coexistence of multiple genes within one virus to considerably reduce the possible loss of information, taking full advantage of the relevance between viral instructions within a virus program. In the classification phase, suspicious programs are analyzed at the individual level, in contrast to the existing gene matching technique. Experimental results indicate that the proposed method achieves high detection rates for obfuscated viruses, with an average recognition rate of 94% under real-world conditions, while the false positive rate can be kept below 2%. The method generalizes well and can effectively and efficiently detect new variants of known viruses as well as unknown viruses.
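To make the pipeline described above more concrete, the following is a minimal sketch of frequency-based candidate selection followed by negative selection. It is not the authors' implementation: the fixed-length byte n-grams (the paper uses variable-length genes), the function names `ngrams`, `build_gene_library`, and `is_suspicious`, and the thresholds `min_files` and `min_hits` are all assumptions introduced here for illustration.

```python
from collections import Counter


def ngrams(data: bytes, n: int) -> list[bytes]:
    """Return all overlapping n-byte fragments of data."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def build_gene_library(viruses, benign, n=4, min_files=2):
    """Map each retained fragment to its total occurrence count.

    Assumption: a fragment is a candidate gene if it appears in at least
    min_files virus samples (file frequency). Negative selection then
    discards every candidate that also occurs in a benign ("self") sample.
    """
    occurrences = Counter()     # total occurrences across virus samples
    file_frequency = Counter()  # number of virus samples containing the fragment
    for sample in viruses:
        grams = ngrams(sample, n)
        occurrences.update(grams)
        file_frequency.update(set(grams))

    self_set = set()
    for sample in benign:
        self_set.update(ngrams(sample, n))

    return {
        gene: occurrences[gene]
        for gene, count in file_frequency.items()
        if count >= min_files and gene not in self_set
    }


def is_suspicious(program: bytes, library, n=4, min_hits=2) -> bool:
    """Flag a program when at least min_hits distinct library genes coexist in it."""
    hits = set(ngrams(program, n)) & set(library)
    return len(hits) >= min_hits


if __name__ == "__main__":
    # Toy byte strings, not real malware samples.
    viruses = [b"\x90\x90\xde\xad\xbe\xef\x01", b"\x02\xde\xad\xbe\xef\x90\x90"]
    benign = [b"\x90\x90\x00\x00\x01\x02"]
    library = build_gene_library(viruses, benign)
    print(library)
    print(is_suspicious(b"\x00\xde\xad\xbe\xef\x00", library, min_hits=1))
```

Requiring several distinct genes to coexist in one program (`min_hits`) loosely mirrors the abstract's point about analyzing programs at the individual level rather than matching single genes, although the real method's matching rule may differ.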

Full text