Abstract

Software filtering systems can be used to detect and filter out pirated or counterfeit software on websites and peer-to-peer networks. They determine whether a suspicious program is legitimate by comparing it with original programs in a database or on the market, which requires measuring software similarity between the suspicious program and each original. The comparison overhead can be very high because, in the worst case, the suspicious program must be compared with every program in the database or market. This paper proposes a software classification scheme for efficient software filtering systems. The scheme focuses specifically on Windows Portable Executable (PE) files, which have been prime targets for software pirates. It extracts software characteristics from a suspicious program and quickly classifies the program into one of several predefined categories based on those characteristics. In most cases the suspicious program is then compared only with the programs in that category, which reduces the comparison overhead. We propose two classification methods. The first extracts strings from the GUI-related resources of a program and computes the relevance of the program to each category from pre-computed scores of those strings. The second extracts API call frequencies from a program's executable code and uses the Random Forest technique to classify the program. Experimental results show that the proposed scheme classifies programs effectively and significantly reduces the comparison overhead.
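The first classification method can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so the score used here (the share of a category's programs that contain a string, divided by the number of categories in which the string appears) is only an assumption for illustration. The function names, category names, and sample strings are likewise hypothetical, and extraction of strings from PE resources is not shown.

```python
from collections import defaultdict


def precompute_scores(training_set):
    """Map each category to {string: score}.

    Assumed score (not taken from the paper): share of the category's
    programs containing the string, divided by the number of categories
    in which the string appears at all.
    """
    doc_freq = {}  # category -> {string: number of programs containing it}
    for category, programs in training_set.items():
        counts = defaultdict(int)
        for strings in programs:
            for s in set(strings):
                counts[s] += 1
        doc_freq[category] = counts

    spread = defaultdict(int)  # string -> number of categories containing it
    for counts in doc_freq.values():
        for s in counts:
            spread[s] += 1

    return {
        category: {
            s: (n / len(training_set[category])) / spread[s]
            for s, n in counts.items()
        }
        for category, counts in doc_freq.items()
    }


def classify(strings, scores):
    """Return (best_category, relevance_by_category) for one program."""
    unique = set(strings)
    relevance = {
        category: sum(table.get(s, 0.0) for s in unique)
        for category, table in scores.items()
    }
    return max(relevance, key=relevance.get), relevance


# Hypothetical GUI strings for programs whose category is already known.
TRAINING_SET = {
    "media_player": [["Play", "Pause", "Open", "Playlist"],
                     ["Play", "Volume", "Open"]],
    "text_editor": [["Open", "Save", "Find", "Replace"],
                    ["Save", "Undo", "Find"]],
}

SCORES = precompute_scores(TRAINING_SET)
best, relevance = classify(["Play", "Open", "Volume"], SCORES)
print(best, relevance)
```

Under these assumptions, a filtering system would run the expensive similarity comparison only against the originals stored under `best` rather than against the whole database.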

  • Publication date: 2018-1