A scalable approach for detecting plagiarized mobile applications

Authors: Oprisa Ciprian*; Gavrilut Dragos; Cabau George
Source: Knowledge and Information Systems, 2016, 49(1): 143-169.
DOI:10.1007/s10115-015-0903-y

Abstract

Plagiarism cases are quite common in mobile application ecosystems like the Android market. An application can be decompiled, modified and repackaged under a different author name. The modifications can affect the user's privacy or even introduce malicious logic. If the original application is supported by advertisements, they are usually replaced so that the ad revenue goes to the repackager. Such events can damage the legitimate author both in reputation and financially, so they need to be detected. A plagiarism detection system is proposed that detects plagiarized applications based on features extracted from their code. Two similarity functions are given, along with techniques for finding similar applications in a large collection. The main issue with this search is that it cannot be performed sequentially, by comparing a given item with every other item in the collection. The proposed solution improves the search time by comparing the searched item only with those items that have a high probability of being similar. The greatest advantage of our approach is scalability. The system's database can be built, updated and queried in reasonable time, even when large quantities of data are involved. Our experiments were conducted on a large collection of over one million samples and identified a concerning number of plagiarism cases.
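
The abstract does not name the two similarity functions or the indexing technique, so the following is only an illustrative sketch of the general idea of comparing a query against likely candidates rather than the whole collection. It assumes each application is reduced to a set of code features, uses Jaccard similarity as a stand-in metric, and uses a plain inverted index for candidate selection. The names `jaccard`, `CandidateIndex`, `threshold` and `min_shared`, and the feature strings, are all hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict


def jaccard(a, b):
    """Jaccard similarity of two feature sets (assumed metric, not the paper's)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class CandidateIndex:
    """Inverted index mapping each feature to the apps that contain it (illustrative)."""

    def __init__(self):
        self.features = {}                # app id -> frozenset of features
        self.postings = defaultdict(set)  # feature -> ids of apps containing it

    def add(self, app_id, features):
        feats = frozenset(features)
        self.features[app_id] = feats
        for f in feats:
            self.postings[f].add(app_id)

    def query(self, features, threshold=0.5, min_shared=2):
        feats = frozenset(features)
        # Count shared features per stored app by walking only the posting
        # lists of the query's features; apps sharing nothing are never visited.
        shared = Counter()
        for f in feats:
            for app_id in self.postings.get(f, ()):
                shared[app_id] += 1
        results = []
        for app_id, count in shared.items():
            if count < min_shared:
                continue  # too little overlap to be worth a full comparison
            score = jaccard(feats, self.features[app_id])
            if score >= threshold:
                results.append((app_id, score))
        # Highest similarity first, ties broken by app id.
        return sorted(results, key=lambda r: (-r[1], r[0]))


# Hypothetical feature sets for three apps.
index = CandidateIndex()
index.add("original", {"f1", "f2", "f3", "f4"})
index.add("repack", {"f1", "f2", "f3", "ad_sdk_x"})
index.add("unrelated", {"g1", "g2", "f1"})

matches = index.query({"f1", "f2", "f3", "f4"})
```

In this toy collection the query shares only one feature with `unrelated`, so that app is discarded before any similarity is computed, while `original` and `repack` are scored in full. A production system at the scale mentioned in the abstract would likely need a more compact or probabilistic index; that choice is outside what the abstract states.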

  • Publication date: 2016-10