Approaches to the classification of high entropy file fragments

Penrose, Philip; Macfarlane, Richard; Buchanan, William J.*

doi:10.1016/j.diin.2013.08.004

Abstract

In this paper we propose novel approaches to the problem of classifying high entropy file fragments. Although classification of file fragments is central to the science of digital forensics, high entropy types have been regarded as a problem. Roussev and Garfinkel (2009) argue that existing methods will not work on high entropy fragments because such fragments have no discernible patterns to exploit. We propose two methods that do not rely on such patterns. In the first, the NIST statistical test suite is used to detect randomness in 4 KiB fragments, and the test results are analysed using an Artificial Neural Network (ANN). Optimum results were 91% and 82% correct classification rates for encrypted and compressed fragments respectively. In the second, we use the compressibility of a fragment as a measure of its randomness. Correct classification was 76% and 70% for encrypted and compressed fragments respectively. We show that newer, more efficient compression formats are more difficult to classify. We have used subsets of the publicly available 'GovDocs1 Million File Corpus' so that any future research may make valid comparisons with the results obtained here.
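The two randomness measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which compressor or which NIST tests were used as features, so `zlib` and the frequency (monobit) test from NIST SP 800-22 are assumptions chosen for illustration, and the function names `compression_ratio` and `monobit_p_value` are hypothetical. No classifier is included.

```python
import math
import zlib

FRAGMENT_SIZE = 4096  # 4 KiB, the fragment size named in the abstract


def compression_ratio(fragment: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size.

    Assumption: zlib (DEFLATE) stands in for whatever compressor the paper used.
    Values near or above 1.0 mean the fragment is barely compressible,
    which suggests high entropy (encrypted or already compressed data).
    """
    if not fragment:
        raise ValueError("fragment must not be empty")
    return len(zlib.compress(fragment, level)) / len(fragment)


def monobit_p_value(fragment: bytes) -> float:
    """Return the p-value of the NIST SP 800-22 frequency (monobit) test.

    The test maps each bit to +1 or -1, sums the values, and checks whether
    the sum is consistent with a fair coin. Small p-values indicate an
    imbalance between ones and zeros, i.e. non-random data.
    """
    if not fragment:
        raise ValueError("fragment must not be empty")
    n_bits = 8 * len(fragment)
    ones = sum(bin(byte).count("1") for byte in fragment)
    # Sum of +1/-1 values equals (ones) - (zeros) = 2 * ones - n_bits.
    s_obs = abs(2 * ones - n_bits) / math.sqrt(n_bits)
    return math.erfc(s_obs / math.sqrt(2))
```

Calling both functions on a `FRAGMENT_SIZE`-byte slice of a file would give two numbers per fragment. In the approach the abstract describes, results of this kind from the full test suite would be passed to a classifier such as an ANN; a single test or a single ratio is unlikely to separate encrypted from compressed data on its own.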

Publication date: 2013-12


Last updated: 2018-01-19 12:43
