Abstract

This paper presents FlowWalker, a novel dynamic taint analysis framework that aims to extract the complete taint data flow while eliminating the bottlenecks of existing tools, with applications to file-format reverse engineering. The framework adopts a multi-tag, assembly-level taint propagation strategy. FlowWalker decouples taint tracking from execution through an off-line structure, uses memory-mapped files to improve I/O efficiency, processes taint paths during virtual execution playback, and applies parallelization and pipelining to achieve speedup. Based on the semantic correlations implied by the taint path information, the paper also presents an algorithm for extracting the structure of unknown file formats. In the reported tests, the overall program runtime ranges from 92.98% to 208.01% of the runtime of the underlying instrumentation alone, a 60% speed improvement over another full-featured tool on Windows. File formats of medium complexity are correctly partitioned, and their constant fields are extracted. Owing to its efficiency and scalability, FlowWalker can serve the needs of further security-related research.
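
As a rough illustration of what multi-tag, assembly-level propagation during off-line replay can look like, the following minimal sketch assigns each input byte its own tag (its file offset) and merges tag sets as recorded instructions are replayed. It is not FlowWalker's implementation: the trace record layout `(op, dst, srcs)`, the class `TaintState`, the function `replay`, and the handled opcodes are all assumptions made for this example.

```python
# Hypothetical sketch of multi-tag taint propagation over a recorded trace.
# All names and the trace format are illustrative assumptions.

class TaintState:
    """Maps a location (register name or memory address) to a set of tags."""

    def __init__(self):
        self.tags = {}

    def taint_input(self, base_addr, length):
        # Each input byte gets its own tag: its offset within the input file.
        for offset in range(length):
            self.tags[base_addr + offset] = frozenset({offset})

    def propagate(self, dst, srcs, keep_dst=False):
        # dst receives the union of the source tag sets. keep_dst=True models
        # two-operand arithmetic where dst is also read (e.g. ADD dst, src).
        merged = frozenset()
        for src in srcs:
            merged |= self.tags.get(src, frozenset())
        if keep_dst:
            merged |= self.tags.get(dst, frozenset())
        if merged:
            self.tags[dst] = merged
        else:
            # Overwriting with untainted data clears any previous taint.
            self.tags.pop(dst, None)


def replay(trace, state):
    """Apply propagation rules to each recorded instruction, in order."""
    for op, dst, srcs in trace:
        if op == "mov":
            state.propagate(dst, srcs)
        elif op in ("add", "sub"):
            state.propagate(dst, srcs, keep_dst=True)
        else:
            raise ValueError(f"unhandled opcode in this sketch: {op}")
    return state


state = TaintState()
state.taint_input(0x1000, 4)          # a 4-byte input buffer at 0x1000
trace = [
    ("mov", "eax", [0x1000]),         # eax depends on input byte 0
    ("mov", "ebx", [0x1001]),         # ebx depends on input byte 1
    ("add", "eax", ["ebx"]),          # eax now depends on bytes 0 and 1
    ("mov", "ecx", [0x1002]),         # ecx depends on input byte 2
    ("mov", "ecx", ["imm"]),          # an untainted immediate clears ecx
]
replay(trace, state)
print(sorted(state.tags["eax"]))
```

Because each location keeps the set of input offsets it depends on, fields of the input that are consumed together end up sharing tag sets, which is the kind of correlation a format-extraction step could build on.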

  • Publication date: 2015-6