A refined decompiler to generate C code with high readability

作者:Chen, Gengbiao; Qi, Zhengwei; Huang, Shiqiu; Ni, Kangqi; Zheng, Yudi; Binder, Walter; Guan, Haibing*
来源:Software: Practice and Experience , 2013, 43(11): 1337-1358.
DOI:10.1002/spe.2138

摘要

As a key part of reverse engineering, decompilation plays a very important role in software security and maintenance. A number of tools, such as Boomerang and IDA Hex_rays, have been developed to translate executable programs into source code in a relatively high-level language. Unfortunately, most existing decompilation tools suffer from low accuracy in identifying variables, functions, and composite structures, resulting in poor readability. To address these limitations, we present a practical decompiler called C-Decompiler for Windows C programs that (i) uses a shadow stack to perform refined data flow analysis, (ii) adopts inter-basic-block register propagation to reduce redundant variables, and (iii) recognizes library (i.e., Standard Template Library) functions by signatures. We evaluate and compare the decompilation quality of C-Decompiler with two existing tools, Boomerang and IDA Hex_rays, considering four aspects: function analysis, variable expansion rate, total percentage reduction, and cyclomatic complexity. Our experimental results show that on average, C-Decompiler has the highest total percentage reduction of 55.91%, lowest variable expansion rate of 55.79%, and the same cyclomatic complexityastheoriginal source code for each considered application. Furthermore, in our experiments, C-Decompiler is able to recognize functions with a lower false positive and false negative rate than the other decompilers. A case study and our evaluation results confirm that C-Decompiler is a practical tool to produce highly readable C-style code.