Abstract

Design patterns are formalized best practices for structuring applications at a high level. Efficient recovery of design pattern instances significantly facilitates program comprehension and software reengineering, but it is not a straightforward task. In this paper, we present a comprehensive approach to recovering instances of the 23 GoF design patterns from source code. The key idea is that different design pattern instances are composed of a small set of commonly recurring sub-patterns that are easier to detect. In addition, we consider not only the relationships between classes but also the characteristics of the method signatures within them. We first transform the source code and the predefined GoF patterns into graphs, with classes as nodes and relationships as edges. We then use subgraph discovery to identify sub-pattern instances, which are the candidate constituents of pattern instances. These sub-pattern instances are merged through their joint (shared) classes, and each merged structure is checked against the predefined patterns. Finally, we compare the behavioral characteristics of method invocations with the predefined method signature templates of the GoF patterns to obtain the final pattern instances directly. Compared with existing approaches, ours integrates and improves several earlier ideas and combines them with our own into a comprehensive and detailed method. To reduce the search space, we detect sub-patterns via graph isomorphism based on prime number composition and merge them through joint classes. We examine behavioral features through method signatures, which avoids the need to select test cases that achieve full code coverage. Extensive experiments on recovering pattern instances from nine open source software systems demonstrate that our approach achieves high and well-balanced precision and recall.
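The idea of encoding class relationships by prime number composition can be illustrated with a minimal sketch. It is written under stated assumptions and is not the implementation evaluated in this paper: the prime codes in `PRIMES`, the helper names `encode` and `find_subpattern`, and the example classes are all hypothetical, and a brute-force search over role assignments stands in for the actual subgraph discovery step. Each edge weight is the product of the primes of the relationships between two classes, so a sub-pattern edge is present in the system graph exactly when its weight divides the corresponding system edge weight.

```python
from itertools import permutations

# Hypothetical prime codes for relationship types (an assumption for
# illustration; the concrete values are not given in the abstract).
PRIMES = {"inheritance": 2, "association": 3, "aggregation": 5}


def encode(relations):
    """Build a weighted edge map: (source, target) -> product of primes."""
    graph = {}
    for src, dst, kind in relations:
        graph[(src, dst)] = graph.get((src, dst), 1) * PRIMES[kind]
    return graph


def find_subpattern(system, pattern):
    """Return all role-to-class mappings under which every pattern edge
    weight divides the weight of the corresponding system edge.

    Brute force over injective assignments; a stand-in for real
    subgraph discovery, adequate only for tiny graphs.
    """
    roles = sorted({node for edge in pattern for node in edge})
    classes = sorted({node for edge in system for node in edge})
    matches = []
    for combo in permutations(classes, len(roles)):
        mapping = dict(zip(roles, combo))
        # A missing system edge has weight 1, which no prime divides.
        if all(
            system.get((mapping[a], mapping[b]), 1) % weight == 0
            for (a, b), weight in pattern.items()
        ):
            matches.append(mapping)
    return matches


# Toy system: Circle inherits from Shape; Canvas both aggregates and
# is associated with Shape, so that edge has weight 5 * 3 = 15.
system = encode([
    ("Circle", "Shape", "inheritance"),
    ("Canvas", "Shape", "aggregation"),
    ("Canvas", "Shape", "association"),
])

# Hypothetical sub-pattern: a child inherits from a parent that an
# owner class aggregates.
pattern = encode([
    ("child", "parent", "inheritance"),
    ("owner", "parent", "aggregation"),
])

matches = find_subpattern(system, pattern)
```

In this toy case the only mapping found assigns `child` to `Circle`, `owner` to `Canvas`, and `parent` to `Shape`. The divisibility test lets a single integer per edge carry several relationship types at once, which is what makes the prime encoding convenient for matching.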