摘要

Context: Software networks are directed graphs of static dependencies between source code entities (functions, classes, modules, etc.). These structures can be used to investigate the complexity and evolution of large-scale software systems and to compute metrics associated with software design. The extraction of software networks is also the first step in reverse engineering activities. Objective: The aim of this paper is to present SNEIPL, a novel approach to the extraction of software networks that is based on a language-independent, enriched concrete syntax tree representation of the source code. Method: The applicability of the approach is demonstrated by the extraction of software networks representing real-world, medium to large software systems written in different languages which belong to different programming paradigms. To investigate the completeness and correctness of the approach, class collaboration networks (CCNs) extracted from real-world Java software systems are compared to CCNs obtained by other tools. Namely, we used Dependency Finder which extracts entity-level dependencies from Java bytecode, and Doxygen which realizes language-independent fuzzy parsing approach to dependency extraction. We also compared SNEIPL to fact extractors present in language-independent reverse engineering tools. Results: Our approach to dependency extraction is validated on six real-world medium to large-scale software systems written in Java, Modula-2, and Delphi. The results of the comparative analysis involving ten Java software systems show that the networks formed by SNEIPL are highly similar to those formed by Dependency Finder and more precise than the comparable networks formed with the help of Doxygen. Regarding the comparison with language-independent reverse engineering tools, SNEIPL provides both language-independent extraction and representation of fact bases. Conclusion: SNEIPL is a language-independent extractor of software networks and consequently enables language-independent network-based analysis of software systems, computation of design software metrics, and extraction of fact bases for reverse engineering activities.

  • 出版日期2014-10