Abstract

Context: Study search and selection is central to conducting Evidence Based Software Engineering (EBSE) research, including Systematic Literature Reviews and Systematic Mapping Studies. Thus, selecting relevant studies and excluding irrelevant ones is critical. Prior research argues that study selection is subject to researcher bias, and the time required to review and select relevant articles is a target for optimization.
Objective: This research proposes two training-by-example classifiers that are computationally simple, do not require extensive training or tuning, ensure inclusion/exclusion consistency, and reduce researcher study selection time: one based on Vector Space Models (VSM), and a second based on Latent Semantic Analysis (LSA).
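The paper does not reproduce its classifiers in the abstract, but a training-by-example VSM classifier of the kind described can be sketched in a few lines: represent each study's title/abstract text as a TF-IDF vector and include a candidate study when it is more cosine-similar to the relevant training examples than to the irrelevant ones. This is a minimal illustration, not the authors' implementation; the tokenization, the max-similarity decision rule, and the smoothed IDF are assumptions, and an LSA variant would additionally project the vectors through a truncated SVD.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a list of tokenized docs."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}   # smoothed IDF (assumption)
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def vsm_classify(train_docs, train_labels, test_doc):
    """Include (1) the test study if its best cosine similarity to a
    relevant training study is at least its best similarity to an
    irrelevant one; otherwise exclude (0)."""
    vecs = tfidf_vectors(train_docs + [test_doc])
    test_vec, train_vecs = vecs[-1], vecs[:-1]
    sim_rel = max((cosine(test_vec, v)
                   for v, y in zip(train_vecs, train_labels) if y == 1),
                  default=0.0)
    sim_irr = max((cosine(test_vec, v)
                   for v, y in zip(train_vecs, train_labels) if y == 0),
                  default=0.0)
    return 1 if sim_rel >= sim_irr else 0
```

Because the decision rule needs no tuning beyond the labeled examples, the same training set always yields the same inclusion/exclusion decisions, which is the consistency property the abstract highlights.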
Method: Algorithm evaluation is accomplished through Monte-Carlo Cross-Validation simulations, in which study subsets are randomly chosen from the corpus for training, with the remainder classified by the algorithm. The classification results are then assessed for recall (a measure of completeness), precision (a measure of exactness) and researcher efficiency savings (reduced proportion of corpus studies requiring manual review as a result of algorithm use). A second smaller simulation is conducted for external validation.
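The Monte-Carlo Cross-Validation loop described above can be sketched as follows: repeatedly draw a random training subset from the corpus, classify the remainder, and average recall, precision, and efficiency savings (here taken as the fraction of test studies the algorithm excludes from manual review). The training fraction, run count, and the `classify(train_docs, train_labels, test_doc) -> 0/1` interface are illustrative assumptions, not the paper's exact protocol.

```python
import random

def monte_carlo_cv(corpus, labels, classify, train_frac=0.3, runs=100, seed=0):
    """Average recall, precision, and efficiency savings over `runs`
    random train/test splits of a labeled corpus (Monte-Carlo CV)."""
    rng = random.Random(seed)
    n = len(corpus)
    totals = {"recall": 0.0, "precision": 0.0, "savings": 0.0}
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)
        k = max(1, int(train_frac * n))
        train, test = idx[:k], idx[k:]
        tp = fp = fn = included = 0
        for i in test:
            pred = classify([corpus[j] for j in train],
                            [labels[j] for j in train],
                            corpus[i])
            included += pred
            if pred and labels[i]:
                tp += 1
            elif pred and not labels[i]:
                fp += 1
            elif not pred and labels[i]:
                fn += 1
        totals["recall"] += tp / (tp + fn) if tp + fn else 1.0
        totals["precision"] += tp / (tp + fp) if tp + fp else 1.0
        # studies the algorithm excludes need no manual review
        totals["savings"] += 1.0 - included / len(test) if test else 0.0
    return {key: v / runs for key, v in totals.items()}
```

Any classifier with the assumed interface (such as a VSM or LSA variant) can be plugged into `classify`, which is how the two algorithms could be compared on equal footing.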
Results and conclusions: VSM algorithms perform better in recall; LSA algorithms perform better in precision. Recall improves with larger training sets containing a higher proportion of truly relevant studies. Precision improves with training sets containing a higher proportion of irrelevant studies, without a significant impact from training set size. The algorithms reduce the influence of researcher bias and are found to significantly improve researcher efficiency.
To maximize recall, the findings recommend VSM with a large training set containing as many truly relevant studies as possible. Where precision and efficiency are most critical, the findings suggest LSA with a training set containing a large proportion of truly irrelevant studies.

  • Published: 2018-06
