An overview of inference methods in probabilistic classifier chains for multilabel classification

Authors: Mena Deiner; Montanes Elena; Ramon Quevedo Jose; Jose del Coz Juan
Source: Wiley Interdisciplinary Reviews-Data Mining and Knowledge Discovery, 2016, 6(6): 215-230.
DOI: 10.1002/widm.1185

Abstract

This study reviews recent advances in inference for probabilistic classifier chains in multilabel classification. Interest in such inference arises from the attempt to improve on the performance of greedy search (the well-known CC method) while reducing the computational cost of exhaustive search (the well-known PCC method). Unlike PCC, and like CC, these inference techniques do not explore all possible solutions, yet they improve on the performance of CC and sometimes reach the optimal solution in terms of subset 0/1 loss, as PCC does. The techniques reviewed are the epsilon-approximate algorithm, a method based on beam search, and Monte Carlo sampling. An exhaustive set of experiments over a wide range of datasets is performed to analyze not only the extent to which these techniques tend to produce optimal solutions, but also their computational cost, both in terms of solutions explored and execution time. Only the epsilon-approximate algorithm with epsilon = 0 theoretically guarantees reaching an optimal solution in terms of subset 0/1 loss. The other algorithms offer no such guarantee, but they provide solutions close to an optimal one. The epsilon-approximate algorithm is the most promising for balancing performance in terms of subset 0/1 loss against the number of solutions explored and execution time. The value of epsilon determines the degree to which one prefers a guarantee of reaching an optimal solution at the expense of increased computational cost.
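The contrast between greedy search (CC), exhaustive search (PCC), and beam search can be illustrated with a minimal Python sketch. It is not the authors' implementation: the table `P_ONE`, the two-label chain, and all function names are hypothetical, chosen only so that the greedy path misses the most probable label combination. The epsilon-approximate algorithm and Monte Carlo sampling are not shown.

```python
from itertools import product

# Hypothetical toy chain for one fixed instance x:
# P_ONE[prefix] = P(y_j = 1 | x, labels already chosen in `prefix`).
P_ONE = {(): 0.4, (0,): 0.45, (1,): 0.9}


def cond_prob(prefix, value):
    """Conditional probability of the next label taking `value`."""
    p1 = P_ONE[tuple(prefix)]
    return p1 if value == 1 else 1.0 - p1


def joint_prob(labels):
    """Joint probability of a (possibly partial) label vector via the chain rule."""
    p = 1.0
    for j, v in enumerate(labels):
        p *= cond_prob(labels[:j], v)
    return p


def greedy_inference(n_labels):
    """CC-style inference: fix the locally most probable value at each step."""
    labels = ()
    for _ in range(n_labels):
        labels += (1,) if cond_prob(labels, 1) >= 0.5 else (0,)
    return labels


def exhaustive_inference(n_labels):
    """PCC-style inference: score all 2^n label vectors, keep the joint mode."""
    return max(product((0, 1), repeat=n_labels), key=joint_prob)


def beam_inference(n_labels, width):
    """Beam search: keep only the `width` most probable partial vectors per level."""
    beam = [()]
    for _ in range(n_labels):
        candidates = [b + (v,) for b in beam for v in (0, 1)]
        beam = sorted(candidates, key=joint_prob, reverse=True)[:width]
    return beam[0]


if __name__ == "__main__":
    print("greedy:    ", greedy_inference(2))
    print("exhaustive:", exhaustive_inference(2))
    print("beam (w=2):", beam_inference(2, width=2))
```

In this toy setting the greedy path commits to the first label being 0 and never revisits that choice, whereas exhaustive search and a beam of width 2 both recover the joint mode. A beam of width 1 reduces to the greedy search.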

  • Publication date: 2016-12
