Analogy-based classifiers for nominal or numerical data

Authors: Bounhas Myriam*; Prade Henri; Richard Gilles
Source: International Journal of Approximate Reasoning, 2017, 91: 36-55.
DOI: 10.1016/j.ijar.2017.08.010

Abstract

Introduced a decade ago, analogy-based classification methods constitute a noticeable addition to the set of instance-based learning techniques. They provide valuable results in terms of accuracy on many classical datasets. They rely on the notion of analogical proportions which are statements of the form "A is to B as C is to D". Analogical proportions have been in particular formalized in Boolean and numerical settings. In both cases, one of the four components of the proportion can be computed from the three others, when the proportion holds. Analogical classifiers look for all triples of examples in the sample set that are in analogical proportion with the item to be classified on a maximal number of attributes and for which the corresponding analogical proportion equation on the class has a solution. In this paper when classifying a new item, we specially emphasize an approach where the whole set of triples that can be built from the sample set is not considered. We just focus on a small part of the candidate triples. Namely, in order to restrict the scope of the search, we first look for examples that are as similar as possible to the new item to be classified. We then only consider the pairs of examples presenting the same dissimilarity as between the new item and one of its closest neighbors. In this way, we implicitly build triples that are in analogical proportion on all attributes with the new item. Then the classification is made on the basis of an additive aggregation of the truth values corresponding to the pairs that can be analogically associated with the pairs made of the target item and one of its nearest neighbors. We then only deal with pairs leading to a solvable analogical equation for the class. This new algorithm provides results as good as previous analogical classifiers with a lower average complexity, both in nominal and numerical cases.
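
As a rough illustration of the procedure the abstract describes, the sketch below covers the nominal (and Boolean) case only. It assumes the standard definition under which "a is to b as c is to d" holds when either a = b and c = d, or a = c and b = d; the equation a : b :: c : x then has a solution only when a = b (x = c) or a = c (x = b). The function names (`holds`, `solve`, `classify`), the Hamming distance used to pick neighbors, the parameter `k`, and the simple vote count are all assumptions made for this example, not the authors' implementation.

```python
from collections import Counter
from itertools import permutations


def holds(a, b, c, d):
    """True when 'a is to b as c is to d' for nominal or Boolean values."""
    return (a == b and c == d) or (a == c and b == d)


def solve(a, b, c):
    """Return x such that a : b :: c : x holds, or None if no solution exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None


def classify(samples, x, k=1):
    """Hypothetical sketch of a neighbor-restricted analogical classifier.

    samples: list of (attribute_tuple, label) pairs; x: attribute tuple.
    """
    def hamming(u, v):
        # Number of attributes on which u and v differ.
        return sum(p != q for p, q in zip(u, v))

    # Step 1: restrict the search to the k examples closest to x.
    neighbors = sorted(samples, key=lambda s: hamming(s[0], x))[:k]

    votes = Counter()
    for c, c_label in neighbors:
        # Step 2: keep ordered pairs (a, b) that differ exactly as (c, x)
        # does, i.e. a : b :: c : x holds on every attribute.
        for (a, a_label), (b, b_label) in permutations(samples, 2):
            if all(holds(ai, bi, ci, xi)
                   for ai, bi, ci, xi in zip(a, b, c, x)):
                # Step 3: vote only when the class equation is solvable.
                label = solve(a_label, b_label, c_label)
                if label is not None:
                    votes[label] += 1

    # Additive aggregation: in the nominal case each pair contributes 0 or 1.
    return votes.most_common(1)[0][0] if votes else None
```

For instance, with `samples = [((0, 0), "n"), ((1, 0), "p"), ((0, 1), "n")]`, calling `classify(samples, (1, 1))` selects `(1, 0)` as the nearest neighbor, finds the pair `(0, 0)`, `(0, 1)` that differs in the same way, and solves the class equation n : n :: p : x. The numerical case, where proportions take graded truth values, is not covered by this sketch.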

  • Publication date: 2017-12