摘要

In this paper, a layer-wise mixture model (LMM) is developed to support hierarchical visual recognition, where a Bayesian approach is used to automatically adapt the visual hierarchy to the progressive improvements of the deep network along the time. Our LMM algorithm can provide an end-to-end approach for jointly learning: 1) the deep network for achieving more discriminative deep representations for object classes and their inter-class visual similarities; 2) the tree classifier for recognizing large numbers of object classes hierarchically; and 3) the visual hierarchy adaptation for achieving more accurate assignment and organization of large numbers of object classes. By learning the tree classifier, the deep network and the visual hierarchy adaptation jointly in an end-to-end manner, our LMM algorithm can achieve higher accuracy rates on hierarchical visual recognition. Our experiments are carried on ImageNet1K and ImageNet10K image sets, which have demonstrated that our LMM algorithm can achieve very competitive results on the accuracy rates as compared with the baseline methods.