Abstract

Videos tend to yield a more complete description of their content than individual images, and egocentric vision often provides a more controllable and practical perspective for capturing useful information. In this study, we present new insights into different object recognition methods for video-based rigid object instance recognition. To better exploit egocentric videos as training and query sources, diverse state-of-the-art techniques were categorised, extended and evaluated empirically on a newly collected video dataset consisting of complex sculptures in cluttered scenes. In particular, we investigated how to utilise the geometric and temporal cues provided by egocentric video sequences to improve the performance of object recognition. Based on the experimental results, we analysed the pros and cons of these methods and reached the following conclusions. For geometric cues, the 3D object structure learnt from a training video dataset improves the average video classification performance dramatically. By contrast, for temporal cues, tracking visual fixation across video sequences has little impact on accuracy, but significantly reduces memory consumption by yielding a better signal-to-noise ratio for the feature points detected in the query frames. Furthermore, we propose a method that integrates these two important cues to exploit the advantages of both.

  • Publication date: 2016-1-1