Massive-scale learning of image and video semantic concepts

Smith J R<sup>*</sup>; Cao L; Codella N C F; Hill M L; Merler M; Nguyen Q B; Pring E; Uceda Sosa R A

doi:10.1147/JRD.2015.2398590

Abstract

Rapid growth in the capture and generation of images and videos is driving the need for more efficient and effective systems for analyzing, searching, and retrieving this data. Specific challenges include supporting automatic content indexing at a large scale and accurately extracting a sufficiently large number of relevant semantic concepts to enable effective search. In this paper, we describe the development of a system for massive-scale visual semantic concept extraction and learning for images and video. The system models the visual semantic space using a hierarchical faceted classification scheme across objects, scenes, people, activities, and events and utilizes a novel machine learning approach that creates ensemble classifiers from automatically extracted visual features. The ensemble learning and extraction processes are easily parallelizable for distributed processing using Hadoop® and IBM InfoSphere® Streams, which enable efficient processing of large data sets. We report on various applications and quantitative and qualitative results for different image and video data sets.

Publication date: 2015-05
Affiliation: IBM
