Abstract

Content similarity measures play an important role in TV personalization, e.g., TV content group recommendation and similar TV content retrieval, which are essentially content clustering and example-based retrieval, respectively. We define similar TV contents as those with similar semantic information, e.g., plot, background, and genre. Several similarity measures, notably those based on the vector space model and those based on the category hierarchy model, have been proposed for data clustering and example-based retrieval. Each has its own advantages and shortcomings when applied to TV content. In this paper, we propose a hybrid approach to TV content similarity measurement that combines the vector space model and the category hierarchy model. The proposed hybrid measure makes full use of TV metadata and draws on the strengths of both similarity measures. It measures TV content similarity at the semantic level rather than the physical level. Furthermore, we propose an adaptive strategy for setting the combination parameters. Experimental results show that the proposed hybrid similarity measure outperforms either measure used alone for TV content clustering and example-based retrieval.