Accuracy estimation of link-based similarity measures and its application

Zhang, Yinglong; Li, Cuiping*; Xie, Chengwang; Chen, Hong

doi:10.1007/s11704-015-4570-7

Abstract

Link-based similarity measures play a significant role in many graph-based applications. Consequently, measuring node similarity in a graph is a fundamental problem in graph data mining. Personalized PageRank (PPR) and SimRank (SR) have emerged as the most popular and influential link-based similarity measures. Recently, a novel link-based similarity measure, penetrating rank (P-Rank), which enriches SR, was proposed. In practice, PPR, SR, and P-Rank scores are calculated by iterative methods, and the computational overhead grows with the number of iterations. Ideally, the computation would stop after the minimum number of iterations sufficient to guarantee a desired accuracy. However, the existing upper bounds are too coarse to be useful in general. Therefore, in this paper we focus on designing accurate and tight upper bounds for PPR, SR, and P-Rank. Our upper bounds are based on the following intuition: the smaller the difference between two consecutive iteration steps, the smaller the difference between the theoretical and iterative similarity scores. Furthermore, we demonstrate the effectiveness of our upper bounds in the scenario of top-k similar-node queries, where the bounds help accelerate query processing. We also run a comprehensive set of experiments on real-world data sets to verify the effectiveness and efficiency of our upper bounds.
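The intuition stated in the abstract can be illustrated with a small sketch. This is not the bound derived in the paper: it uses the standard a posteriori bound for a contraction mapping, assuming the common PPR iteration x ← (1 − c)·e_s + c·Pᵀx with damping factor c, for which the L1 distance to the fixed point is at most c/(1 − c) times the L1 difference between two consecutive iterates. The function name `ppr_with_error_bound`, its parameters, and the toy graph are hypothetical and chosen only for illustration.

```python
def ppr_with_error_bound(out_links, source, c=0.85, tol=1e-6, max_iter=1000):
    """Personalized PageRank by power iteration with an a posteriori stop rule.

    out_links: dict mapping every node to a list of its out-neighbours
               (assumption: every neighbour also appears as a key).
    Returns (scores, iterations_used, error_bound), where error_bound is
    c / (1 - c) * ||x_k - x_{k-1}||_1, a generic contraction-mapping bound
    on ||x* - x_k||_1 (not the paper's bound).
    """
    nodes = list(out_links)
    x = {v: 0.0 for v in nodes}
    x[source] = 1.0  # start from the indicator vector of the source node
    bound = float("inf")
    for k in range(1, max_iter + 1):
        nxt = {v: 0.0 for v in nodes}
        nxt[source] = 1.0 - c  # restart mass returns to the source
        for u, targets in out_links.items():
            if targets:  # dangling nodes simply pass on no mass
                share = c * x[u] / len(targets)
                for v in targets:
                    nxt[v] += share
        # L1 difference between consecutive iterates drives the bound
        diff = sum(abs(nxt[v] - x[v]) for v in nodes)
        bound = c / (1.0 - c) * diff
        x = nxt
        if bound <= tol:
            return x, k, bound
    return x, max_iter, bound


# Toy usage on a three-node graph (hypothetical data)
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
scores, iters, bound = ppr_with_error_bound(graph, "a", tol=1e-8)
```

Stopping on the bound rather than on a fixed iteration count means the loop ends as soon as the requested accuracy is certified, which is the kind of saving the abstract describes for top-k queries.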

Publication date: 2016-02
Affiliation: Renmin University of China


