A Framework for Mining High Utility Web Access Sequences

Ahmed Chowdhury Farhan<sup>*</sup>; Tanbeer Syed Khairuzzaman; Jeong Byeong Soo

doi:10.4103/0256-4602.74506

Abstract

Mining web access sequences (WASs) can discover useful knowledge from web logs and has broad applications. More realistic information can be extracted by treating non-binary occurrences of web pages as internal utilities in WASs, e.g., the time each user spends on a web page. However, the existing utility-based approach has several limitations: it considers only forward references in web access sequences, is not applicable to incremental mining, suffers from the level-wise candidate generation-and-test methodology, needs several database scans, and does not show how to mine web access sequences when different web pages have different impacts or significances. In this paper, we propose a novel framework to solve these problems. Moreover, we propose two new tree structures, called the utility-based WAS tree (UWAS-tree) and the incremental UWAS-tree (IUWAS-tree), for mining WASs in static and incremental databases, respectively. Our approach handles both forward and backward references, supports both static and incremental data, avoids the level-wise candidate generation-and-test methodology, does not scan databases several times, and considers both the internal and external utilities of a web page. The IUWAS-tree is also applicable to interactive mining. Extensive performance analyses show that our approach is very efficient for both static and incremental mining of high utility WASs.
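
To make the notions of internal and external utility concrete, the following is a minimal Python sketch and not the paper's algorithm. It assumes that the internal utility of a page visit is the time spent on it, that the external utility is a per-page weight, and that the utility of a pattern in one sequence is the weighted time summed over its leftmost subsequence match. The names `occurrence_utility`, `pattern_utility`, `db`, and `weights` are hypothetical, and the abstract does not give the paper's exact utility definition or its tree structures, so neither is reproduced here.

```python
from typing import Dict, List, Tuple

# One page visit: (page id, time spent on the page = assumed internal utility).
Access = Tuple[str, float]


def occurrence_utility(seq: List[Access], pattern: List[str],
                       weights: Dict[str, float]) -> float:
    """Utility of the leftmost match of `pattern` as a subsequence of `seq`.

    Each matched visit contributes time_spent * weight (internal * external
    utility). Returns 0.0 when the pattern does not occur in the sequence.
    This occurrence rule is an assumption made for illustration.
    """
    total = 0.0
    matched = 0
    for page, time_spent in seq:
        if matched < len(pattern) and page == pattern[matched]:
            total += time_spent * weights.get(page, 1.0)
            matched += 1
    return total if matched == len(pattern) else 0.0


def pattern_utility(db: List[List[Access]], pattern: List[str],
                    weights: Dict[str, float]) -> float:
    """Sum the pattern's utility over every sequence in the database."""
    return sum(occurrence_utility(seq, pattern, weights) for seq in db)


# Toy data (invented for illustration). The first sequence revisits page "a"
# after "b", which is a backward reference.
db = [
    [("a", 2.0), ("b", 5.0), ("a", 1.0), ("c", 3.0)],
    [("b", 4.0), ("c", 2.0)],
    [("a", 3.0), ("c", 1.0)],
]
weights = {"a": 1.0, "b": 2.0, "c": 3.0}

forward = pattern_utility(db, ["a", "c"], weights)
backward = pattern_utility(db, ["b", "a"], weights)
```

Under these assumptions, a pattern would count as high utility when its summed value reaches a user-chosen minimum utility threshold. This brute-force scan revisits the database once per pattern, which is the cost that a tree-based approach such as the one described above aims to avoid.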

Publication date: 2011-02
