A Disk-Based Mining Algorithm for Frequent Pattern Discovery from Big Data in Distributed Computing Environments

Authors: Lin Kawuu W*; Chung Sheng Hao; Hsiao Chun Yuan; Lin Chun Cheng; Chen Pei Ling
Source: Journal of Internet Technology, 2016, 17(6): 1259-1268.
DOI: 10.6138/JIT.2016.17.6.20150603c

Abstract

In distributed computing environments, mining frequent patterns across multiple computing nodes can greatly improve mining efficiency. However, memory limitations may interrupt the kernel and the computing nodes when they recursively build frequent pattern (FP) trees or run the FP-growth algorithm. In this paper, we propose disk-based FP-tree generation and node-based clustering mechanisms to solve the insufficient-memory problem. Results from empirical evaluations show that the proposed method delivers excellent scalability.

  • Publication date: 2016-11
