Abstract

Mining sequential patterns (MSP) is an important task in knowledge discovery and data mining (KDD). As in most KDD tasks, MSP involves many iterations of generating, adjusting, and comparing data. This paper presents an empirical study of deploying MSP in a grid computing environment and demonstrates the effectiveness and performance improvements gained from this deployment. GSP, a typical MSP method, is the mining algorithm investigated. A grid computing environment is designed and implemented in which all GSP functions are organized as loosely coupled web services. MSP is achieved through the cooperation of these web services using a divide-and-conquer strategy. Several monitoring mechanisms are developed to help manage the MSP process. The experimental results show that the proposed grid computing environment provides a flexible and efficient platform for MSP.

  • Publication date: 2012-4