MR-CLOPE: A MapReduce based transactional clustering algorithm for DNS query log analysis

Li Ye-feng; Le Jia-jin; Wang Mei; Zhang Bin; Liu Liang-xu*

doi:10.1007/s11771-015-2888-9

Abstract

DNS (domain name system) query log analysis has been a popular research topic in recent years. CLOPE, a representative transactional clustering algorithm, can be readily used for DNS query log mining. However, the algorithm is inefficient when processing large-scale data. The MR-CLOPE algorithm is proposed as an extension and improvement of CLOPE based on MapReduce. Unlike previous parallel clustering methods, it uses a two-stage MapReduce implementation framework, in which each stage is implemented by one kind of MapReduce task. In the first stage, the DNS query logs are divided into multiple splits and the CLOPE algorithm is executed on each split. The second stage usually iterates many times to merge the small clusters into bigger, satisfactory ones. In these two stages, a novel partition process is designed to randomly spread out the original sub-clusters, which are then moved and merged in the map phase of the second stage according to the defined merge criteria. In this way, the advantages of the original CLOPE algorithm are kept and its disadvantages are addressed in the proposed framework, achieving better clustering performance. The experimental results show that, compared with CLOPE, MR-CLOPE is not only faster but also yields better clustering quality on DNS query logs.
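The two-stage idea described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation. It assumes the standard CLOPE profit criterion (cluster size S, width W, transaction count N, repulsion r), runs only the first allocation pass of CLOPE on each split, and uses plain Python functions in place of MapReduce tasks. The abstract does not state the merge criteria, so the rule used here (merge two sub-clusters when that raises the profit numerator) is a stand-in, and all names (`Cluster`, `clope_one_pass`, `merge_clusters`, `two_stage`) are hypothetical.

```python
from collections import Counter


class Cluster:
    """Summary of one cluster: item histogram plus transaction count."""

    def __init__(self):
        self.occ = Counter()  # item -> number of occurrences
        self.n = 0            # N: number of transactions

    @property
    def size(self):           # S: total item occurrences
        return sum(self.occ.values())

    @property
    def width(self):          # W: number of distinct items
        return len(self.occ)

    def add(self, txn):
        self.occ.update(txn)
        self.n += 1

    def merged(self, other):
        c = Cluster()
        c.occ = self.occ + other.occ
        c.n = self.n + other.n
        return c


def gain(c, r):
    """Contribution of c to the CLOPE profit numerator: S * N / W^r."""
    return c.size * c.n / c.width ** r if c.n else 0.0


def delta_add(c, txn, r):
    """Change in the profit numerator if txn (a non-empty set) joins c."""
    s_new = c.size + len(txn)
    w_new = c.width + sum(1 for item in txn if item not in c.occ)
    return s_new * (c.n + 1) / w_new ** r - gain(c, r)


def clope_one_pass(txns, r):
    """Stage 1, per split: put each transaction where the gain is largest."""
    clusters = []
    for txn in txns:
        txn = set(txn)
        empty = Cluster()
        best, best_delta = empty, delta_add(empty, txn, r)
        for c in clusters:
            d = delta_add(c, txn, r)
            if d > best_delta:
                best, best_delta = c, d
        if best is empty:
            clusters.append(empty)
        best.add(txn)
    return clusters


def merge_clusters(clusters, r):
    """Stage 2 stand-in (assumed criterion): merge while the numerator grows."""
    clusters = list(clusters)
    changed = True
    while changed:
        changed = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                m = clusters[i].merged(clusters[j])
                if gain(m, r) > gain(clusters[i], r) + gain(clusters[j], r):
                    clusters[i] = m
                    del clusters[j]
                    changed = True
                    break
            if changed:
                break
    return clusters


def two_stage(txns, n_splits, r):
    """Split the input, cluster each split, then merge the sub-clusters."""
    splits = [txns[i::n_splits] for i in range(n_splits)]
    sub = [c for s in splits for c in clope_one_pass(s, r)]
    return merge_clusters(sub, r)
```

Because the total number of transactions does not change when clusters are merged, comparing profit numerators is equivalent to comparing profit values. For example, `two_stage(txns, n_splits=2, r=2.0)` on a small list of item sets returns the merged `Cluster` summaries; the random partition step mentioned in the abstract is omitted here.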

Publication date: 2015-9
Affiliations: Donghua University; Ningbo University of Technology

