Data summarization for network traffic monitoring

作者:Hoplaros Demetris*; Tari Zahir; Khalil Ibrahim
来源:Journal of Network and Computer Applications, 2014, 37: 194-205.
DOI:10.1016/j.jnca.2013.02.021

摘要

Network traffic monitoring is a very difficult task, given the amount of network traffic generated even in small networks. One approach to facilitate this task is network traffic summarization. Data summarization is a key concept in data mining. However, no current measures exist in order to facilitate the evaluation of summaries. This paper presents four metrics which can be used to characterize data summarization results. Conciseness and Information Loss have already been defined, but we modified Information Loss, due to the fact that it was biased towards recurring attributes across individual summaries. We also propose two additional metrics, Interestingness and Intelligibility. Using the proposed metrics, we evaluated existing summarization techniques on well known network traffic datasets. We also proposed a summarization technique, based on an existing one but incorporating the proposed metrics as objective function. In order to further demonstrate the usability of the metrics, we performed classification, on summarized datasets, showing that the metrics can be used to facilitate the selection of summaries for performing data mining. Using the summarized datasets with a reasonable conciseness, we were able to achieve similar results in terms of accuracy, but at a fraction of the running time, proportional to the conciseness of the summarized dataset.

  • 出版日期2014-1