Abstract

This article proposes a high-accuracy, machine learning-based algorithm called the "APPlication Round method (APPR)" for identifying network application traffic at an early stage of each flow. For each TCP/UDP flow, the method determines discriminators that are available at the early stage, so as to support accurate, real-time traffic classification. These discriminators characterize the possible negotiation behaviors of each flow from an application-layer perspective. The discriminating ability of these flow attributes is tested using several machine learning algorithms. This study also compares the accuracy of the proposed method with the accuracy reported for other machine learning-based methods that address real-time application traffic classification, using identical sample traffic sets. When a pruned C4.5 decision tree algorithm is applied to real traffic traces, the proposed method achieves a maximum accuracy of 99.21% and an average overall accuracy of 92.88% across all traffic samples. Compared with other machine learning algorithms, the proposed algorithm improves overall accuracy by at least approximately 7-8% on normal ratio data sets and by more than 15-30% on fixed ratio data samples. It is also suitable for on-line identification because of its short flow test duration. Furthermore, the proposed method is appropriate for identifying encrypted protocols, because it classifies encryption-based protocols with high accuracy and supports real-time classification.

Full text