A multi-pattern hash-binary hybrid algorithm for URL matching in the HTTP protocol

作者:Zeng, Ping; Tan, Qingping*; Meng, Xiankai; Shao, Zeming; Xie, Qinzheng; Yan, Ying; Cao, Wei; Xu, Jianjun
来源:PLos One, 2017, 12(4): e0175500.
DOI:10.1371/journal.pone.0175500

摘要

In this paper, based on our previous multi-pattern uniform resource locator (URL) binary-matching algorithm called HEM, we propose an improved multi-pattern matching algorithm called MH that is based on hash tables and binary tables. The MH algorithm can be applied to the fields of network security, data analysis, load balancing, cloud robotic communications, and so on-all of which require string matching from a fixed starting position. Our approach effectively solves the performance problems of the classical multi-pattern matching algorithms. This paper explores ways to improve string matching performance under the HTTP protocol by using a hash method combined with a binary method that transforms the symbol-space matching problem into a digital-space numerical-size comparison and hashing problem. The MH approach has a fast matching speed, requires little memory, performs better than both the classical algorithms and HEM for matching fields in an HTTP stream, and it has great promise for use in real-world applications.