Abstract

Filtering large-scale sets of hazardous URLs is fundamental to many network security applications. Classical multiple string matching algorithms perform poorly on large-scale URL sets because they consume too much CPU time and memory. We propose SOGOPT, a multiple string matching algorithm for large-scale URL filtering. Exploiting the characteristics of URLs, the algorithm applies two strategies to speed up the classical SOG algorithm: optimal window selection, and pattern set partitioning and reduction. SOGOPT greatly improves on the search speed of SOG, especially for large-scale URL sets, and is well suited to online filtering against large pattern sets (up to 1 million URLs).
