ALACA: A platform for dynamic alarm collection and alert notification in network management systems

Authors: Solmaz Selcuk Emre*; Gedik Bugra; Ferhatosmanoglu Hakan; Sozuer Selcuk; Zeydan Engin; Etemoglu Cagri Ozgenc
Source: International Journal of Network Management, 2017, 27(4): UNSP e1980.
DOI: 10.1002/nem.1980

Abstract

Mobile network operators run Operations Support Systems that produce vast amounts of alarm events. These events can differ in significance level and domain, and they can also trigger other events. Network operators face the challenge of identifying the significance and root causes of these system problems in real time and of keeping the number of remedial actions at an optimal level, so that customer satisfaction rates can be guaranteed at a reasonable cost. In this paper, we propose a scalable streaming alarm management system, referred to as Alarm Collector and Analyzer (ALACA), that includes complex event processing and root cause analysis. We describe a rule mining and root cause analysis solution for alarm event correlation and analysis. The solution includes a dynamic index for matching active alarms, an algorithm for generating candidate alarm rules, a sliding window-based approach to save system resources, and a graph-based solution to identify root causes. Alarm Collector and Analyzer is used in the network operation center of a major mobile telecom provider. It helps operators to enhance the design of their alarm management systems by allowing continuous analysis of data and event streams, and to predict network behavior with respect to potential failures by using the results of root cause analysis. We present experimental results that provide insight into the performance of real-time alarm data analytics systems.
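
The abstract names a sliding window, candidate alarm rules, and a graph-based root cause step without giving details. The following is a minimal illustrative sketch of how such pieces could fit together; it is not the authors' algorithm. The function names `candidate_rules` and `root_causes`, the `(timestamp, alarm_type)` event shape, and the confidence measure (the fraction of occurrences of alarm `a` that are followed by alarm `b` within the window) are all assumptions made for illustration.

```python
from collections import Counter, deque


def candidate_rules(events, window, min_conf):
    """Mine candidate rules a -> b from time-sorted (timestamp, alarm_type) events.

    Hypothetical helper: a rule a -> b is kept when the share of a-occurrences
    followed by b within `window` time units is at least `min_conf`.
    """
    recent = deque()         # (timestamp, type, followers already counted)
    occurrences = Counter()  # number of times each alarm type occurred
    followed = Counter()     # number of a-occurrences followed by b in the window
    for ts, kind in events:
        # Evict alarms that have fallen out of the sliding window.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        # Credit each earlier alarm at most once per follower type.
        for _, prev, seen in recent:
            if prev != kind and kind not in seen:
                seen.add(kind)
                followed[(prev, kind)] += 1
        occurrences[kind] += 1
        recent.append((ts, kind, set()))
    rules = {}
    for (a, b), n in followed.items():
        conf = n / occurrences[a]
        if conf >= min_conf:
            rules[(a, b)] = conf
    return rules


def root_causes(rules):
    """Treat rules as directed edges; roots are causes that are never effects."""
    causes = {a for a, _ in rules}
    effects = {b for _, b in rules}
    return causes - effects
```

Only the alarms inside the window are held in memory, which is one plausible reading of how a sliding window saves system resources. Alarm types that appear as antecedents but never as consequents are reported as root cause candidates.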

  • Publication date: 2017-8