摘要

Predicting system failures can be of great benefit to managers that get a better command over system performance. Data that systems generate in the form of logs is a valuable source of information to predict system reliability. As such, there is an increasing demand of tools to mine logs and provide accurate predictions. However, interpreting information in logs poses some challenges. This study discusses how to effectively mining sequences of logs and provide correct predictions. The approach integrates different machine learning techniques to control for data brittleness, provide accuracy of model selection and validation, and increase robustness of classification results. We apply the proposed approach to log sequences of 25 different applications of a software system for telemetry and performance of cars. On this system, we discuss the ability of three well-known support vector machines - multilayer perceptron, radial basis function and linear kernels - to fit and predict defective log sequences. Our results show that a good analysis strategy provides stable, accurate predictions. Such strategy must at least require high fitting ability of models used for prediction. We demonstrate that such models give excellent predictions both on individual applications - e.g., 1 % false positive rate, 94 % true positive rate, and 95 % precision - and across system applications - on average, 9 % false positive rate, 78 % true positive rate, and 95 % precision. We also show that these results are similarly achieved for different degree of sequence defectiveness. To show how good are our results, we compare them with recent studies in system log analysis. We finally provide some recommendations that we draw reflecting on our study.

  • 出版日期2015-8