摘要

Predicting grid performance is a complex task because heterogeneous resource nodes are involved in a distributed environment. Long execution workload on a grid is even harder to predict due to heavy load fluctuations. In this paper, we use Kalman filter to minimize the prediction errors. We apply Savitzky-Golay filter to train a sequence of confidence windows. The purpose is to smooth the prediction process from being disturbed by load fluctuations. We present a new adaptive hybrid method (AHModel) for load prediction guided by trained confidence windows. We test the effectiveness of this new prediction scheme with real-life workload traces on the AuverGrid and Grid5000 in France. Both theoretical and experimental results are reported in this paper. As the lookahead span increases from 10 to 50 steps (5 minutes per step), the AHModel predicts the grid workload with a mean-square error (MSE) of 0.04-0.73 percent, compared with 2.54-30.2 percent in using the static point value autoregression (AR) prediction method. The significant gain in prediction accuracy makes the new model very attractive to predict Grid performance. The model was proved especially effective to predict large workload that demands very long execution time, such as exceeding 4 hours on the Grid5000 over 5,000 processors. With minor changes of some system parameters, the AHModel can apply to other computational grids as well. At the end, we discuss extended research issues and tool development for Grid performance prediction.