Abstract

Current technology scaling trends are making digital designs increasingly susceptible to space radiation effects, which can cause unwanted single-event upsets (SEUs) in any state element. This paper presents a new system-level model of SEU propagation through processors as a continuous-time Markov chain (CTMC). Probabilistic formal techniques, such as probabilistic model checking, are used to exhaustively estimate the impact of SEUs on system behavior. The proposed CTMC model was analyzed for different SEU injection scenarios and different bit-flip rates. Results demonstrate that the proposed approach can provide an accurate estimation of reliability metrics such as mean time to failure, mean time to recover, and the probability of failure for each SEU injection scenario in the system's subcomponents. The proposed probabilistic system-level analysis was also used to investigate the optimal self-repair rate the system requires to reach a desired level of reliability. Results demonstrate that, compared with existing simulation techniques for fault impact evaluation, the presented approach provides consistent results while being orders of magnitude faster in terms of CPU time.
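
To make the reliability metrics concrete, the following is a minimal sketch of how mean time to failure and a required self-repair rate can be derived from a CTMC. The three-state chain (OK, UPSET, FAILED), the function names, and the rate parameters are all illustrative assumptions and are not the model analyzed in the paper, which covers processor subcomponents and is evaluated with probabilistic model checking.

```python
def mttf(seu_rate, repair_rate, fail_rate):
    """Mean time to reach FAILED, starting from OK, in a toy CTMC.

    Hypothetical chain (an assumption, not the paper's model):
      OK    --seu_rate-->    UPSET
      UPSET --repair_rate--> OK
      UPSET --fail_rate-->   FAILED (absorbing)
    """
    # First-step analysis on expected absorption times:
    #   t_ok    = 1/seu_rate + t_upset
    #   t_upset = 1/(repair_rate + fail_rate)
    #             + repair_rate/(repair_rate + fail_rate) * t_ok
    # Solving the two equations for t_ok gives the closed form below.
    return (seu_rate + repair_rate + fail_rate) / (seu_rate * fail_rate)


def min_repair_rate(seu_rate, fail_rate, target_mttf):
    """Smallest self-repair rate that reaches target_mttf in the toy chain."""
    # Invert the closed form above for repair_rate; clamp at zero when the
    # target is already met without any repair.
    needed = target_mttf * seu_rate * fail_rate - seu_rate - fail_rate
    return max(0.0, needed)
```

In this toy chain the mean time to failure grows linearly with the repair rate, so the required repair rate follows from a direct inversion. A model with more states would generally need a linear solve over the generator matrix, or a model checker, instead of a closed form.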

  • Publication date: 2017-9