Abstract

Multicore systems-on-chip (SoCs) rely on runtime thermal monitoring with on-chip thermal sensors for dynamic thermal management (DTM). However, on-chip sensors are highly susceptible to noise caused by fabrication randomness, fluctuations, and similar effects. This noise creates a discrepancy between the actual temperature and the temperature observed by the thermal sensor. In this paper, we address the problem of accurately estimating the temperature at an on-chip thermal sensor when the sensor reading has been corrupted by noise. We present statistical techniques for three cases: 1) when the underlying randomness exhibits jointly Gaussian characteristics, we present the optimal solution for temperature estimation; 2) for close-to-Gaussian cases, we give a heuristic based on moment matching; 3) when the underlying randomness is non-Gaussian, a hypothesis testing framework is used to predict the sensor temperatures. All three techniques are investigated in both single-sensor and multisensor scenarios. The multisensor scenario estimates the actual temperatures of several sensors simultaneously while exploiting the correlations in temperature and circuit parameters among the sensors. Experiments show that our estimation schemes reduce the root mean square (RMS) error by 71.5%, with very small runtime overhead, compared to blindly trusting the sensors to be noise-free.
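As a rough illustration of the jointly Gaussian, single-sensor case, the following Python sketch assumes a simple additive model that the abstract does not spell out: the reading equals the true temperature plus independent zero-mean Gaussian noise, and the true temperature has a Gaussian prior. Under that assumption the minimum mean square error estimate is the posterior mean. The function names and all numeric values (prior mean, standard deviations, sample count) are hypothetical choices for the example and are not taken from the paper.

```python
import random


def mmse_estimate(reading, prior_mean, prior_var, noise_var):
    """Posterior mean of T given reading = T + noise.

    Assumed model (illustrative, not from the paper):
    T ~ N(prior_mean, prior_var), noise ~ N(0, noise_var), independent.
    """
    gain = prior_var / (prior_var + noise_var)  # weight given to the reading
    return prior_mean + gain * (reading - prior_mean)


def rms(errors):
    """Root mean square of a list of errors."""
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


def simulate(n=20000, prior_mean=70.0, prior_std=5.0, noise_std=3.0, seed=0):
    """Compare the RMS error of raw readings with that of the MMSE estimate."""
    rng = random.Random(seed)
    raw_err, est_err = [], []
    for _ in range(n):
        true_t = rng.gauss(prior_mean, prior_std)        # actual temperature
        reading = true_t + rng.gauss(0.0, noise_std)     # noisy sensor output
        est = mmse_estimate(reading, prior_mean, prior_std ** 2, noise_std ** 2)
        raw_err.append(reading - true_t)
        est_err.append(est - true_t)
    return rms(raw_err), rms(est_err)


if __name__ == "__main__":
    raw_rms, est_rms = simulate()
    print(f"RMS error, raw reading:   {raw_rms:.3f}")
    print(f"RMS error, MMSE estimate: {est_rms:.3f}")
```

In this toy model the estimate shrinks each reading toward the prior mean, and the shrinkage grows as the noise variance grows relative to the prior variance. The multisensor scenario described above would replace the scalar gain with a matrix built from the covariances among sensors, which this sketch does not attempt.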

  • Publication date: 2011-9
