Abstract

Evaluating teachers' classroom performance is an important application of educational testing and psychometrics. School districts are using gain scores, such as the average difference between pre- and posttest scores, to characterize teacher performance. Consequently, additional research is needed to understand the factors that affect the reliability of gain scores, so that teaching evaluations are themselves reliable. This article examines a class of linear gain scores (LGS) using a modified common factor model to understand how latent variable characteristics affect the reliability of observed gain scores. The analytic results derive an upper bound for the reliability of a class of LGS and compare simple difference scores with residualized gain scores. The results suggest that simple difference scores tend to be more reliable than residualized gain scores whenever strong invariance holds, that is, when latent intercepts and loadings are equal across classrooms. However, residualized gain scores come closer to the optimal reliability when classrooms differ in latent measurement intercepts and loadings. In addition, in contrast with previous conjectures, the results imply that student tracking artificially inflates LGS reliability. The results in this article serve as a guide for researchers who develop and refine methods for measuring student learning gains and evaluating teachers.
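
For readers unfamiliar with the two gain scores being compared, the sketch below may help. It is not the article's model: the abstract gives no formulas, so the code assumes the conventional definitions, where a simple difference score is posttest minus pretest and a residualized gain score is the residual from an ordinary least-squares regression of posttest on pretest. The function names and the four-student data set are hypothetical and exist only for illustration.

```python
from statistics import mean


def simple_difference(pre, post):
    """Simple difference score: posttest minus pretest, per student."""
    return [y - x for x, y in zip(pre, post)]


def residualized_gain(pre, post):
    """Residualized gain: residual of post regressed on pre (OLS, one predictor)."""
    mx, my = mean(pre), mean(post)
    sxx = sum((x - mx) ** 2 for x in pre)
    sxy = sum((x - mx) * (y - my) for x, y in zip(pre, post))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(pre, post)]


# Hypothetical scores for one classroom (illustrative values only).
pre = [10.0, 12.0, 14.0, 16.0]
post = [13.0, 14.0, 18.0, 19.0]

diff = simple_difference(pre, post)
resid = residualized_gain(pre, post)

# A classroom-level summary of the kind the abstract mentions: the average gain.
average_gain = mean(diff)
```

Both scores are linear in the observed pre- and posttest scores, which is what places them in the same family. Under these conventional definitions the difference score fixes the pretest weight at one, while the residualized score estimates that weight from the data, so its residuals are uncorrelated with the pretest by construction.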

  • Publication date: 2014-10