An Examination of Power and Type I Errors for Two Differential Item Functioning Indices Using the Graded Response Model

Clark Patrick C*; LaHuis David M

doi:10.1177/1094428111403815

Abstract

This study examined two methods for detecting differential item functioning (DIF): Raju, van der Linden, and Fleer's (1995) differential functioning of items and tests (DFIT) procedure and Thissen, Steinberg, and Wainer's (1988) likelihood ratio test (LRT). The major research questions concerned which test provides the best balance of Type I errors and power and whether the tests differ in detecting different types of DIF. Monte Carlo simulations were conducted to address these questions. Equal and unequal sample size conditions were fully crossed with test lengths of 10 and 20 items. In addition, alpha and beta parameters were manipulated to simulate DIF. Findings indicate that DFIT and LRT both had acceptable Type I error rates when sample sizes were equal but that DFIT produced too many Type I errors when sample sizes were unequal. Overall, the LRT exhibited greater power to detect both alpha and beta parameter DIF than did DFIT. However, DFIT was more powerful than LRT when the last two beta parameters had DIF as opposed to when the extreme beta parameters had DIF.

Publication date: 2012-04


