Model Selection Criteria for Missing-Data Problems Using the EM Algorithm

Ibrahim, Joseph G*; Zhu, Hongtu; Tang, Niansheng

doi:10.1198/016214508000001057

Abstract

We consider novel methods for the computation of model selection criteria in missing-data problems based on the output of the EM algorithm. The methodology is very general and can be applied to numerous situations involving incomplete data within an EM framework, from covariates missing at random in arbitrary regression models to nonignorably missing longitudinal responses and/or covariates. Toward this goal, we develop a class of information criteria for missing-data problems, called $\mathrm{IC}_{H,Q}$, which yields the Akaike information criterion and the Bayesian information criterion as special cases. The computation of $\mathrm{IC}_{H,Q}$ requires an analytic approximation to a complicated function, called the H-function, along with output from the EM algorithm used in obtaining maximum likelihood estimates. The approximation to the H-function leads to a large class of information criteria, called $\mathrm{IC}_{\tilde{H}(k),Q}$. Theoretical properties of $\mathrm{IC}_{\tilde{H}(k),Q}$, including consistency, are investigated in detail. To eliminate the analytic approximation to the H-function, a computationally simpler approximation to $\mathrm{IC}_{H,Q}$, called $\mathrm{IC}_{Q}$, is proposed, the computation of which depends solely on the Q-function of the EM algorithm. Advantages and disadvantages of $\mathrm{IC}_{\tilde{H}(k),Q}$ and $\mathrm{IC}_{Q}$ are discussed and examined in detail in the context of missing-data problems. Extensive simulations are given to demonstrate the methodology and examine the small-sample and large-sample performance of $\mathrm{IC}_{\tilde{H}(k),Q}$ and $\mathrm{IC}_{Q}$ in missing-data problems. An AIDS data set is also presented to illustrate the proposed methodology.
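
As a rough guide to how the Q- and H-functions enter such criteria, the sketch below uses the standard EM decomposition of the observed-data log-likelihood. The symbols $D_{\mathrm{obs}}$, $D_{\mathrm{mis}}$, $D_{\mathrm{com}}$, $\hat\theta$, $d$, $n$, and the penalty $\hat{c}_n$ are notational assumptions made here for illustration; the abstract does not define them, and the paper's exact definitions may differ.

```latex
% Standard EM decomposition (assumed notation):
%   Q(\theta \mid \theta') = E[\log f(D_{com} \mid \theta) \mid D_{obs}, \theta']
%   H(\theta \mid \theta') = E[\log f(D_{mis} \mid D_{obs}, \theta) \mid D_{obs}, \theta']
\begin{align*}
\log f(D_{\mathrm{obs}} \mid \theta)
  &= Q(\theta \mid \theta') - H(\theta \mid \theta'), \\
% A generic penalized criterion evaluated at the MLE \hat\theta then splits as
-2 \log f(D_{\mathrm{obs}} \mid \hat\theta) + \hat{c}_n(\hat\theta)
  &= -2\, Q(\hat\theta \mid \hat\theta) + 2\, H(\hat\theta \mid \hat\theta)
     + \hat{c}_n(\hat\theta).
\end{align*}
% Familiar special cases of the penalty, with d free parameters and sample size n:
%   AIC: \hat{c}_n = 2d,   BIC: \hat{c}_n = d \log n.
```

Under this reading, a criterion that depends solely on the Q-function corresponds to dropping the $2H(\hat\theta \mid \hat\theta)$ term, which is the part that typically lacks a closed form.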

Publication date: 2008-12
Institution: Yunnan University
