Abstract

We consider a finite mixture of Gaussian regression models for high-dimensional heterogeneous data where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by an ℓ1-penalized maximum likelihood estimator. We provide an ℓ1-oracle inequality satisfied by this Lasso estimator with respect to the Kullback-Leibler loss. In particular, we give a condition on the regularization parameter of the Lasso under which such an oracle inequality holds. Our aim is twofold: to extend the ℓ1-oracle inequality established by Massart and Meynet [12] in the homogeneous Gaussian linear regression case, and to present a complementary result to Städler et al. [18], by studying the Lasso for its ℓ1-regularization properties rather than considering it as a variable selection procedure. Our oracle inequality is deduced from a finite mixture Gaussian regression model selection theorem for ℓ1-penalized maximum likelihood conditional density estimation, which is inspired by Vapnik's method of structural risk minimization [23] and by the theory of model selection for maximum likelihood estimators developed by Massart in [11].
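
To make the estimator concrete, the following is a minimal sketch of the criterion that an ℓ1-penalized maximum likelihood estimator for a finite mixture of Gaussian regressions would minimize. It is illustrative only and is not taken from the paper: the function name `penalized_neg_loglik`, the averaging over the sample size, and the choice to penalize only the regression coefficients (not the mixing proportions or variances) are assumptions made here.

```python
import numpy as np


def penalized_neg_loglik(X, y, pi, beta, sigma, lam):
    """l1-penalized negative conditional log-likelihood (hypothetical helper).

    X     : (n, p) covariates
    y     : (n,)   responses
    pi    : (K,)   mixing proportions, positive and summing to 1
    beta  : (K, p) regression coefficients, one row per component
    sigma : (K,)   component standard deviations, positive
    lam   : float  regularization parameter of the Lasso

    Assumption: the penalty is lam times the l1-norm of all entries of beta.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    pi = np.asarray(pi, dtype=float)
    beta = np.asarray(beta, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    means = X @ beta.T                      # (n, K) component means
    resid = y[:, None] - means              # (n, K) residuals
    # Log of pi_k * N(y_i; x_i' beta_k, sigma_k^2) for every i, k.
    log_comp = (
        np.log(pi)
        - 0.5 * np.log(2.0 * np.pi * sigma**2)
        - 0.5 * (resid / sigma) ** 2
    )
    # Numerically stable log-sum-exp over the K components.
    m = log_comp.max(axis=1, keepdims=True)
    log_dens = m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1))

    return -log_dens.mean() + lam * np.abs(beta).sum()
```

With a single component (K = 1) the criterion reduces to a Gaussian linear regression log-likelihood plus the usual Lasso penalty, which corresponds to the homogeneous case mentioned above. Minimizing the criterion over the parameters is a separate, non-convex problem that this sketch does not address.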

  • Publication date: 2013-1