Abstract

Applying traditional machine learning algorithms to click data is challenging because of severely sparse and transient ID features, which bloat data size and considerably prolong training time. In addition, data size requirements and the overhead of training and publishing make minute-level incremental model updates a bottleneck. We propose a novel real-time click-through rate (CTR) prediction model based on empirical CTRs with a set of pre-learned priors, from which a Minimum Variance Unbiased Estimator is constructed as the CTR prediction. The empirical CTRs are computed at the sparsest and finest ID levels, which can be strong indicators but are generally unsuited as machine learning features. Experiments on real-life click data show that our prior-based real-time estimator, combined with a traditional machine learning model, yields significant improvements in both prediction accuracy and ranking capability, especially on the most recent data, which falls outside the window in which the machine learning model remains timely.
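
As an illustration of the general idea only, the sketch below combines a pre-learned prior CTR with an empirical CTR by inverse-variance weighting, which is the minimum-variance unbiased linear combination of independent unbiased estimates. The abstract does not specify the estimator's exact form, so the function names, the binomial variance model, and the example numbers are all assumptions made for this sketch, not the method's actual definition.

```python
from typing import Sequence, Tuple


def binomial_variance(rate: float, impressions: int) -> float:
    """Variance of an empirical CTR (clicks / impressions) under a binomial model.

    Hypothetical helper: the binomial model is an assumption of this sketch.
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return rate * (1.0 - rate) / impressions


def combine_min_variance(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Inverse-variance weighted average of independent unbiased estimates.

    Returns the combined estimate and its variance.
    """
    if len(estimates) != len(variances) or len(estimates) == 0:
        raise ValueError("estimates and variances must be non-empty and equal in length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total
    return combined, 1.0 / total


# Illustrative numbers only (not taken from the experiments).
prior_mean, prior_var = 0.02, 1e-4           # assumed pre-learned prior
clicks, impressions = 3, 100                 # assumed counts for one fine-grained ID
empirical = clicks / impressions
emp_var = binomial_variance(prior_mean, impressions)  # variance evaluated at the prior rate

ctr, var = combine_min_variance([prior_mean, empirical], [prior_var, emp_var])
print(f"combined CTR estimate: {ctr:.5f}, variance: {var:.3e}")
```

With few impressions the empirical variance is large and the combined estimate stays close to the prior; as impressions accumulate, the weight shifts toward the empirical CTR, which is what makes fine-grained ID-level statistics usable in real time.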
