Abstract

Click-through rate (CTR) prediction plays a central role in online advertising. It is a binary classification problem with imbalanced data. Many existing approaches to imbalanced learning focus only on over-sampling and under-sampling, and these methods ignore some vital information in the original data. In this paper, we first propose a weighted output extreme learning machine (WO-ELM) for learning from imbalanced data. We then propose a hierarchical extreme learning machine (H-C-ELM) built on the WO-ELM and the weighted extreme learning machine (W-ELM). The H-C-ELM has two levels. In the first level, the WO-ELM and the W-ELM are trained on different combined fields of the CTR data (each field contains several attributes), and each of the two extreme learning machines (ELMs) outputs predicted scores for its corresponding combined fields. Because the two ELMs differ, they produce different predictions on the same combined fields. Therefore, in the second level, another ELM is trained on the outputs of the two first-level ELMs and the actual outputs in order to improve prediction accuracy. Experimental results show that, for binary classification with imbalanced data in CTR prediction, the proposed H-C-ELM outperforms related algorithms, including the WO-ELM, the W-ELM, and the stacked autoencoder-logistic regression.
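The two-level structure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the WO-ELM formulation, so both first-level models here are class-weighted ELMs with weights inversely proportional to class size, and the second of them only stands in for the WO-ELM. The names `fit_weighted_elm`, `fit_two_level`, `X_a`, and `X_b`, the sigmoid activation, the hidden-layer sizes, and the regularization constant `C` are all assumptions made for illustration.

```python
import numpy as np

def elm_hidden(X, W_in, b):
    """Random-feature hidden layer with a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))

def fit_weighted_elm(X, y, n_hidden=32, C=1.0, seed=0):
    """Fit a class-weighted ELM and return a scoring function."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))  # fixed random input weights
    b = rng.normal(size=n_hidden)                   # fixed random biases
    H = elm_hidden(X, W_in, b)
    # Assumed weighting scheme: each sample is weighted by 1 / (size of its class).
    counts = np.bincount(y, minlength=2)
    w = 1.0 / counts[y]
    t = 2.0 * y - 1.0                               # targets in {-1, +1}
    HtW = H.T * w                                   # H^T diag(w)
    # Regularized weighted least squares for the output weights.
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ t)
    return lambda X_new: elm_hidden(X_new, W_in, b) @ beta

def fit_two_level(X_a, X_b, y, seed=0):
    """Level 1: one ELM per field group. Level 2: an ELM on their scores."""
    f_a = fit_weighted_elm(X_a, y, seed=seed)
    f_b = fit_weighted_elm(X_b, y, seed=seed + 1)   # stand-in for the WO-ELM

    def level_one(Xa, Xb):
        return np.column_stack([f_a(Xa), f_b(Xb)])

    # Assumption: the second-level ELM is also class-weighted.
    f_top = fit_weighted_elm(level_one(X_a, X_b), y, n_hidden=8, seed=seed + 2)
    return lambda Xa, Xb: f_top(level_one(Xa, Xb))

# Synthetic imbalanced data: 20 positives out of 200 samples.
rng = np.random.default_rng(1)
y = np.zeros(200, dtype=int)
y[:20] = 1
X = rng.normal(size=(200, 6))
X[:20] += 2.0                                       # shift the minority class
predict = fit_two_level(X[:, :3], X[:, 3:], y)
scores = predict(X[:, :3], X[:, 3:])
```

Here the two column slices of `X` play the role of two combined fields, and a higher score indicates a more likely click. A real evaluation would score held-out data and, given the class imbalance, use a metric such as AUC rather than plain accuracy.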