Adding Value to Daily-Deals Recommendation: Multi-Armed Bandits to Match Customers and Deals

Authors: Anisio Lacerda*; Adriano Veloso; Nivio Ziviani
Source: 2015 Brazilian Conference on Intelligent Systems (BRACIS 2015), 2015-11-04 to 2015-11-07.
DOI:10.1109/BRACIS.2015.66

Abstract

The typical marketing strategy of online group-buying sites such as Groupon and LivingSocial is to send e-mails to customers alerting them of products and services with big discounts for a limited time. Since these alerts are sent on a daily basis, customers are likely to become bored by the constant barrage of e-mails offering discounts on all sorts of deals. Intuitively, a more effective strategy should take into account several criteria (e.g., the last time the customer purchased a deal, the customer's behavior/segment, or the customer's taste) to avoid flooding user inboxes with unnecessary e-mails on deals that are unlikely to be clicked. We model this task as a reinforcement learning problem in which the goal is to accumulate rewards from a payoff distribution with unknown parameters that are learned sequentially. Specifically, we employ multi-armed bandit algorithms to maximize the fraction of opportune e-mails (those that are opened and clicked) by sequentially deciding the best criterion to apply at each time step. A systematic set of experiments using real data obtained from the largest daily-deals website in Brazil shows that we can exploit the trade-off between the number of e-mails sent to customers and the number of clicks received. Our results show that well-known multi-armed bandit algorithms are extremely effective at ranking customers by their likelihood of clicking the e-mail, indicating that we can send only about 60% of the e-mails without observing a relevant decrease (i.e., < 10%) in the number of clicks.

  • Publication Date: 2015
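
The abstract does not specify which of the well-known bandit algorithms were used, so the sketch below is only a minimal illustration of the framing, assuming UCB1 as the policy: each arm stands for a hypothetical e-mail targeting criterion, and the reward is a binary signal (1 if the sent e-mail is opened and clicked, 0 otherwise). The class name, click probabilities, and number of rounds are illustrative assumptions, not values from the paper.

```python
import math
import random

class UCB1:
    """UCB1 policy over a set of arms (here: e-mail targeting criteria)."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms    # times each arm was played
        self.values = [0.0] * n_arms  # empirical mean reward per arm

    def select_arm(self):
        # Play each arm once before applying the UCB rule.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        # Upper confidence bound: empirical mean plus exploration bonus.
        ucb = [
            self.values[arm] + math.sqrt(2.0 * math.log(total) / self.counts[arm])
            for arm in range(len(self.counts))
        ]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, reward):
        self.counts[arm] += 1
        n = self.counts[arm]
        # Incremental update of the empirical mean reward.
        self.values[arm] += (reward - self.values[arm]) / n


# Toy simulation: three hypothetical criteria with unknown click probabilities.
true_click_prob = {0: 0.02, 1: 0.05, 2: 0.08}  # illustrative values only
bandit = UCB1(n_arms=3)
for _ in range(10_000):
    arm = bandit.select_arm()
    reward = 1 if random.random() < true_click_prob[arm] else 0
    bandit.update(arm, reward)
print("estimated click rates:", [round(v, 3) for v in bandit.values])
```

Over time the policy concentrates plays on the criterion with the highest observed click rate while still occasionally exploring the others, which mirrors the sequential learning of an unknown payoff distribution described in the abstract.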