Abstract

This article studies the problem of determining stocking quantities in a periodic-review inventory model when the demand distribution is unknown. Moreover, lost sales are unobservable in the system, so inventory decisions must be made solely on the basis of sales data. Both the non-perishable and perishable inventory problems are addressed. Using an online convex optimization procedure, a non-parametric adaptive algorithm is developed that produces, in each period, an inventory policy that depends on the entire history of stocking decisions and sales observations. With the help of a convex quadratic underestimator of the cost function, it is established that the T-period average expected cost of the inventory policy converges to the optimal newsvendor cost at the rate of O(log T/T) for demands whose expected cost functions satisfy an alpha-exp-concavity property. It is shown that, when the demand distribution is continuous, this property holds if the probability density function is bounded away from zero over the decision set. For other continuous distributions, a "shifted" version of the density function is constructed to show an epsilon-consistency property of the algorithm, so that the gap between the T-period average expected cost of the proposed policy and the optimal newsvendor cost is of the order O(log T/T) + epsilon (for a given small epsilon > 0). Simulation results show that the proposed algorithm performs consistently better than two existing algorithms that are closely related to it.
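To illustrate how a stocking decision can be updated from sales data alone, the following minimal sketch uses plain projected online subgradient descent on the one-period newsvendor cost. It is a deliberately simpler technique than the procedure described in the abstract, not the article's algorithm, and it does not attain the O(log T/T) rate. The function names, the cost parameters, the step-size rule and the uniform demand are all assumptions made for illustration, and inventory carry-over between periods is ignored.

```python
import random


def sales_subgradient(stock, sales, holding_cost, shortage_cost):
    """Subgradient of the one-period newsvendor cost, computed from sales only."""
    # sales == stock signals a stock-out: demand was at least the stock level,
    # so a higher stock would have lowered the shortage cost.
    if sales >= stock:
        return -shortage_cost
    # Otherwise demand was fully observed and leftover units incur holding cost.
    return holding_cost


def run_policy(demands, holding_cost, shortage_cost, lower, upper):
    """Return the stocking level chosen in each period (hypothetical helper)."""
    stock = (lower + upper) / 2.0
    levels = []
    for t, demand in enumerate(demands, start=1):
        levels.append(stock)
        sales = min(demand, stock)  # lost sales are never observed
        grad = sales_subgradient(stock, sales, holding_cost, shortage_cost)
        # Assumed step size: decision-set width over (gradient bound * sqrt(t)).
        step = (upper - lower) / (max(holding_cost, shortage_cost) * t ** 0.5)
        # Gradient step followed by projection onto the decision set.
        stock = min(upper, max(lower, stock - step * grad))
    return levels


if __name__ == "__main__":
    rng = random.Random(0)
    demands = [rng.uniform(0.0, 100.0) for _ in range(5000)]
    levels = run_policy(demands, holding_cost=1.0, shortage_cost=3.0,
                        lower=0.0, upper=100.0)
    # For Uniform(0, 100) demand with these costs, the newsvendor optimum
    # is the 3 / (3 + 1) = 0.75 quantile, i.e. a stock level of 75.
    print(round(levels[-1], 1))
```

The point of the sketch is that the subgradient needs only the indicator of a stock-out, which is visible in the sales record even though the lost demand itself is not.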

  • Publication date: 2015-1-2