Abstract

Free-riding is one of the main challenges in Peer-to-Peer (P2P) streaming systems, because it degrades video streaming quality. An incentive mechanism that stimulates cooperation is therefore essential for maintaining video Quality of Experience (QoE) in such systems. Among existing mechanisms, payment-based schemes are the most suitable for streaming applications because of their low overhead. To date, however, no dynamic payment mechanism has been proposed that accounts for the stochastic dynamics of the video streaming ecosystem (e.g., request arrival, demand submission, and bandwidth availability). In this paper, we propose a dynamic token-based payment mechanism in which each peer earns tokens by admitting other peers' requests and spends tokens to submit its own demands to others. This mechanism allows peers to adjust their income level dynamically as the system state changes. We propose a Constrained Markov Decision Process (CMDP) formulation in which each peer seeks a request admission policy that minimizes the expected cumulative cost of consumed bandwidth while satisfying a long-term constraint on the users' Mean Opinion Score (MOS), which serves as the measure of QoE. The proposed admission policy adapts to the request arrival process, the bandwidth state, and the token bucket length of each peer. To compensate for the lack of design-time knowledge of the system's statistics, each peer is equipped with a model-free algorithm that learns its optimal admission policy through real-time interaction with the system. Simulation results compare the performance of the proposed algorithm against four baseline schemes: random, token-threshold, bandwidth-threshold, and myopic algorithms.
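
For orientation, the per-peer CMDP described above can be sketched in generic form. The notation below is an assumption made for illustration and is not taken from the paper: $s_t$ collects the request arrival, bandwidth state, and token bucket length, $a_t \in \{0, 1\}$ is the reject/admit decision, $c$ is the bandwidth cost, $q$ is the per-step MOS, $Q_{\min}$ is the required MOS level, and a discounted criterion with factor $\gamma$ is assumed (the paper may use a different criterion, such as a long-run average).

```latex
% Generic constrained MDP for a single peer (illustrative notation, assumed).
\begin{aligned}
\min_{\pi} \quad & \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \\
\text{s.t.} \quad & \mathbb{E}^{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, q(s_t, a_t) \right] \ge Q_{\min},
\qquad 0 < \gamma < 1 .
\end{aligned}
```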

  • Publication date: 2018-6