Abstract

We study sequential decision problems in which the decision maker does not observe the states of nature but instead receives a noisy signal whose distribution depends on the current state and on the action she plays. We do not assume that the decision maker considers the worst-case scenario; rather, we assume that she has a response correspondence, which maps distributions over signals to subjective best responses. We extend the concept of an internal regret-free strategy to this setup and provide an algorithm that generates such a strategy.

  • Publication date: 2016-3
  • Affiliation: INSEAD