Paper 2, Section II, J

Optimization and Control | Part II, 2012

Describe the elements of a generic stochastic dynamic programming equation for the problem of maximizing the expected sum of discounted rewards accrued at times $0, 1, \ldots$. What is meant by the positive case? What is specially true in this case that is not true in general?

An investor owns a single asset which he may sell once, on any of the days $t=0,1, \ldots$. On day $t$ he will be offered a price $X_{t}$. This value is unknown until day $t$, is independent of all other offers, and a priori it is uniformly distributed on $[0,1]$. Offers remain open, so that on day $t$ he may sell the asset for the best of the offers made on days $0, \ldots, t$. If he sells for $x$ on day $t$ then the reward is $x \beta^{t}$. Show from first principles that if $0<\beta<1$ then there exists $\bar{x}$ such that the expected reward is maximized by selling the first day the offer is at least $\bar{x}$.

For $\beta=4 / 5$, find both $\bar{x}$ and the expected reward under the optimal policy.

Explain what is special about the case $\beta=1$.
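A numerical check of the $\beta = 4/5$ part can be sketched as follows. This is not part of the question, and it rests on assumptions that are labelled here and in the comments: the state is taken to be the best offer so far, the threshold is assumed to satisfy the indifference equation $\bar{x} = \beta(1+\bar{x}^{2})/2$ (selling for $\bar{x}$ is worth the same as waiting one more day), and "expected reward" is read as the value before the day-0 offer is seen. The function names `threshold`, `expected_reward` and `simulate` are hypothetical.

```python
import math
import random


def threshold(beta: float) -> float:
    """Smaller root of beta*x**2 - 2*x + beta = 0 (assumed indifference equation)."""
    return (1.0 - math.sqrt(1.0 - beta * beta)) / beta


def expected_reward(beta: float) -> float:
    """Value before the day-0 offer is seen, under the threshold policy."""
    x = threshold(beta)
    # First offer below x (probability x): continuing is worth x by indifference.
    # First offer y >= x: sell at once, contributing the integral of y over [x, 1].
    return x * x + (1.0 - x * x) / 2.0


def simulate(beta: float, x_bar: float, runs: int = 100_000, seed: int = 0) -> float:
    """Monte Carlo mean reward when selling on the first day an offer is >= x_bar."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        discount = 1.0  # beta**t for the current day t
        while True:
            offer = rng.random()  # uniform offer on [0, 1)
            if offer >= x_bar:
                total += offer * discount
                break
            discount *= beta
    return total / runs


if __name__ == "__main__":
    beta = 4 / 5
    x_bar = threshold(beta)
    print(x_bar, expected_reward(beta), simulate(beta, x_bar))
```

Under these assumptions the closed-form values for $\beta = 4/5$ come out close to $1/2$ and $5/8$, and the simulated mean should agree with the latter up to sampling error. The simulation only tests the threshold policy itself; it does not replace the first-principles argument that a threshold policy is optimal.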
