Paper 2, Section II, J

Describe the elements of a generic stochastic dynamic programming equation for the problem of maximizing the expected sum of discounted rewards accrued at times $0,1, \ldots$. What is meant by the positive case? What is specially true in this case that is not true in general?
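
As a point of reference, one common way to write such an equation is sketched here. The notation is an assumption and does not appear in the question: $x$ is the state, $u$ the control, $r(x,u)$ the one-step reward, $\beta$ the discount factor, and $x_{1}$ the state at the next time step.

$$F(x)=\sup_{u}\left\{r(x, u)+\beta\, E\left[F\left(x_{1}\right) \mid x_{0}=x,\ u_{0}=u\right]\right\}$$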

An investor owns a single asset which he may sell once, on any of the days $t=0,1, \ldots$. On day $t$ he will be offered a price $X_{t}$. This value is unknown until day $t$, is independent of all other offers, and a priori it is uniformly distributed on $[0,1]$. Offers remain open, so that on day $t$ he may sell the asset for the best of the offers made on days $0, \ldots, t$. If he sells for $x$ on day $t$ then the reward is $x \beta^{t}$. Show from first principles that if $0<\beta<1$ then there exists $\bar{x}$ such that the expected reward is maximized by selling the first day the offer is at least $\bar{x}$.

For $\beta=4/5$, find both $\bar{x}$ and the expected reward under the optimal policy.
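
A hand calculation for this part can be sanity-checked numerically. The sketch below is not a substitute for the first-principles argument and rests on two assumptions that are not stated in the question: that the threshold solves the indifference condition $\bar{x}=\beta E[\max(\bar{x}, X)]$, which for a uniform offer reads $\beta \bar{x}^{2}-2 \bar{x}+\beta=0$, and that "expected reward" means the expectation taken before $X_{0}$ is seen. The function names `threshold` and `expected_reward` are hypothetical.

```python
import math


def threshold(beta):
    """Root in [0, 1] of beta*x**2 - 2*x + beta = 0.

    Assumed indifference condition: x = beta * E[max(x, X)] with X ~ U[0, 1],
    using E[max(x, X)] = (1 + x**2) / 2.
    """
    return (1.0 - math.sqrt(1.0 - beta * beta)) / beta


def expected_reward(beta, xbar):
    """Expected reward, before X_0 is seen, of the policy
    'sell on the first day the best offer is at least xbar'."""
    # c is the value of waiting while every offer so far is below xbar:
    #   c = beta * (xbar * c + integral of y dy over [xbar, 1])
    upper_part = (1.0 - xbar ** 2) / 2.0
    c = beta * upper_part / (1.0 - beta * xbar)
    # On day 0: with probability xbar the offer is too low and we wait,
    # otherwise we sell immediately at the offered price.
    return xbar * c + upper_part


beta = 0.8
xbar = threshold(beta)
print("threshold:", xbar)
print("expected reward:", expected_reward(beta, xbar))

# Compare against nearby thresholds as a rough optimality check.
for trial in (0.4, 0.6):
    print("threshold", trial, "gives", expected_reward(beta, trial))
```

Under these assumptions, the output for $\beta=4/5$ should be a threshold close to $1/2$ and an expected reward close to $5/8$, with the neighbouring thresholds giving slightly less.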

Explain what is special about the case $\beta=1$.