2.II.29I

Optimization and Control | Part II, 2008

Consider a stochastic controllable dynamical system $P$ with action-space $A$ and countable state-space $S$. Thus $P=\left(p_{xy}(a): x, y \in S, a \in A\right)$, where $p_{xy}(a)$ denotes the transition probability from $x$ to $y$ when taking action $a$. Suppose that a cost $c(x, a)$ is incurred each time that action $a$ is taken in state $x$, and that this cost is uniformly bounded. Write down the dynamic optimality equation for the problem of minimizing the expected long-run average cost.
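For reference, the equation asked for is usually written in the following form, where $\lambda$ is the long-run average cost and $\theta$ is a relative cost function:

$$\lambda + \theta(x) = \min_{a \in A}\Big\{ c(x,a) + \sum_{y \in S} p_{xy}(a)\,\theta(y) \Big\}, \qquad x \in S.$$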

State, in terms of this equation, a general result that can be used to identify an optimal control and the minimal long-run average cost.
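In outline, the result usually quoted here is the verification theorem for average-cost problems: if a constant $\lambda$ and a bounded function $\theta$ satisfy the optimality equation above, then $\lambda$ is the minimal long-run average cost, and any stationary policy that in each state $x$ chooses an action attaining the minimum on the right-hand side is optimal.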

A particle moves randomly on the integers, taking steps of size 1. Suppose we can choose at each step a control parameter $u \in [\alpha, 1-\alpha]$, where $\alpha \in (0, 1/2)$ is fixed, which has the effect that the particle moves in the positive direction with probability $u$ and in the negative direction with probability $1-u$. It is desired to maximize the long-run proportion of time $\pi$ spent by the particle at $0$. Show that there is a solution to the optimality equation for this example in which the relative cost function takes the form $\theta(x)=\mu|x|$, for some constant $\mu$.
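As a sketch of the required substitution (writing the optimality equation in maximization form, with reward $\mathbf{1}\{x=0\}$ for each step spent at $0$), try $\theta(x)=\mu|x|$ with $\mu<0$ in

$$\pi + \theta(x) = \mathbf{1}\{x=0\} + \max_{u \in [\alpha,\,1-\alpha]}\big\{ u\,\theta(x+1) + (1-u)\,\theta(x-1) \big\}.$$

For $x \geqslant 1$ (the case $x \leqslant -1$ is symmetric) the right-hand side equals $\mu(x-1) + \max_u 2u\mu = \mu(x-1) + 2\alpha\mu$, giving $\pi = (2\alpha-1)\mu$, while at $x=0$ the equation reads $\pi = 1 + \mu$ whatever $u$ is chosen. These two relations determine $\mu$ and $\pi$.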

Determine an optimal control and show that the maximal long-run proportion of time spent at 0 is given by

$$\pi=\frac{1-2\alpha}{2(1-\alpha)}.$$
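Indeed, solving $\pi = (2\alpha-1)\mu$ together with $\pi = 1+\mu$ from the sketch above gives $\mu = -\tfrac{1}{2(1-\alpha)}$ and the stated value of $\pi$. The maximizing control is $u=\alpha$ for $x>0$ and, by symmetry, $u=1-\alpha$ for $x<0$ (with $u$ arbitrary at $x=0$): always push the particle back towards $0$ as strongly as possible.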

You may assume that it is valid to use an unbounded function $\theta$ in the optimality equation in this example.
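As a quick numerical sanity check (not part of the question), here is a short simulation of the walk under the drift-towards-zero policy. The helper name `simulate` and the choice $u=1/2$ at $x=0$ (which the optimality equation leaves free) are illustrative assumptions:

```python
import random

def simulate(alpha, steps=1_000_000, seed=0):
    """Simulate the walk under the policy u = alpha for x > 0,
    u = 1 - alpha for x < 0, and u = 1/2 at x = 0 (any u works at 0).
    Returns the proportion of time spent at 0."""
    rng = random.Random(seed)
    x, at_zero = 0, 0
    for _ in range(steps):
        if x == 0:
            at_zero += 1
        u = alpha if x > 0 else (1 - alpha if x < 0 else 0.5)
        x += 1 if rng.random() < u else -1
    return at_zero / steps

alpha = 0.3
print(simulate(alpha))                      # simulated proportion of time at 0
print((1 - 2 * alpha) / (2 * (1 - alpha)))  # theoretical value: 0.2857...
```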
