Paper 4, Section II, $25 K$

Consider the scalar system evolving as

x_{t}=x_{t-1}+u_{t-1}+\epsilon_{t}, \quad t=1,2, \ldots,

where $\left\{\epsilon_{t}\right\}_{t=1}^{\infty}$ is a white noise sequence with $E \epsilon_{t}=0$ and $E \epsilon_{t}^{2}=v$ . It is desired to choose controls $\left\{u_{t}\right\}_{t=0}^{h-1}$ to minimize $E\left[\sum_{t=0}^{h-1}\left(\frac{1}{2} x_{t}^{2}+u_{t}^{2}\right)+x_{h}^{2}\right]$ . Show that for $h=6$ the minimal cost is $x_{0}^{2}+6 v$ .

Find a constant $\lambda$ and a function $\phi$ which solve

\phi(x)+\lambda=\min _{u}\left[\frac{1}{2} x^{2}+u^{2}+E \phi\left(x+u+\epsilon_{1}\right)\right]

Let $P$ be the class of those policies for which every $u_{t}$ obeys the constraint $\left(x_{t}+u_{t}\right)^{2} \leqslant(0.9) x_{t}^{2}$ . Show that $E_{\pi} \phi\left(x_{t}\right) \leqslant x_{0}^{2}+10 v$ , for all $\pi \in P$ . Find, and prove optimal, a policy which over all $\pi \in P$ minimizes

\lim _{h \rightarrow \infty} \frac{1}{h} E_{\pi}\left[\sum_{t=0}^{h-1}\left(\frac{1}{2} x_{t}^{2}+u_{t}^{2}\right)\right]