Paper 3, Section II, 28K

Optimization and Control | Part II, 2011

An observable scalar state variable evolves as $x_{t+1}=x_{t}+u_{t}$, $t=0,1, \ldots$. Let controls $u_{0}, u_{1}, \ldots$ be determined by a policy $\pi$ and define

$$C_{s}\left(\pi, x_{0}\right)=\sum_{t=0}^{s-1}\left(x_{t}^{2}+2 x_{t} u_{t}+7 u_{t}^{2}\right) \quad \text{and} \quad C_{s}\left(x_{0}\right)=\inf_{\pi} C_{s}\left(\pi, x_{0}\right).$$

Show that it is possible to express $C_{s}\left(x_{0}\right)$ in terms of $\Pi_{s}$, which satisfies the recurrence

$$\Pi_{s}=\frac{6\left(1+\Pi_{s-1}\right)}{7+\Pi_{s-1}}, \quad s=1,2, \ldots$$

with $\Pi_{0}=0$.

Deduce that $C_{\infty}\left(x_{0}\right) \geqslant 2 x_{0}^{2}$. [$C_{\infty}\left(x_{0}\right)$ is defined as $\lim_{s \rightarrow \infty} C_{s}\left(x_{0}\right)$.]
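As a numerical sanity check on the recurrence (not a substitute for the proof), the sketch below iterates $\Pi_{s}$ exactly from $\Pi_{0}=0$. It assumes the quadratic form $C_{s}(x_{0})=\Pi_{s} x_{0}^{2}$ that the question asks you to establish; the function names `riccati_step` and `pi_sequence` are illustrative and not part of the question.

```python
from fractions import Fraction


def riccati_step(p):
    """One step of the recurrence Pi_s = 6(1 + Pi_{s-1}) / (7 + Pi_{s-1})."""
    return 6 * (1 + p) / (7 + p)


def pi_sequence(n):
    """Return [Pi_0, Pi_1, ..., Pi_n] as exact fractions, starting from Pi_0 = 0."""
    p = Fraction(0)
    seq = [p]
    for _ in range(n):
        p = riccati_step(p)
        seq.append(p)
    return seq


if __name__ == "__main__":
    for s, p in enumerate(pi_sequence(8)):
        print(s, p, float(p))
```

The iterates increase and stay below 2, which is the positive root of the fixed-point equation $\Pi^{2}+\Pi-6=0$ obtained by setting $\Pi_{s}=\Pi_{s-1}=\Pi$ in the recurrence.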

By considering the policy $\pi^{*}$ which takes $u_{t}=-(1 / 3)(2 / 3)^{t} x_{0}$, $t=0,1, \ldots$, show that $C_{\infty}\left(x_{0}\right)=2 x_{0}^{2}$.
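The cost of $\pi^{*}$ can be checked by direct simulation. The sketch below is a hypothetical helper, `open_loop_cost`, that applies $u_{t}=-(1/3)(2/3)^{t} x_{0}$ for a finite number of steps and accumulates the stage cost $x_{t}^{2}+2 x_{t} u_{t}+7 u_{t}^{2}$ in exact arithmetic. The finite horizon is an assumption made only so the sum can be computed.

```python
from fractions import Fraction


def open_loop_cost(x0, horizon):
    """Cost of applying u_t = -(1/3)(2/3)^t x0 for t = 0, ..., horizon - 1."""
    x = Fraction(x0)
    total = Fraction(0)
    for t in range(horizon):
        u = -Fraction(1, 3) * Fraction(2, 3) ** t * x0
        total += x * x + 2 * x * u + 7 * u * u  # stage cost from the question
        x += u  # dynamics x_{t+1} = x_t + u_t
    return total


if __name__ == "__main__":
    for s in (1, 2, 5, 20):
        print(s, float(open_loop_cost(1, s)))
```

Under this policy $x_{t}=(2/3)^{t} x_{0}$, so each stage costs $(10/9)(4/9)^{t} x_{0}^{2}$ and the partial sums equal $2\left(1-(4/9)^{s}\right) x_{0}^{2}$, which tends to $2 x_{0}^{2}$.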

Give an alternative description of $\pi^{*}$ in closed-loop form.
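One candidate closed-loop description is the linear feedback rule $u_{t}=-x_{t}/3$. This is an assumption derived here rather than stated in the question: it follows from $x_{t}=(2/3)^{t} x_{0}$ under $\pi^{*}$. The sketch below compares the two control sequences exactly; `open_loop_controls` and `feedback_controls` are illustrative names.

```python
from fractions import Fraction


def open_loop_controls(x0, steps):
    """Controls u_t = -(1/3)(2/3)^t x0, computed from x0 alone."""
    return [-Fraction(1, 3) * Fraction(2, 3) ** t * x0 for t in range(steps)]


def feedback_controls(x0, steps, gain=Fraction(-1, 3)):
    """Controls u_t = gain * x_t, computed from the current state.

    The gain -1/3 is the assumed closed-loop form, not given in the question.
    """
    x = Fraction(x0)
    controls = []
    for _ in range(steps):
        u = gain * x
        controls.append(u)
        x += u  # dynamics x_{t+1} = x_t + u_t
    return controls


if __name__ == "__main__":
    print(open_loop_controls(5, 6) == feedback_controls(5, 6))
```

The gain $-1/3$ agrees with the minimising control $u=-\frac{1+\Pi}{7+\Pi}\,x$ of the one-step optimisation evaluated at the limiting value $\Pi=2$.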
