Paper 2, Section II, $30 K$

Optimisation and Control | Part II, 2018

(a) A ball may be in one of $n$ boxes. A search of the $i^{\text {th }}$ box costs $c_{i}>0$ and finds the ball with probability $\alpha_{i}>0$ if the ball is in that box. We are given initial probabilities $\left(P_{i}^{1}, i=1,2, \ldots, n\right)$ that the ball is in the $i^{\text {th }}$ box.

Show that the policy which at time $t=1,2, \ldots$ searches the box with the maximal value of $\alpha_{i} P_{i}^{t} / c_{i}$ minimises the expected searching cost until the ball is found, where $P_{i}^{t}$ is the probability (given everything that has occurred up to time $t$ ) that the ball is in box $i$ .

(b) Next suppose that a reward $R_{i}>0$ is earned if the ball is found in the $i^{\text {th }}$ box. Suppose also that we may decide to stop at any time. Develop the dynamic programming equation for the value function starting from the probability distribution $\left(P_{i}^{t}, i=1,2, \ldots, n\right)$ .

Show that if $\sum_{i} c_{i} /\left(\alpha_{i} R_{i}\right)<1$ then it is never optimal to stop searching until the ball is found. In this case, is the policy defined in part (a) optimal?

comment

Paper 3, Section II, K

Optimisation and Control | Part II, 2018

The scalars $x_{t}, y_{t}, u_{t}$ are related by the equations

x_{t}=x_{t-1}+u_{t-1}, \quad y_{t}=x_{t-1}+\eta_{t-1}, \quad t=1,2, \ldots, T,

where the initial state $x_{0}$ is normally distributed with mean $\hat{x}_{0}$ and variance 1 and $\left\{\eta_{t}\right\}$ is a sequence of independent random variables each normally distributed with mean 0 and variance 1 . The control variable $u_{t}$ is to be chosen at time $t$ on the basis of information $W_{t}$ , where $W_{0}=\left(\hat{x}_{0}\right)$ and

W_{t}=\left(\hat{x}_{0}, u_{0}, \ldots, u_{t-1}, y_{1}, \ldots, y_{t}\right), \quad t=1,2, \ldots, T

(a) Let $\hat{x}_{1}, \hat{x}_{2}, \ldots, \hat{x}_{T}$ be the Kalman filter estimates of $x_{1}, x_{2}, \ldots, x_{T}$ , i.e.

\hat{x}_{t}=\hat{x}_{t-1}+u_{t-1}+h_{t}\left(y_{t}-\hat{x}_{t-1}\right)

where $h_{t}$ is chosen to minimise $\mathbb{E}\left(\left(\hat{x}_{t}-x_{t}\right)^{2} \mid W_{t}\right)$ . Calculate $h_{t}$ and show that, conditional on $W_{t}, x_{t}$ is normally distributed with mean $\hat{x}_{t}$ and variance $V_{t}=1 /(1+t)$ .

(b) Define

\begin{aligned} &F\left(W_{T}\right)=\mathbb{E}\left(x_{T}^{2} \mid W_{T}\right), \quad \text { and } \\ &F\left(W_{t}\right)=\inf _{u_{t}, \ldots, u_{T-1}} \mathbb{E}\left(x_{T}^{2}+\sum_{j=t}^{T-1} u_{j}^{2} \mid W_{t}\right), \quad t=0, \ldots, T-1 . \end{aligned}

Show that $F\left(W_{t}\right)=\hat{x}_{t}^{2} P_{t}+d_{t}$ , where $P_{t}=1 /(T-t+1), d_{T}=1 /(1+T)$ and $d_{t-1}=V_{t-1} V_{t} P_{t}+d_{t}$ .

(c) Show that the minimising control $u_{t}$ can be expressed in the form $u_{t}=-K_{t} \hat{x}_{t}$ and find $K_{t}$ . How would the expression for $K_{t}$ be altered if $x_{0}$ or $\left\{\eta_{t}\right\}$ had variances other than 1?

comment

Paper 4, Section II, $30 K$

Optimisation and Control | Part II, 2018

Consider the deterministic system

\dot{x}_{t}=u_{t}

where $x_{t}$ and $u_{t}$ are scalars. Here $x_{t}$ is the state variable and the control variable $u_{t}$ is to be chosen to minimise, for a fixed $h>0$ , the cost

x_{h}^{2}+\int_{0}^{h} c_{t} u_{t}^{2} \mathrm{~d} t

where $c_{t}$ is known and $c_{t}>c>0$ for all $t$ . Let $F(x, t)$ be the minimal cost from state $x$ and time $t$ .

(a) By writing the dynamic programming equation in infinitesimal form and taking the appropriate limit show that $F(x, t)$ satisfies

\frac{\partial F}{\partial t}=-\inf _{u}\left[c_{t} u^{2}+\frac{\partial F}{\partial x} u\right], \quad t<h

with boundary condition $F(x, h)=x^{2}$ .

(b) Determine the form of the optimal control in the special case where $c_{t}$ is constant, and also in general.

comment

Mathematics Tripos Papers

Optimisation And Control

Jump to year

2018

Paper 2, Section II, $30 K$

Paper 3, Section II, K

Paper 4, Section II, $30 K$

Optimisation And Control

Optimisation And Control

Jump to year

2018

Paper 2, Section II, 30K30 K30K

Paper 3, Section II, K

Paper 4, Section II, 30K30 K30K

Paper 2, Section II, $30 K$

Paper 4, Section II, $30 K$