Paper 4, Section II, J

Mathematics of Machine Learning | Part II, 2021

Let $D = (x_i, y_i)_{i=1}^n$ be a dataset of $n$ input-output pairs lying in $\mathbb{R}^p \times [-M, M]$ for $M \in \mathbb{R}$. Describe the random-forest algorithm as applied to $D$ using decision trees $(\hat{T}^{(b)})_{b=1}^B$ to produce a fitted regression function $f_{\mathrm{rf}}$. [You need not explain in detail the construction of decision trees, but should describe any modifications specific to the random-forest algorithm.]
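For concreteness, the following is a minimal sketch of the kind of random-forest fit the question refers to, assuming scikit-learn-style regression trees; the helper names `fit_random_forest` and `f_rf`, the default `mtry`, and `B = 100` are illustrative choices, not part of the question. Each tree is grown on a bootstrap resample of $D$ and, at every split, only a random subset of `mtry` of the $p$ coordinates is considered; the forest prediction averages the $B$ tree predictions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_random_forest(X, y, B=100, mtry=None, seed=0):
    """Illustrative random-forest fit: B trees, each grown on a bootstrap
    resample of the data, considering only a random subset of mtry features
    at every split (the two modifications specific to random forests)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    if mtry is None:
        mtry = max(1, p // 3)                      # a common default for regression
    trees = []
    for _ in range(B):
        idx = rng.integers(0, n, size=n)           # bootstrap resample of D
        tree = DecisionTreeRegressor(max_features=mtry,
                                     random_state=int(rng.integers(2**31)))
        trees.append(tree.fit(X[idx], y[idx]))
    return trees

def f_rf(trees, x):
    """Forest prediction f_rf(x): the average of the B tree predictions."""
    x = np.atleast_2d(x)
    return np.mean([tree.predict(x) for tree in trees], axis=0)
```

Calling `f_rf(fit_random_forest(X, y), x0)` then returns the averaged prediction at a query point `x0`.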

Briefly explain why for each $x \in \mathbb{R}^p$ and $b = 1, \ldots, B$, we have $\hat{T}^{(b)}(x) \in [-M, M]$.
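One way to see this (a sketch, assuming each leaf of $\hat{T}^{(b)}$ predicts the average response of the resampled training points falling in that leaf): if $x$ lands in a leaf whose resampled training points have index multiset $I$, then

$$\hat{T}^{(b)}(x) = \frac{1}{|I|} \sum_{i \in I} y_i \in [-M, M],$$

since an average of values in $[-M, M]$ again lies in $[-M, M]$.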

State the bounded-differences inequality.
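For reference, a standard form of the bounded-differences (McDiarmid) inequality: let $W_1, \ldots, W_B$ be independent random variables taking values in a set $\mathcal{Z}$, and let $G : \mathcal{Z}^B \to \mathbb{R}$ satisfy, for every $b$ and all $w_1, \ldots, w_B, w_b' \in \mathcal{Z}$,

$$\bigl| G(w_1, \ldots, w_b, \ldots, w_B) - G(w_1, \ldots, w_b', \ldots, w_B) \bigr| \leqslant L_b.$$

Then, for every $t > 0$,

$$\mathbb{P}\bigl( G(W_1, \ldots, W_B) - \mathbb{E}\, G(W_1, \ldots, W_B) \geqslant t \bigr) \leqslant \exp\!\left( -\frac{2 t^2}{\sum_{b=1}^{B} L_b^2} \right).$$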

Treating $D$ as deterministic, show that with probability at least $1 - \delta$,

$$\sup_{x \in \mathbb{R}^p} \bigl| f_{\mathrm{rf}}(x) - \mu(x) \bigr| \leqslant M \sqrt{\frac{2 \log(1/\delta)}{B}} + \mathbb{E}\left( \sup_{x \in \mathbb{R}^p} \bigl| f_{\mathrm{rf}}(x) - \mu(x) \bigr| \right),$$

where $\mu(x) := \mathbb{E}\, f_{\mathrm{rf}}(x)$.

[Hint: Treat each $\hat{T}^{(b)}$ as a random variable taking values in an appropriate space $\mathcal{Z}$ (of functions), and consider a function $G$ satisfying

$$G\bigl(\hat{T}^{(1)}, \ldots, \hat{T}^{(B)}\bigr) = \sup_{x \in \mathbb{R}^p} \bigl| f_{\mathrm{rf}}(x) - \mu(x) \bigr|. \; \bigr]$$
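A sketch of the argument the hint points towards, assuming (as holds once $D$ is treated as deterministic) that $\hat{T}^{(1)}, \ldots, \hat{T}^{(B)}$ are independent, each tree depending only on its own bootstrap and feature draws. Since $f_{\mathrm{rf}} = \frac{1}{B} \sum_{b=1}^{B} \hat{T}^{(b)}$ and every tree takes values in $[-M, M]$, replacing the $b$-th tree by any other $T' \in \mathcal{Z}$ changes $G$ by at most

$$\sup_{x \in \mathbb{R}^p} \frac{\bigl| \hat{T}^{(b)}(x) - T'(x) \bigr|}{B} \leqslant \frac{2M}{B},$$

so the bounded-differences inequality applies with $L_b = 2M/B$ and gives, for $t > 0$,

$$\mathbb{P}\bigl( G - \mathbb{E}\, G \geqslant t \bigr) \leqslant \exp\!\left( -\frac{2 t^2}{B (2M/B)^2} \right) = \exp\!\left( -\frac{B t^2}{2 M^2} \right).$$

Setting the right-hand side equal to $\delta$, i.e. taking $t = M \sqrt{2 \log(1/\delta)/B}$, yields the stated bound with probability at least $1 - \delta$.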
