Paper 1, Section II, I

Statistical Modelling | Part II, 2009

A three-year study was conducted on the survival status of patients suffering from cancer. The age of the patients at the start of the study was recorded, as well as whether or not the initial tumour was malignant. The data are tabulated in R as follows:

$$
\begin{array}{rrrrr}
> \text{cancer} & & & & \\
 & \text{age} & \text{malignant} & \text{survive} & \text{die} \\
1 & <50 & \text{no} & 77 & 10 \\
2 & <50 & \text{yes} & 51 & 13 \\
3 & 50-69 & \text{no} & 51 & 11 \\
4 & 50-69 & \text{yes} & 38 & 20 \\
5 & 70+ & \text{no} & 7 & 3 \\
6 & 70+ & \text{yes} & 6 & 3
\end{array}
$$
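To experiment with these data, the table can be entered as a data frame. The following is a minimal sketch: the column names and counts are taken from the table, but the construction itself (the use of `data.frame`, `factor` and `rep`, and the ordering of the factor levels) is an assumption, since the question shows only the printed object.

```r
# Rebuild the printed `cancer` table as a data frame (assumed construction).
cancer <- data.frame(
  # Three age bands, each appearing once per malignancy status.
  age = factor(rep(c("<50", "50-69", "70+"), each = 2),
               levels = c("<50", "50-69", "70+")),
  # Malignancy alternates no/yes within each age band.
  malignant = factor(rep(c("no", "yes"), times = 3)),
  # Counts of patients who survived and died in each cell.
  survive = c(77, 51, 51, 38, 7, 6),
  die = c(10, 13, 11, 20, 3, 3)
)

# Printing the object should reproduce the layout shown in the question.
print(cancer)
```

Each row is one combination of age band and malignancy status, so the pair (`survive`, `die`) in a row gives the grouped binary outcome for that cell.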

Describe the model that is being fitted by the following R commands:

Explain the (slightly abbreviated) output from the code below, describing how the hypothesis tests are performed and your conclusions based on their results.

Based on the summary above, motivate and describe the following alternative model:

Based on the output of the code that follows, which of the two models do you prefer? Why?

What is the final value obtained by the following commands?
