Statistical Modelling
Paper 1, Section I, J
Let $\mu > 0$. The probability density function of the inverse Gaussian distribution (with the shape parameter equal to 1) is given by
$$f(x; \mu) = \frac{1}{\sqrt{2\pi x^3}} \exp\left[-\frac{(x-\mu)^2}{2\mu^2 x}\right], \quad x > 0.$$
Show that this is a one-parameter exponential family. What is its natural parameter? Show that this distribution has mean $\mu$ and variance $\mu^3$.
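One way to organise the calculation is sketched here, assuming the density has the standard shape-1 inverse Gaussian form $f(x;\mu) = (2\pi x^3)^{-1/2}\exp\{-(x-\mu)^2/(2\mu^2 x)\}$ for $x > 0$. The symbols $\theta$, $K$ and $h$ are notation introduced for this sketch only.

```latex
\begin{align*}
-\frac{(x-\mu)^2}{2\mu^2 x} &= -\frac{x}{2\mu^2} + \frac{1}{\mu} - \frac{1}{2x}, \\
f(x;\mu) &= h(x)\,\exp\{\theta x - K(\theta)\}, \qquad
  h(x) = \frac{e^{-1/(2x)}}{\sqrt{2\pi x^3}}, \quad
  \theta = -\frac{1}{2\mu^2}, \quad
  K(\theta) = -\frac{1}{\mu} = -\sqrt{-2\theta}, \\
K'(\theta) &= (-2\theta)^{-1/2} = \mu, \qquad
K''(\theta) = (-2\theta)^{-3/2} = \mu^3 .
\end{align*}
```

Under these assumptions the natural parameter is $\theta = -1/(2\mu^2)$, and the mean and variance follow from the first two derivatives of $K$.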
Paper 1, Section II, J
The following data were obtained in a randomised controlled trial for a drug. Due to a manufacturing error, a subset of trial participants received a low dose (LD) instead of a standard dose (SD) of the drug.
(a) Below we analyse the data using Poisson regression:
(i) After introducing necessary notation, write down the Poisson models being fitted above.
(ii) Write down the corresponding multinomial models, then state the key theoretical result (the "Poisson trick") that allows you to fit the multinomial models using Poisson regression. [You do not need to prove this theoretical result.]
(iii) Explain why the number of degrees of freedom in the likelihood ratio test is 2 in the analysis of deviance table. What can you conclude about the drug?
(b) Below is the summary table of the second model:
(i) Drug efficacy is defined as one minus the ratio of the probability of worsening in the treated group to the probability of worsening in the control group. By using a more sophisticated method, a published analysis estimated that the drug efficacy is for the LD treatment and for the SD treatment. Are these numbers similar to what is obtained by Poisson regression? [Hint: , and , where $e$ is the base of the natural logarithm.]
(ii) Explain why the information in the summary table is not enough to test the hypothesis that the LD drug and the SD drug have the same efficacy. Then describe how you can test this hypothesis using analysis of deviance in R.
Paper 2, Section I, J
Define a generalised linear model for a sample of independent random variables. Define further the concept of the link function. Define the binomial regression model (without the dispersion parameter) with logistic and probit link functions. Which of these is the canonical link function?
Paper 3, Section I, J
Consider the normal linear model $Y \sim \mathrm{N}(X\beta, \sigma^2 I)$, where $X$ is an $n \times p$ design matrix, $Y$ is a vector of responses, $I$ is the identity matrix, and $\beta, \sigma^2$ are unknown parameters.
Derive the maximum likelihood estimator of the pair $\beta$ and $\sigma^2$. What is the distribution of the estimator of $\sigma^2$? Use it to construct a $(1-\alpha)$-level confidence interval of $\sigma^2$. [You may use without proof the fact that the "hat matrix" $H = X(X^T X)^{-1} X^T$ is a projection matrix.]
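The estimators and the interval can be checked numerically. The sketch below uses simulated data (the design, coefficients and noise level are hypothetical) and assumes the interval is built from the pivot $\mathrm{RSS}/\sigma^2 \sim \chi^2_{n-p}$.

```r
set.seed(1)
n <- 50; p <- 3
X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))   # hypothetical design matrix
beta <- c(1, 2, -1)                                    # hypothetical true coefficients
Y <- drop(X %*% beta) + rnorm(n, sd = 2)               # true sigma^2 = 4

beta_hat   <- drop(solve(crossprod(X), crossprod(X, Y)))  # (X'X)^{-1} X'Y
rss        <- sum((Y - drop(X %*% beta_hat))^2)           # residual sum of squares
sigma2_hat <- rss / n                                     # the MLE divides by n, not n - p

# RSS / sigma^2 ~ chi^2_{n-p}, so inverting the pivot gives the interval
alpha <- 0.05
ci <- c(lower = rss / qchisq(1 - alpha / 2, df = n - p),
        upper = rss / qchisq(alpha / 2, df = n - p))
```

The interval is not symmetric around the estimate because the chi-squared distribution is skewed.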
Paper 4, Section I, J
The data frame data contains the daily number of new avian influenza cases in a large poultry farm.
Write down the model being fitted by the code below. Does the model seem to provide a satisfactory fit to the data? Justify your answer.
The owner of the farm estimated that the size of the epidemic was initially doubling every 7 days. Is that estimate supported by the analysis below? [You may need .]
Paper 4, Section II, J
Let $X$ be an $n \times p$ non-random design matrix and $Y$ be an $n$-vector of random responses. Suppose $Y \sim \mathrm{N}(\mu, \sigma^2 I)$, where $\mu \in \mathbb{R}^n$ is an unknown vector and $\sigma^2 > 0$ is known.
(a) Let be a constant. Consider the ridge regression problem
Let be the fitted values. Show that , where
(b) Show that
(c) Let , where is independent of . Show that is an unbiased estimator of .
(d) Describe the behaviour (monotonicity and limits) of as a function of when and . What is the minimum value of ?
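The behaviour of the ridge fit as the penalty grows can be explored numerically. This is a minimal sketch, assuming the ridge penalty is $\lambda \|\beta\|^2$ so that the fitted values are $X(X^T X + \lambda I)^{-1} X^T Y$; the design matrix and the helper `ridge_hat` are hypothetical.

```r
set.seed(3)
n <- 40; p <- 2
X <- matrix(rnorm(n * p), n, p)          # hypothetical full-rank design

# Ridge hat matrix for the penalty lambda * ||beta||^2 (assumed form)
ridge_hat <- function(X, lambda) {
  X %*% solve(crossprod(X) + lambda * diag(ncol(X)), t(X))
}

lambdas <- c(0, 0.1, 1, 10, 100, 1e6)
traces  <- sapply(lambdas, function(l) sum(diag(ridge_hat(X, l))))
traces   # decreasing in lambda: equals p at lambda = 0 and tends to 0
```

The trace plays the role of an effective number of parameters, which is why it appears in the unbiased risk estimate.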
Paper 1, Section I, J
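Consider a generalised linear model with full column rank design matrix $X \in \mathbb{R}^{n \times p}$, output variables $Y \in \mathbb{R}^n$, link function $g$, mean parameters $\mu_1, \ldots, \mu_n$ and known dispersion parameters $\sigma_1^2, \ldots, \sigma_n^2$. Denote its variance function by $V$ and recall that $g(\mu_i) = x_i^T \beta$, where $\beta \in \mathbb{R}^p$ and $x_i^T$ is the $i$th row of $X$.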
(a) Define the score function in terms of the log-likelihood function and the Fisher information matrix, and define the update of the Fisher scoring algorithm.
(b) Let be a diagonal matrix with positive entries. Note that is invertible. Show that
[Hint: you may use that
(c) Recall that the score function and the Fisher information matrix have entries
Justify, performing the necessary calculations and using part (b), why the Fisher scoring algorithm is also known as the iterative reweighted least squares algorithm.
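The link between Fisher scoring and weighted least squares can be seen in a small numerical sketch. It assumes a Poisson model with log link as the example (the data are simulated and hypothetical), for which the weights $1/\{V(\mu_i) g'(\mu_i)^2\}$ reduce to $\mu_i$ and the adjusted response is $\eta_i + (y_i - \mu_i)/\mu_i$.

```r
set.seed(2)
n <- 200
x <- rnorm(n)                               # hypothetical covariate
X <- cbind(1, x)
y <- rpois(n, exp(0.5 + 0.3 * x))

beta <- c(log(mean(y)), 0)                  # starting value
for (iter in 1:25) {
  eta <- drop(X %*% beta)
  mu  <- exp(eta)
  W   <- mu                                 # weights 1 / (V(mu) g'(mu)^2) for Poisson, log link
  z   <- eta + (y - mu) / mu                # adjusted (working) response
  # Weighted least squares of z on X: solves (X'WX) beta = X'Wz
  beta_new <- drop(solve(crossprod(X, W * X), crossprod(X, W * z)))
  converged <- max(abs(beta_new - beta)) < 1e-10
  beta <- beta_new
  if (converged) break
}
beta
```

Each pass solves a weighted least squares problem with weights recomputed from the current fit, which is where the name of the algorithm comes from.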
Paper 1, Section II, J
We consider a subset of the data on car insurance claims from Hallin and Ingenbleek (1983). For each customer, the dataset includes total payments made per policy-year, the amount of kilometres driven, the bonus from not having made previous claims, and the brand of the car. The amount of kilometres driven is a factor taking values 1, 2, 3, 4 or 5, where a car in level $i+1$ has driven a larger number of kilometres than a car in level $i$ for any $i = 1, 2, 3, 4$. A statistician from an insurance company fits the following model in R.
model1 <- lm(Paymentperpolicyyr ~ as.numeric(Kilometres) + Brand + Bonus)
(i) Why do you think the statistician transformed variable Kilometres from a factor to a numerical variable?
(ii) To check the quality of the model, the statistician applies a function to model1 which returns the following figure:
What does the plot represent? Does it suggest that model1 is a good model? Explain. If not, write down a model which the plot suggests could be better.
(iii) The statistician fits the model suggested by the graph and calls it model2. Consider the following abbreviated output:
Coefficients:
Brand2
Brand9
Bonus
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 284 degrees of freedom ..
Using the output, write down a prediction interval for the ratio between the total payments per policy year for two cars of the same brand and with the same value of Bonus, one of which has a Kilometres value one higher than the other. You may express your answer as a function of quantiles of a common distribution, which you should specify.
(iv) Write down a generalised linear model for Paymentperpolicyyr which may be a better model than model1 and give two reasons. You must specify the link function.
Paper 2, Section I, J
The data frame WCG contains data from a study started in 1960 about heart disease. The study used 3154 adult men, all free of heart disease at the start, and eight and a half years later it recorded into variable chd whether they suffered from heart disease (1 if the respective man did and 0 otherwise) along with their height and average number of cigarettes smoked per day. Consider the code below and its abbreviated output.
(a) Write down the model fitted by the code above.
(b) Interpret the effect on heart disease of a man smoking an average of two packs of cigarettes per day if each pack contains 20 cigarettes.
(c) Give an alternative latent logistic-variable representation of the model. [Hint: if is the cumulative distribution function of a logistic random variable, its inverse function is the logit function.]
Paper 3, Section I, J
Suppose we have data , where the are independent conditional on the design matrix whose rows are the . Suppose that given , the true probability density function of is , so that the data is generated from an element of a model for some and .
(a) Define the log-likelihood function for , the maximum likelihood estimator of and Akaike's Information Criterion (AIC) for .
From now on let be the normal linear model, i.e. , where has full column rank and .
(b) Let denote the maximum likelihood estimator of . Show that the AIC of is
(c) Let be a chi-squared distribution on degrees of freedom. Using any results from the course, show that the distribution of the AIC of is
[Hint: , where is the maximum likelihood estimator of and is the projection matrix onto the column space of .]
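As a numerical sanity check on part (b), the sketch below compares R's built-in `AIC` for a fitted normal linear model with the closed form $n\{1 + \log(2\pi\hat\sigma^2)\} + 2(p+1)$, where $\hat\sigma^2$ is the maximum likelihood estimator. That closed form is an assumption about the omitted display (it counts $\sigma^2$ as a parameter), and the simulated data are hypothetical.

```r
set.seed(4)
n <- 60
x1 <- rnorm(n); x2 <- rnorm(n)              # hypothetical covariates
y  <- 1 + 0.5 * x1 - x2 + rnorm(n)
fit <- lm(y ~ x1 + x2)

p          <- length(coef(fit))             # number of regression coefficients
sigma2_hat <- sum(resid(fit)^2) / n         # MLE of sigma^2 (divides by n)
aic_manual <- n * (1 + log(2 * pi * sigma2_hat)) + 2 * (p + 1)

c(manual = aic_manual, builtin = AIC(fit))
```

The two values agree because `AIC` for an `lm` fit evaluates the Gaussian log-likelihood at the maximum likelihood estimates and counts $p + 1$ parameters.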
Paper 4, Section I, J
Suppose you have a data frame with variables response, covar1, and covar2. You run the following commands in R.
...
(a) Consider the following three scenarios:
(i) All the output you have is the abbreviated output of summary(model) above.
(ii) You have the abbreviated output of summary(model) above together with
Residual standard error: on 47 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 2 and 47 DF, p-value: <
(iii) You have the abbreviated output of summary(model) above together with
Residual standard error: on 47 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 2 and 47 DF, p-value:
What conclusion can you draw about which variables explain the response in each of the three scenarios? Explain.
(b) Assume now that you have the abbreviated output of summary(model) above together with
anova(lm(response response , model
What are the values of the entries with a question mark? [You may express your answers as arithmetic expressions if necessary].
Paper 4, Section II, J
(a) Define a generalised linear model with design matrix , output variables and parameters and . Derive the moment generating function of , i.e. give an expression for , wherever it is well-defined.
Assume from now on that the GLM satisfies the usual regularity assumptions, has full column rank, and is known and satisfies .
(b) Let be the output variables of a GLM from the same family as that of part (a) and parameters and . Suppose the output variables may be split into blocks of size with constant parameters. To be precise, for each block , if then
with and defined as in part (a). Let , where .
(i) Show that is equal to in distribution. [Hint: you may use without proof that moment generating functions uniquely determine distributions from exponential dispersion families.]
(ii) For any , let , where . Show that the model function of satisfies
for some functions , and conclude that is a sufficient statistic for from .
(iii) For the model and data from part (a), let be the maximum likelihood estimator for and let be the deviance at . Using (i) and (ii), show that
where means equality in distribution and and are nested subspaces of which you should specify. Argue that and , and, assuming the usual regularity assumptions, conclude that
stating the name of the result from class that you use.
Paper 1, Section I, J
The Gamma distribution with shape parameter and scale parameter has probability density function
Give the definition of an exponential dispersion family and show that the set of Gamma distributions forms one such family. Find the cumulant generating function and derive the mean and variance of the Gamma distribution as a function of and .
Paper 1, Section II, J
The ice_cream data frame contains the result of a blind tasting of 90 ice creams, each of which is rated as poor, good, or excellent. It also contains the price of each ice cream classified into three categories. Consider the code below and its output.
(a) Write down the generalised linear model fitted by the code above.
(b) Prove that the fitted values resulting from the maximum likelihood estimator of the coefficients in this model are identical to those resulting from the maximum likelihood estimator when fitting a Multinomial model which assumes the number of ice creams at each price level is fixed.
(c) Using the output above, perform a goodness-of-fit test at the level, specifying the null hypothesis, the test statistic, its asymptotic null distribution, any assumptions of the test and the decision from your test.
(d) If we believe that better ice creams are more expensive, what could be a more powerful test against the model fitted above and why?
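The claim in part (b) can be illustrated on a toy table. The sketch below assumes the fitted model is a Poisson log-linear model with main effects for price and rating; the counts are hypothetical and are not the exam's data.

```r
# Hypothetical 3 x 3 table of counts (not the exam's data)
tab <- expand.grid(price  = factor(c("low", "mid", "high")),
                   rating = factor(c("poor", "good", "excellent")))
tab$count <- c(12, 9, 6, 10, 11, 9, 5, 10, 18)

fit <- glm(count ~ price + rating, family = poisson, data = tab)

# The fitted Poisson means add up to the observed total at each price level,
# which is the constraint a multinomial model with fixed totals imposes.
fitted_totals   <- tapply(fitted(fit), tab$price, sum)
observed_totals <- tapply(tab$count, tab$price, sum)
rbind(fitted_totals, observed_totals)
```

Because the price main effect is in the model, the Poisson score equations force these totals to match, so conditioning on them does not change the fitted values.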
Paper 2, Section I, J
The cycling data frame contains the results of a study on the effects of cycling to work among 1,000 participants with asthma, a respiratory illness. Half of the participants, chosen uniformly at random, received a monetary incentive to cycle to work, and the other half did not. The variables in the data frame are:
miles: the average number of miles cycled per week
episodes: the number of asthma episodes experienced during the study
incentive: whether or not a monetary incentive to cycle was given
history: the number of asthma episodes in the year preceding the study
Consider the code below and its abbreviated output.
lm.1 <- lm(episodes ~ miles + history, data=cycling)
Coefficients:
Estimate Std. Error t value
(Intercept)
miles
history
lm.2 <- lm(episodes ~ incentive + history, data=cycling)
summary(lm.2)
Coefficients:
Estimate Std. Error t value
(Intercept)
incentiveYes
history
lm.3 <- lm(miles ~ incentive + history, data=cycling)
Coefficients:
Estimate Std. Error t value
(Intercept)
incentiveYes
history
(a) For each of the fitted models, briefly explain what can be inferred about participants with similar histories.
(b) Based on this analysis and the experimental design, is it advisable for a participant with asthma to cycle to work more often? Explain.
Paper 3, Section I, J
(a) For a given model with likelihood , define the Fisher information matrix in terms of the Hessian of the log-likelihood.
Consider a generalised linear model with design matrix , output variables , a bijective link function, mean parameters and dispersion parameters . Assume is known.
(b) State the form of the log-likelihood.
(c) For the canonical link, show that when the parameter is known, the Fisher information matrix is equal to
for a diagonal matrix depending on the means . Identify .
Paper 4, Section I, J
In a normal linear model with design matrix $X$, output variables $Y$ and parameters $\beta$ and $\sigma^2$, define a $(1-\alpha)$-level prediction interval for a new observation with input variables $x^*$. Derive an explicit formula for the interval, proving that it satisfies the properties required by the definition. [You may assume that the maximum likelihood estimator $\hat\beta$ is independent of $\sigma^{-2} \|Y - X\hat\beta\|_2^2$, which has a $\chi^2_{n-p}$ distribution.]
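The resulting formula can be checked against R's own prediction intervals. This sketch assumes the interval $x^{*T}\hat\beta \pm t_{n-p}(\alpha/2)\,\tilde\sigma\sqrt{1 + x^{*T}(X^TX)^{-1}x^*}$ with $\tilde\sigma^2 = \mathrm{RSS}/(n-p)$; the data and the new input `x_star` are hypothetical.

```r
set.seed(5)
n <- 30
x <- runif(n, 0, 10)                        # hypothetical covariate
y <- 2 + 0.7 * x + rnorm(n)
fit <- lm(y ~ x)

X      <- model.matrix(fit)
p      <- ncol(X)
x_star <- c(1, 4.2)                         # new input, including the intercept
alpha  <- 0.05

sigma_tilde <- sqrt(sum(resid(fit)^2) / (n - p))
se_pred <- sigma_tilde * sqrt(1 + drop(t(x_star) %*% solve(crossprod(X), x_star)))
centre  <- sum(x_star * coef(fit))
manual  <- centre + c(-1, 1) * qt(1 - alpha / 2, df = n - p) * se_pred

builtin <- predict(fit, newdata = data.frame(x = 4.2),
                   interval = "prediction", level = 1 - alpha)
```

The extra 1 under the square root accounts for the noise in the new observation, which is what separates a prediction interval from a confidence interval for the mean response.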
Paper 4, Section II, J
A sociologist collects a dataset on friendships among Cambridge graduates. Let if persons and are friends 3 years after graduation, and otherwise. Let be a categorical variable for person 's college, taking values in the set . Consider logistic regression models,
with parameters either
; or,
; or,
, where if and 0 otherwise.
(a) Write the likelihood of the models.
(b) Show that the three models are nested and specify the order. Suggest a statistic to compare models 1 and 3, give its definition and specify its asymptotic distribution under the null hypothesis, citing any necessary theorems.
(c) Suppose persons and are in the same college consider the number of friendships, and , that each of them has with people in college ( and fixed). In each of the models above, compare the distribution of these two random variables. Explain why this might lead to a poor quality of fit.
(d) Find a minimal sufficient statistic for model 3. [You may use the following characterisation of a minimal sufficient statistic: let be the likelihood in this model, where and suppose is a statistic such that is constant in if and only if ; then, is a minimal sufficient statistic for .]
Paper 1, Section I, J
The data frame Ambulance contains data on the number of ambulance requests from a Cambridgeshire hospital on different days. In addition to the number of ambulance requests on each day, the dataset records whether each day fell in the winter season, on a weekend, or on a bank holiday, as well as the pollution level on each day.
A health researcher fitted two models to the dataset above using R. Consider the following code and its output.
Define the two models fitted by this code and perform a hypothesis test with level in which one of the models is the null hypothesis and the other is the alternative. State the theorem used in this hypothesis test. You may use the information generated by the following commands.
Paper 1, Section II, J
A clinical study follows a number of patients with an illness. Let be the length of time that patient lives and a vector of predictors, for . We shall assume that are independent. Let and be the probability density function and cumulative distribution function, respectively, of . The hazard function is defined as
We shall assume that , where is a vector of coefficients and is some fixed hazard function.
(a) Prove that .
(b) Using the equation in part (a), write the log-likelihood function for in terms of and only.
(c) Show that the maximum likelihood estimate of can be obtained through a surrogate Poisson generalised linear model with an offset.
Paper 2, Section I, J
Consider a linear model with , where the design matrix is by . Provide an expression for the -statistic used to test the hypothesis for . Show that it is a monotone function of a log-likelihood ratio statistic.
Paper 3, Section I, J
The data frame Cases.of.flu contains a list of cases of flu recorded in 3 London hospitals during each month of 2017. Consider the following code and its output.
table(Cases.of.flu)
Month Hospital
May
November
October
September
Cases.of.flu.table = as.data.frame(table(Cases.of.flu))
head(Cases.of.flu.table)
Month Hospital Freq
1 April A 10
2 August A 9
3 December A 24
4 February A 49
5 January A 45
6 July A 5
glm(Freq ~ ., data=Cases.of.flu.table, family=poisson)
[1]
levels(Cases.of.flu$Month)
Describe a test for the null hypothesis of independence between the variables Month and Hospital using the deviance statistic. State the assumptions of the test.
Perform the test at the 1% level for each of the two different models shown above. You may use the table below showing 99th percentiles of the $\chi^2$ distribution for a range of degrees of freedom. How would you explain the discrepancy between their conclusions?
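The mechanics of the deviance test can be sketched on a small table. The counts below are hypothetical (4 months by 3 hospitals rather than the exam's data), and the sketch assumes the independence model is the Poisson log-linear model with main effects only.

```r
# Hypothetical counts for 4 months x 3 hospitals (not the exam's data)
flu <- expand.grid(Month    = factor(c("Jan", "Feb", "Mar", "Apr")),
                   Hospital = factor(c("A", "B", "C")))
flu$Freq <- c(45, 49, 30, 10, 40, 52, 28, 12, 50, 44, 33, 9)

indep <- glm(Freq ~ Month + Hospital, family = poisson, data = flu)

dev    <- deviance(indep)                # residual deviance of the independence model
df_res <- df.residual(indep)             # (4 - 1) * (3 - 1) = 6
crit   <- qchisq(0.99, df = df_res)      # 99th percentile, i.e. a 1% level test
reject_independence <- dev > crit
```

The chi-squared approximation relies on the cell counts being large, so tables with many small or empty cells can give misleading conclusions.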
Paper 4, Section I, J
A scientist is studying the effects of a drug on the weight of mice. Forty mice are divided into two groups, control and treatment. The mice in the treatment group are given the drug, and those in the control group are given water instead. The mice are kept in 8 different cages. The weight of each mouse is monitored for 10 days, and the results of the experiment are recorded in the data frame Weight.data. Consider the following code and its output.
head(Weight.data)
Time Group Cage Mouse Weight
11 Control 1 1
Control 1 1
Control
44 Control
Control 1 1
Control
mod1 <- lm(Weight ~ Time*Group + Cage, data=Weight.data)
Call:
lm(formula = Weight ~ Time * Group + Cage, data = Weight.data)
Residuals:
Min Median Max
Coefficients:
Estimate Std. Error t value
GroupTreatment
Cage2
Time: GroupTreatment
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 391 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 8 and 391 DF, p-value:
Which parameters describe the rate of weight loss with time in each group? According to the output, is there a statistically significant weight loss with time in the control group?
Three diagnostic plots were generated using the following code.
Weight.data$Time[mouse1]
Weight.data$Time[mouse2]
Based on these plots, should you trust the significance tests shown in the output of the command summary(mod1)? Explain.
Paper 4, Section II, J
Bridge is a card game played by 2 teams of 2 players each. A bridge club records the outcomes of many games between teams formed by its members. The outcomes are modelled by
where is a parameter representing the skill of player , and is a parameter representing how well-matched the team formed by and is.
(a) Would it make sense to include an intercept in this logistic regression model? Explain your answer.
(b) Suppose that players 1 and 2 always play together as a team. Is there a unique maximum likelihood estimate for the parameters and ? Explain your answer.
(c) Under the model defined above, derive the asymptotic distribution (including the values of all relevant parameters) for the maximum likelihood estimate of the probability that team wins a game against team . You can state it as a function of the true vector of parameters , and the Fisher information matrix with games. You may assume that as , and that has a unique maximum likelihood estimate for large enough.
Paper 1, Section I, J
The dataset ChickWeights records the weight of a group of chickens fed four different diets at a range of time points. We perform the following regressions in R.
(i) Which hypothesis test does the following command perform? State the degrees of freedom, and the conclusion of the test.
(ii) Define a diagnostic plot that might suggest the logarithmic transformation of the response in fit2.
(iii) Define the dashed line in the following plot, generated with the command plot(fit3). What does it tell us about the data point 579?
Paper 1, Section II, J
The Cambridge Lawn Tennis Club organises a tournament in which every match consists of 11 games, all of which are played. The player who wins 6 or more games is declared the winner.
For players and , let be the total number of games they play against each other, and let be the number of these games won by player . Let and be the corresponding number of matches.
A statistician analysed the tournament data using a Binomial Generalised Linear Model (GLM) with outcome . The probability that wins a game against is modelled by
with an appropriate corner point constraint. You are asked to re-analyse the data, but the game-level results have been lost and you only know which player won each match.
We define a new GLM for the outcomes with and , where the are defined in . That is, is the log-odds that wins a game against , not a match.
Derive the form of the new link function . [You may express your answer in terms of a cumulative distribution function.]
Paper 2, Section I, J
A statistician is interested in the power of a -test with level in linear regression; that is, the probability of rejecting the null hypothesis with this test under an alternative with .
(a) State the distribution of the least-squares estimator , and hence state the form of the -test statistic used.
(b) Prove that the power does not depend on the other coefficients for .
Paper 3, Section I, J
For Fisher's method of Iteratively Reweighted Least-Squares and Newton-Raphson optimisation of the log-likelihood, the vector of parameters is updated using an iteration
for a specific function . How is defined in each method?
Prove that they are identical in a Generalised Linear Model with the canonical link function.
Paper 4, Section I, J
A Cambridge scientist is testing approaches to slow the spread of a species of moth in certain trees. Two groups of 30 trees were treated with different organic pesticides, and a third group of 30 trees was kept under control conditions. At the end of the summer the trees are classified according to the level of leaf damage, obtaining the following contingency table.
Which of the following Generalised Linear Model fitting commands is appropriate for these data? Why? Describe the model being fit.
Paper 4, Section II, J
The dataset diesel records the number of diesel cars which go through a block of Hills Road in 6 disjoint periods of 30 minutes, between 8AM and 11AM. The measurements are repeated each day for 10 days. Answer the following questions based on the code below, which is shown with partial output.
(a) Can we reject the model fit.1 at a level? Justify your answer.
(b) What is the difference between the deviance of the models fit.2 and fit.3?
(c) Which of fit.2 and fit.3 would you use to perform variable selection by backward stepwise selection? Why?
(d) How does the final plot differ from what you expect under the model in fit.2? Provide a possible explanation and suggest a better model.
head(diesel)
period num.cars day
fit.1 = glm(num.cars ~ period, data=diesel, family=poisson)
summary(fit.1)
Deviance Residuals:
Min 1Q Median 3Q Max
Coefficients:
Estimate Std. Error z value
(Intercept)
period
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
Null deviance: on 59 degrees of freedom
Residual deviance: on 58 degrees of freedom
AIC:
diesel$period.factor = factor(diesel$period)
fit.2 = glm(num.cars ~ period.factor, data=diesel, family=poisson)
summary(fit.2)
Coefficients:
Estimate Std. Error z value
Part II, List of Questions
Paper 1, Section I, K
The body mass index (BMI) of your closest friend is a good predictor of your own BMI. A scientist applies polynomial regression to understand the relationship between these two variables among 200 students in a sixth form college. The commands
fit. poly friendBMI , 2, raw=T
fit. poly friendBMI, 3, raw
fit the models and , respectively, with in each case.
Setting the parameters raw to FALSE:
fit. poly friendBMI , 2, raw=F )
fit. poly friendBMI, 3, raw
fits the models and , with . The function is a polynomial of degree . Furthermore, the design matrix output by the function poly with raw=F satisfies:
poly friendBMI, 3, raw poly , raw
How does the variance of differ in the models and ? What about the variance of the fitted values ? Finally, consider the output of the commands
anova(fit.1,fit.2)
anova(fit.3,fit.4)
Define the test statistic computed by this function and specify its distribution. Which command yields a higher statistic?
Paper 1, Section II, K
(a) Let be an -vector of responses from the linear model , with . The internally studentized residual is defined by
where is the least squares estimate, is the leverage of sample , and
Prove that the joint distribution of is the same in the following two models: (i) , and (ii) , with (in this model, are identically -distributed). [Hint: A random vector is spherically symmetric if for any orthogonal matrix . If is spherically symmetric and a.s. nonzero, then is a uniform point on the sphere; in addition, any orthogonal projection of is also spherically symmetric. A standard normal vector is spherically symmetric.]
(b) A social scientist regresses the income of 120 Cambridge graduates onto 20 answers from a questionnaire given to the participants in their first year. She notices one questionnaire with very unusual answers, which she suspects was due to miscoding. The sample has a leverage of . To check whether this sample is an outlier, she computes its externally studentized residual,
where is estimated from a fit of all samples except the one in question, . Is this a high leverage point? Can she conclude this sample is an outlier at a significance level of ?
(c) After examining the following plot of residuals against the response, the investigator calculates the externally studentized residual of the participant denoted by the black dot, which is . Can she conclude this sample is an outlier with a significance level of ?
Paper 2, Section I, K
Define an exponential dispersion family. Prove that the range of the natural parameter, , is an open interval. Derive the mean and variance as a function of the log normalizing constant.
[Hint: Use the convexity of , i.e. for all
Paper 3, Section I, K
The command
boxcox(rainfall ~ month+elnino+month:elnino)
performs a Box-Cox transform of the response at several values of the parameter , and produces the following plot:
We fit two linear models and obtain the Q-Q plots for each fit, which are shown below in no particular order:
Define the variable on the -axis in the output of boxcox, and match each Q-Q plot to one of the models.
After choosing the model fit.2, the researcher calculates Cook's distance for the th sample, which has high leverage, and compares it to the upper -point of an distribution, because the design matrix is of size . Provide an interpretation of this comparison in terms of confidence sets for . Is this confidence statement exact?
Paper 4, Section I, K
(a) Let where for are independent and identically distributed. Let for , and suppose that these variables follow a binary regression model with the complementary log-log link function . What is the probability density function of ?
(b) The Newton-Raphson algorithm can be applied to compute the MLE, , in certain GLMs. Starting from , we let be the maximizer of the quadratic approximation of the log-likelihood around :
where and are the gradient and Hessian of the log-likelihood. What is the difference between this algorithm and Iterative Weighted Least Squares? Why might the latter be preferable?
Paper 4, Section II, K
For 31 days after the outbreak of the 2014 Ebola epidemic, the World Health Organization recorded the number of new cases per day in 60 hospitals in West Africa. Researchers are interested in modelling , the number of new Ebola cases in hospital on day , as a function of several covariates:
lab: a Boolean factor for whether the hospital has laboratory facilities,
casesBefore: number of cases at the hospital on the previous day,
urban: a Boolean factor indicating an urban area,
country: a factor with three categories, Guinea, Liberia, and Sierra Leone,
numDoctors: number of doctors at the hospital,
tradBurials: a Boolean factor indicating whether traditional burials are common in the region.
Consider the output of the following code (with some lines omitted):
fit.1 <- glm(newCases ~ lab+casesBefore+urban+country+numDoctors+tradBurials,
             data=ebola, family=poisson)
summary(fit.1)
Coefficients:
Estimate Std. Error z value
casesBefore
countryLiberia
countrySierra Leone
numDoctors
tradBurialsTRUE
Signif. codes:
(a) Would you conclude based on the $z$-tests that an urban setting does not affect the rate of infection?
(b) Explain how you would predict the total number of new cases that the researchers will record in Sierra Leone on day 32.
We fit a new model which includes an interaction term, and compute a test statistic using the code:
fit.2 <- glm(newCases ~ casesBefore+country+country:casesBefore+numDoctors,
             data=ebola, family=poisson)
fit.2$deviance - fit.1$deviance
[1]
(c) What is the distribution of the statistic computed in the last line?
(d) Under what conditions is the deviance of each model approximately chi-squared?
Paper 1, Section I, J
The outputs of a particular process are positive and are believed to be related to -vectors of covariates according to the following model
In this model are i.i.d. random variables where is known. It is not possible to measure the output directly, but we can detect whether the output is greater than or less than or equal to a certain known value . If
show that a probit regression model can be used for the data .
How can we recover and from the parameters of the probit regression model?
Paper 1, Section II, J
An experiment is conducted where scientists count the numbers of each of three different strains of fleas that are reproducing in a controlled environment. Varying concentrations of a particular toxin that impairs reproduction are administered to the fleas. The results of the experiment are stored in a data frame in R, whose first few rows are given below.
The full dataset has 80 rows. The first column provides the number of fleas, the second provides the concentration of the toxin and the third specifies the strain of the flea as factors 0, 1 or 2. Strain 0 is the common flea and strains 1 and 2 have been genetically modified in a way thought to increase their ability to reproduce in the presence of the toxin.
Explain and interpret the commands and (abbreviated) output below. In particular, you should describe the model being fitted, briefly explain how the standard errors are calculated, and comment on the hypothesis tests being described in the summary.
Explain and motivate the following code in the light of the output above. Briefly explain the differences between the models fitted below, and the model corresponding to it
Denote by the three models being fitted in sequence above. Explain the hypothesis tests comparing the models to each other that can be performed using the output from the following code.
c(fit1$dev, fit2$dev, fit3$dev)
[1]
[1]
Use these numbers to comment on the most appropriate model for the data.
Paper 2, Section I, J
Let be independent Poisson random variables with means , where for some known constants and an unknown parameter . Find the log-likelihood for .
By first computing the first and second derivatives of the log-likelihood for , describe the algorithm you would use to find the maximum likelihood estimator . [Hint: Recall that if then
for .]
Paper 3, Section I, J
Data are available on the number of counts (atomic disintegration events that take place within a radiation source) recorded with a Geiger counter at a nuclear plant. The counts were registered at each second over a 30 second period for a short-lived, man-made radioactive compound. The first few rows of the dataset are displayed below.
Describe the model being fitted with the following command.
fit Counts Time, data=geiger)
Below is a plot against time of the residuals from the model fitted above.
Referring to the plot, suggest how the model could be improved, and write out the code for fitting this new model. Briefly describe how one could test in R whether the new model is to be preferred over the old model.
Paper 4, Section I, J