Complete the following 4 problems:
Hint for Problems 7.6, 7.8, and 7.9: The function confidenceEllipse in the car package will return a matrix with the \(x\) and \(y\) coordinates of the confidence ellipse that you can plot in Base R using:
plot(confidenceEllipse(...), type = 'l')
where the ... should be replaced with the appropriate arguments to confidenceEllipse.
You can also add a single \((x, y)\) point to a plot in Base R by using the command
points(x, y)
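For concreteness, here is a minimal sketch of that workflow, using mtcars (which ships with R) as a stand-in for the homework data; the particular model, the pair of coefficients, and the confidence level are illustrative choices, not part of the assignment.

library(car)

# Illustrative fit on a built-in dataset; your model will differ
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Joint 95% confidence ellipse for the wt and hp coefficients,
# plotted in Base R as described above
plot(confidenceEllipse(fit, which.coef = c(2, 3), levels = 0.95),
     type = 'l', xlab = 'wt coefficient', ylab = 'hp coefficient')

# Mark the point estimate as a single (x, y) point
points(coef(fit)['wt'], coef(fit)['hp'], pch = 19)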
Read Section 9.2.1 of The Truth About Linear Regression[1] about interpreting coefficients in a linear regression after log-transforming the response. Then answer the following questions.
Note: Everywhere you see a \(\log\), you should assume it is the natural logarithm. As some of you have already heard me say, there is only one logarithm, and it is the natural logarithm.
Assume that the MLRGN model assumptions hold for the multiple linear regression model of the log-response on the predictors: \[ \log Y_{i} = \beta_{0} + \sum_{j = 1}^{p} \beta_{j} X_{ij} + \epsilon_{i}, \qquad i = 1, \ldots, n.\]
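In R, a model of this form is fit by log-transforming the response inside the lm formula. The snippet below is only a sketch, with mtcars standing in for whatever data the problem uses and a hypothetical choice of predictors.

# Hypothetical example: regress log(mpg) on two predictors from mtcars
fit_log <- lm(log(mpg) ~ wt + hp, data = mtcars)
coef(fit_log)   # these beta_j describe changes in log(Y), not in Y itself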
Consider the simple linear regression model \[ \log Y = \beta_{0} + \beta_{1} X + \epsilon.\] As you discovered in the previous problem, \(\beta_{1}\) no longer corresponds to the expected increase in \(Y\) for each unit increase in \(X\). However, we can provide an interpretation for \(\beta_{1}\) that is nearly as easy to understand, as long as \(\beta_{1}\) is sufficiently small in magnitude.
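A hint-style sketch of the algebra you may find useful (the expansion is a standard fact; the interpretation itself is left to you): exponentiating both sides of the model gives \[ Y = e^{\beta_{0} + \beta_{1} X + \epsilon} = e^{\beta_{0}}\, e^{\beta_{1} X}\, e^{\epsilon}, \] and for any \(u\) small in magnitude the first-order expansion \(e^{u} \approx 1 + u\) applies.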
[1] This is one of my favorite textbooks on regression, by one of my favorite statisticians, Cosma Shalizi. As the title suggests, it cuts past a lot of b******t you might read / hear about linear regression and provides just the facts about what a regression can (and cannot) say about a statistical question. If you ever want to know more about linear regression, I strongly recommend this book.