Homework 23

Instructions:

For each of the following problems:

  1. Use R to generate a scatter plot of the response versus the predictor, with the predictor on the horizontal axis.
  2. Print out the scatter plot.
  3. Annotate the printout with a description of the trend between the predictor and the response, including its strength.
  4. Use R to estimate the linear regression model using lm. Record the slope and intercept reported by R.
  5. Interpret the slope and intercept of the linear regression model in the context of the problem.
  6. Write the prediction function \(\widehat{y} = b_{0} + b_{1}x\), substituting in the appropriate values for the parameters.
  7. Use summary to find the standard error of prediction (R labels it the residual standard error) and record it.
  8. Use R to compute the median absolute error of prediction. Interpret the median absolute error of prediction in the context of the problem.
  9. Give the prediction for the response variable at the specified value of the predictor variable. That is, evaluate \(\widehat{y} = b_{0} + b_{1} x\) at the given value of the predictor.
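
The steps above can be sketched in R as follows. This is only an illustration under stated assumptions: it uses R's built-in cars data set (speed as the predictor, dist as the response) rather than the homework data, the value 20 used in Step 9 is arbitrary, and it takes the median absolute error of prediction to be the median of the absolute residuals. Substitute your own data frame and variable names.

```r
# Illustrative data only: the built-in `cars` data set
# (speed in mph as predictor, stopping distance in ft as response).
data(cars)

# Steps 1-3: scatter plot with the predictor on the horizontal axis.
plot(dist ~ speed, data = cars,
     xlab = "Speed (mph)", ylab = "Stopping distance (ft)")

# Step 4: fit the linear regression model and record the coefficients.
fit <- lm(dist ~ speed, data = cars)
b <- coef(fit)   # b[1] is the intercept b0, b[2] is the slope b1
print(b)

# Step 7: summary() prints this value as the "Residual standard error".
se_pred <- summary(fit)$sigma
print(se_pred)

# Step 8: median absolute error, assumed here to be median(|residuals|).
med_abs_err <- median(abs(residuals(fit)))
print(med_abs_err)

# Step 9: prediction at a chosen predictor value (20 mph is arbitrary).
y_hat <- predict(fit, newdata = data.frame(speed = 20))
print(y_hat)
```

The call to predict evaluates the same expression as \(\widehat{y} = b_{0} + b_{1} x\), so computing b[1] + b[2] * 20 by hand should give the same number.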

Problems

  1. Use Pearson's height data from the course website, but predict the heights of the fathers using the heights of their sons. Complete each of the steps above. For Step 9, predict the height of the father of a 6-foot-tall son. Are the slope and intercept for predicting the height of a father from the height of his son the same as the slope and intercept for predicting the height of a son from the height of his father?
  2. Use the SAT data from the wooldridge package in R:
    1. Install the package in R using: install.packages("wooldridge")
    2. Load the package using: library(wooldridge)
    3. Access the data frame as: wooldridge::gpa2
    Predict the first semester GPA (colgpa) of a student using the size of their high school graduating class (hsize, which is reported in 100s of students). For Step 9, predict the first semester GPA of a student whose high school graduating class was (approximately) the same size as yours. How close is this to your GPA for your first semester in college? Does it make sense to use the regression model fit to this data set to predict your first semester GPA? Why or why not?
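
Because hsize is recorded in hundreds of students, a class size must be rescaled before it is used in Step 9. A minimal sketch, assuming a hypothetical class of 350 students (replace it with your own) and assuming the wooldridge package is already installed; the model fit is skipped if it is not:

```r
# Hypothetical class size in students; replace with your own.
class_size <- 350

# hsize is in hundreds of students, so 350 students becomes 3.5.
new_student <- data.frame(hsize = class_size / 100)

# Fit and predict only if the wooldridge package is available.
if (requireNamespace("wooldridge", quietly = TRUE)) {
  fit_gpa <- lm(colgpa ~ hsize, data = wooldridge::gpa2)
  print(predict(fit_gpa, newdata = new_student))
}
```

Passing the raw count (350) instead of 3.5 would ask the model for a prediction far outside the scale on which hsize is recorded.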