Exploratory Data Analysis with Multiple Predictors (Lecture Notes for Lecture 13)

  1. Explain what a matrix plot is, and how a matrix plot can be used to investigate the relationship between multiple predictors and a response.
  2. Interpret a 3D scatter plot showing the relationship between two predictors and a response.

Section 6.1: Multiple Linear Regression

  1. Describe the overall goal of multiple linear regression from the perspective of prediction.
  2. State the multiple linear regression model with \(p\) predictors in per-unit (scalar) form.
  3. State the multiple linear regression model with \(p\) predictors in matrix-vector form.
  4. Construct a response vector, predictor vector, and design matrix given a particular multiple linear regression model.
  5. Describe what the multiple linear regression function corresponds to, geometrically, using two predictors.
  6. Describe what the multiple linear regression function corresponds to, geometrically, using more than two predictors.
  7. Interpret the coefficients from a multiple linear regression model in the context of a given problem.
  8. Explain why economists and other end-users of regression talk about regression coefficients in terms of ceteris paribus (Latin for “other things equal”) or “controlling” for the other predictors, and why this sort of thinking is fuzzy at best and entirely fallacious at worst.
  9. Carefully distinguish between statements such as “the response increases by \(u\) units for each unit increase in the predictor” (a causal statement) and “the prediction of the response increases by \(u\) units for each unit increase in the predictor” (a predictive statement).
  10. Explain why the interpretation of the coefficients from a multiple linear regression depends on the predictors included in the model.
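
As a reference sketch for objectives 2 through 4, the model can be written in both forms. The notation here is an assumption and may differ from the lecture's: \(n\) units, an intercept \(\beta_0\), noise terms \(\epsilon_i\), and a design matrix whose first column is all ones.

\[
Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_p X_{ip} + \epsilon_i, \qquad i = 1, \ldots, n
\]

\[
\mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}, \qquad
\mathbf{Y} = \begin{bmatrix} Y_1 \\ \vdots \\ Y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix} 1 & X_{11} & \cdots & X_{1p} \\ \vdots & \vdots & & \vdots \\ 1 & X_{n1} & \cdots & X_{np} \end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}
\]

Under this convention \(\mathbf{X}\) is \(n \times (p + 1)\), and row \(i\) of the matrix-vector form reproduces the per-unit equation for unit \(i\).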

Section 6.3: Estimation of Regression Coefficients

  1. State the objective function that is minimized to determine the ordinary least squares estimator \(\mathbf{b}\) for a multiple linear regression model.
  2. State the form of the ordinary least squares estimator in terms of the design matrix \(\mathbf{X}\) and response vector \(\mathbf{Y}\).
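
A sketch of both statements, assuming a design matrix \(\mathbf{X}\) with a leading column of ones, \(n\) units, \(p\) predictors, and an invertible \(\mathbf{X}^{T} \mathbf{X}\) (the name \(Q\) for the objective function is an assumption):

\[
Q(\mathbf{b}) = \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_{i1} - \cdots - b_p X_{ip} \right)^2 = (\mathbf{Y} - \mathbf{X}\mathbf{b})^{T} (\mathbf{Y} - \mathbf{X}\mathbf{b})
\]

\[
\mathbf{b} = (\mathbf{X}^{T} \mathbf{X})^{-1} \mathbf{X}^{T} \mathbf{Y}
\]

The second line follows from setting the gradient of \(Q\) to zero, which gives the normal equations \(\mathbf{X}^{T} \mathbf{X} \mathbf{b} = \mathbf{X}^{T} \mathbf{Y}\).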

Section 6.4: Fitted Values and Residuals

  1. Explain how to compute the fitted (predicted) values of the response using the design matrix \(\mathbf{X}\) and coefficient vector \(\mathbf{b}\).
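
In matrix form, and assuming the hat notation \(\hat{\mathbf{Y}}\) for fitted values and \(\mathbf{e}\) for residuals (symbols not given in these notes), the computation can be sketched as:

\[
\hat{\mathbf{Y}} = \mathbf{X} \mathbf{b}, \qquad \mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}}
\]

Each fitted value is the inner product of one row of \(\mathbf{X}\) with \(\mathbf{b}\): \(\hat{Y}_i = b_0 + b_1 X_{i1} + \cdots + b_p X_{ip}\) when the first column of \(\mathbf{X}\) is all ones.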

R

  1. Plot a matrix plot from a data frame containing a response variable and multiple predictors using pairs.
  2. Plot a 3D scatter plot of a response variable against two predictor variables using plot3d in the rgl package.
  3. Fit a multiple linear regression from a data frame using lm.
  4. Interpret the output of lm when used to fit a multiple linear regression.
  5. Plot the plane of best fit from a multiple linear regression with two predictor variables using planes3d in the rgl package.
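
The R objectives above can be sketched in one short script. This is a minimal illustration, not the course's own code: it assumes the built-in mtcars data as a stand-in for the lecture's data set, with mpg as the response and wt and hp as the two predictors. The rgl calls run only in an interactive session with rgl installed.

```r
# Assumption: mtcars (ships with R) stands in for the course data.
# Response: mpg; predictors: wt and hp.
dat <- mtcars[, c("mpg", "wt", "hp")]

# Matrix plot of the response and both predictors
pairs(dat)

# Fit the multiple linear regression and inspect the output
fit <- lm(mpg ~ wt + hp, data = dat)
print(summary(fit))

# Cross-check lm against the matrix formulas:
# solve the normal equations (X'X) b = X'Y, then compute fitted values X b
X <- model.matrix(fit)   # design matrix, first column is all ones
Y <- dat$mpg             # response vector
b <- solve(t(X) %*% X, t(X) %*% Y)
fitted_manual <- X %*% b

# 3D scatter plot and plane of best fit (needs the rgl package)
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(dat$wt, dat$hp, dat$mpg,
              xlab = "wt", ylab = "hp", zlab = "mpg")
  co <- coef(fit)
  # planes3d draws the plane a*x + b*y + c*z + d = 0, so the fitted
  # surface z = b0 + b1*x + b2*y becomes b1*x + b2*y - z + b0 = 0
  rgl::planes3d(a = co[["wt"]], b = co[["hp"]], c = -1,
                d = co[["(Intercept)"]], alpha = 0.5)
}
```

The manual solution b should agree with coef(fit) up to rounding, which ties the lm output back to the estimator stated in terms of \(\mathbf{X}\) and \(\mathbf{Y}\).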