Study Guide for Exam 2
Exam Format
This will be a closed-book, closed-notes exam. You will be able to use R on a campus computer.
You will need a calculator. A calculator that performs arithmetic operations, reciprocals, square roots, powers, and logarithms (base 10) is sufficient. Graphing calculators are permitted. Phone calculators are not permitted.
The exam will consist of 6 problems, 5 of which will be open-ended and 1 of which will be multiple choice. All open-ended problems will be multi-part.
Advice for Preparing for the Exam
This is a mathematics exam. As such, you will be expected to do mathematics. A common misconception among students is that, in preparing for a mathematics exam, it suffices to understand a problem's solution with that solution in front of you. The exam, however, tests your ability to solve problems yourself from scratch. Thus, you should practice solving problems from scratch in preparation for the exam. The homework and quizzes are part of this practice, but completing them only once will likely not lead to a strong performance on the exam.
With the ultimate goal of solving problems yourself from scratch in mind, a possible study check list is below:
- Exam questions will be modeled on in-class examples, homework problems, and quiz problems. As such, go through homework and quiz problems, especially those you found difficult or that were marked as incorrect / incomplete, and make sure you can solve them from scratch, without the solution in front of you.
- Since problems on the exam will not come tagged with the chapter / section relevant to their solutions, you might try solving the quiz and homework problems in a random order. If you would like help with this, I can show you how to construct a randomized list of problems using R.
- Review the compilation of learning objectives below, and make sure you can perform each without your notes or book. If you are unclear on a particular learning objective, email or see me.
- Cramming (studying right before the exam) combined with massed practice (solving lots of problems all at once) is an ineffective study strategy. Instead, distribute your study sessions over the remaining time until the exam. Study for at most 60 to 90 minutes at a time, and take breaks, whether legitimate breaks (go for a walk, hang out with friends, etc.), or a transition to studying for another course. The sooner before the exam you start studying, the better.
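As one illustration of the randomization idea above, R's sample() function will shuffle a list of problems into a random study order. The problem labels here are hypothetical placeholders; substitute your own homework and quiz problems.

```r
# Shuffle a hypothetical list of homework/quiz problems into a random order.
problems <- c("HW3 #2", "HW4 #1", "Quiz 2 #3", "HW5 #4", "Quiz 3 #1")

shuffled <- sample(problems)   # returns the same problems in a random order
print(shuffled)
```

Calling sample() with a single vector argument and no size returns a random permutation of that vector, which is exactly a randomized problem list.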
To do well on the exam, you should be able to do the following:
Chapter 3: Discrete Random Variables
Section 3.2: Probability Distributions for Discrete Random Variables
- State the experiment that is modeled by a discrete uniform random variable.
- State the probability mass function and cumulative distribution function for a discrete uniform random variable with a given range.
- Use R to simulate a discrete uniform random variable.
- State the experiment that is modeled by a Bernoulli random variable.
- State the probability mass function and cumulative distribution function for a Bernoulli random variable.
- Use R to simulate a Bernoulli random variable.
- State the experiment that is modeled by a geometric random variable.
- State the probability mass function and cumulative distribution function for a geometric random variable.
- Use R to simulate a geometric random variable.
- Describe how a probability histogram represents a probability mass function, and construct a probability histogram given a probability mass function and vice versa.
- Relate probability histograms to the other histograms we have learned in class: frequency, relative frequency, and density histograms.
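A minimal sketch of the three simulations above, with hypothetical parameter values. Note the hedge on rgeom: R counts failures before the first success, which may differ by one from your course's definition.

```r
# Simulating the three discrete distributions from this section.
set.seed(42)  # for reproducibility

# Discrete uniform on {1, ..., 10}: each value equally likely.
x_unif <- sample(1:10, size = 5, replace = TRUE)

# Bernoulli(p): a binomial random variable with a single trial.
x_bern <- rbinom(5, size = 1, prob = 0.3)

# Geometric(p): rgeom counts FAILURES before the first success,
# so add 1 if your course defines X as the trial number of the first success.
x_geom <- rgeom(5, prob = 0.3) + 1
```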
Section 3.3: Expected Values of Discrete Random Variables
- Define the expected value of a discrete random variable \(X\), and compute \(X\)’s expected value given its probability mass function.
- Give a frequency interpretation of the expected value of a random variable \(X\).
- Recognize the convention of using Greek letters for parameters of a random variable, and their Roman counterpart for statistics of a sample. For example, \(\sigma\) for the standard deviation of a random variable and \(s\) for the standard deviation of a sample.
- Compute the expected value of a random variable \(Y = g(X)\) defined through a function \(g\) of a random variable \(X\) with a known probability mass function.
- Simplify expectations of the form \(E[a X + b]\) and \(E\left[\sum\limits_{j = 1}^{n} g_{j}(X)\right]\) directly, without resorting to the definition of expectation.
- Define the variance of a discrete random variable \(X\), and compute \(X\)’s variance given its probability mass function.
- Define the standard deviation of a random variable.
- Use the ‘variance shortcut’ to compute the variance of a random variable.
- Simplify variances (standard deviations) of the form \(\text{Var}(a X + b)\) (\(\sigma_{a X + b}\)) directly, without resorting to the definition of variance (standard deviation).
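The expectation, variance shortcut, and standard deviation computations above can be sketched in R for a small hypothetical probability mass function:

```r
# E[X], the variance shortcut Var(X) = E[X^2] - (E[X])^2, and the
# standard deviation, for a hypothetical pmf on {0, 1, 2, 3}.
x <- c(0, 1, 2, 3)
p <- c(0.1, 0.2, 0.3, 0.4)   # hypothetical pmf; must sum to 1

EX  <- sum(x * p)            # expected value: 2
EX2 <- sum(x^2 * p)          # E[X^2], via E[g(X)] = sum g(x) p(x): 5
VX  <- EX2 - EX^2            # variance shortcut: 5 - 4 = 1
SDX <- sqrt(VX)              # standard deviation: 1
```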
Section 3.5: The Binomial Probability Distribution
- Identify the 4 characteristics of a binomial experiment, and determine whether a given experiment has those characteristics.
- Determine the sample space for a given binomial experiment.
- State in what way a binomial random variable acts as a function from the sample space of a binomial experiment. That is, what does the binomial random variable output for a given outcome in the sample space?
- State the probability mass function for a binomial random variable with parameters \(n\) and \(p\).
- Identify the parameters of a binomial random variable for a given binomial experiment.
- Use the formula for the probability mass function of a binomial random variable to compute \(P(X = x; n, p)\) directly.
- Use R to compute binomial probabilities.
- State the mean and variance of a binomial random variable, and compute the mean and variance from a given binomial experiment.
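A sketch of the binomial computations above in R, using hypothetical parameters \(n = 10\) and \(p = 0.4\):

```r
# Binomial probabilities in R for hypothetical parameters n = 10, p = 0.4.
n <- 10; p <- 0.4

p_eq3  <- dbinom(3, size = n, prob = p)      # P(X = 3), the pmf
p_le3  <- pbinom(3, size = n, prob = p)      # P(X <= 3), the cdf
p_ge3  <- 1 - pbinom(2, size = n, prob = p)  # P(X >= 3), via the complement

mean_X <- n * p            # mean: np = 4
var_X  <- n * p * (1 - p)  # variance: np(1 - p) = 2.4
```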
Chapter 4: Continuous Random Variables
Section 4.1: Probability Density Functions and Cumulative Distribution Functions
- Give examples of quantities that could be modeled using a continuous random variable.
- Recognize and explain the correspondence between density histograms for data and probability density functions for random variables.
- Given a probability density function \(f\) for a continuous random variable \(X\) and a query region \([a, b]\), determine \(P(X \in [a, b]) = P(a \leq X \leq b)\) using \(f\).
- State the two sufficient conditions for a function \(f\) to be a valid probability density function, and identify when a given function \(f\) does or does not satisfy these conditions.
- Given that a probability density function is proportional to a known function, i.e. \(f(x) = c g(x)\), determine the constant \(c\) that makes \(f\) a probability density function.
- Define the cumulative distribution function of a continuous random variable, and compute the cumulative distribution function using the random variable’s probability density function.
- Specify what additional property, beyond the four properties shared by all cumulative distribution functions, is unique to the cumulative distribution function of a continuous random variable.
- Compute a probability query \(P(X \in (a, b)) = P(a < X < b)\) using the cumulative distribution function of a continuous random variable.
- Determine the probability density function of a continuous random variable from its cumulative distribution function.
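The probability queries above can be checked numerically with R's integrate. This sketch uses the hypothetical density \(f(x) = 2x\) on \([0, 1]\):

```r
# Checking a density and computing P(a <= X <= b) by numerical integration,
# for the hypothetical density f(x) = 2x on [0, 1].
f <- function(x) 2 * x

total <- integrate(f, 0, 1)$value        # should equal 1 for a valid density
prob  <- integrate(f, 0.25, 0.75)$value  # P(0.25 <= X <= 0.75)

# Cross-check with the cdf F(x) = x^2 on [0, 1]:
# F(0.75) - F(0.25) = 0.5625 - 0.0625 = 0.5.
```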
Section 4.2: Expected Values and Moment Generating Functions
- Define the expected value of a continuous random variable \(X\), and compute \(X\)’s expected value given its probability density function.
- Recognize the correspondence between sums and probability mass functions for discrete random variables and integrals and probability density functions for continuous random variables.
- Compute the expected value of a random variable \(Y = g(X)\) defined through a function \(g\) of a continuous random variable \(X\) with a known probability density function.
- Simplify expectations of the form \(E[a X + b]\) and \(E\left[\sum\limits_{j = 1}^{n} g_{j}(X)\right]\) directly, without resorting to the definition of expectation.
- Define the variance of a continuous random variable \(X\), and compute \(X\)’s variance given its probability density function.
- Use the ‘variance shortcut’ to compute the variance of a random variable.
- Simplify variances (standard deviations) of the form \(\text{Var}(a X + b)\) (\(\sigma_{a X + b}\)) directly, without resorting to the definition of variance (standard deviation).
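The continuous analogues of the expectation and variance computations replace sums with integrals. A sketch using the same hypothetical density \(f(x) = 2x\) on \([0, 1]\):

```r
# E[X] and Var(X) by numerical integration for the hypothetical
# density f(x) = 2x on [0, 1].
f <- function(x) 2 * x

EX  <- integrate(function(x) x   * f(x), 0, 1)$value  # E[X]   = 2/3
EX2 <- integrate(function(x) x^2 * f(x), 0, 1)$value  # E[X^2] = 1/2
VX  <- EX2 - EX^2                                     # variance shortcut: 1/18
```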
Section 4.3: The Normal Distribution
- State the probability density function for a Gaussian random variable with parameters \(\mu\) and \(\sigma^{2}\).
- Recognize the notation \(X \sim N(\mu, \sigma^{2})\) as indicating that \(X\) is a Gaussian random variable with parameters \(\mu\) and \(\sigma^{2}\).
- State the mean and variance of a Gaussian random variable with parameters \(\mu\) and \(\sigma^{2}\).
- Sketch a graph of the probability density function for a Gaussian random variable with parameters \(\mu\) and \(\sigma^{2}\), getting the general shape and placement of the probability density function correct, and use this graph to reason about the area under consideration for a given probability query.
- Sketch a graph of the cumulative distribution function for a Gaussian random variable with parameters \(\mu\) and \(\sigma^{2}\), getting the general shape and placement of the cumulative distribution function correct.
- State the definition of a standard Gaussian random variable, and recognize the notation that \(Z\) will often be used to denote a standard Gaussian random variable.
- Recognize and use the convention of denoting the cumulative distribution function for a standard Gaussian random variable via \(\Phi(z) = P(Z \leq z)\).
- Standardize a random variable \(X \sim N(\mu, \sigma^{2})\), and indicate the distribution of the transformed random variable.
- Compute probability queries for a Gaussian random variable using R.
- Define the \(p\)-th percentile of a Gaussian random variable, and determine the \(p\)-th percentile using R.
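A sketch of Gaussian probability queries and percentiles in R, with hypothetical parameters \(\mu = 100\) and \(\sigma = 15\):

```r
# Gaussian probability queries for hypothetical parameters mu = 100, sigma = 15.
mu <- 100; sigma <- 15

p_le  <- pnorm(120, mean = mu, sd = sigma)              # P(X <= 120)
p_mid <- pnorm(120, mu, sigma) - pnorm(90, mu, sigma)   # P(90 <= X <= 120)
p_std <- pnorm((120 - mu) / sigma)                      # same as p_le, after
                                                        # standardizing to Z
pct95 <- qnorm(0.95, mean = mu, sd = sigma)             # 95th percentile
```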
Chapter 6: Statistics of Random Samples and Their Sampling Distributions
Section 6.1: Statistics and Their Distributions
- Define a random sample from a population.
- Explain why a statistic of a random sample is itself a random variable.
- Define the sampling distribution of a statistic.
- Given a probability model for how a random sample is generated, construct the sampling distribution of a statistic of that random sample for small sample sizes (\(n = 2\) or \(3\)).
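For small \(n\), the sampling distribution of a statistic can be constructed exactly by enumerating all possible samples. A sketch for the sample mean with \(n = 2\), using a hypothetical population taking values 1, 2, 3:

```r
# Exact sampling distribution of the sample mean for n = 2 iid draws from a
# hypothetical population taking values 1, 2, 3 with probabilities 0.5, 0.3, 0.2.
vals  <- c(1, 2, 3)
probs <- c(0.5, 0.3, 0.2)

pairs <- expand.grid(x1 = vals, x2 = vals)       # all 9 ordered samples
pairs$prob <- probs[pairs$x1] * probs[pairs$x2]  # independence (the values
                                                 # 1, 2, 3 double as indices)
pairs$xbar <- (pairs$x1 + pairs$x2) / 2          # the statistic of interest

samp_dist <- tapply(pairs$prob, pairs$xbar, sum) # collapse by value of xbar
```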
Section 6.2: The Distribution of the Sample Mean
- Recognize and explain the notation \(X_{1}, X_{2}, \ldots, X_{n} \stackrel{\text{iid}}{\sim} \text{D}\) for a random sample from a population with distribution D.
- Compute the mean and variance of the sample mean for a random sample \(X_{1}, X_{2}, \ldots, X_{n} \stackrel{\text{iid}}{\sim} \text{D}\) of size \(n\) from a population with mean \(\mu\) and variance \(\sigma^{2}\).
- Distinguish between the mean and variance of a population and the mean and variance of the sample mean of a random sample from that population.
- State the premises (conditions) and conclusion of the Central Limit Theorem.
- Explain in what sense the Central Limit Theorem is an asymptotic result.
- State under what conditions the Central Limit Theorem-based approximation for the sampling distribution of the sample mean works well for small sample sizes.
- Use the Central Limit Theorem to approximate the sampling distribution of the sample mean for a random sample from a population with known mean and variance.
- Use the Central Limit Theorem to approximate the sampling distribution of the sample total for a random sample from a population with known mean and variance.
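The two CLT-based approximations above can be sketched in R with hypothetical population parameters \(\mu = 50\), \(\sigma = 10\) and sample size \(n = 64\):

```r
# CLT approximations for the sample mean and the sample total,
# for a hypothetical population with mu = 50, sigma = 10 and n = 64.
mu <- 50; sigma <- 10; n <- 64

# Xbar is approximately N(mu, sigma^2 / n), so e.g. P(Xbar <= 52):
p_mean  <- pnorm(52, mean = mu, sd = sigma / sqrt(n))

# The total T = X1 + ... + Xn is approximately N(n * mu, n * sigma^2),
# so e.g. P(T <= 3300):
p_total <- pnorm(3300, mean = n * mu, sd = sqrt(n) * sigma)
```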
Chapter 7: Point Estimators
Section 7.1: Point Estimation — General Concepts and Criteria
- Compare and contrast descriptive and inferential statistics in terms of populations/samples and parameters/statistics.
- Explain why you can never answer a question about a population directly from a sample without making an inference.
- State the three main types of inferential procedures we will consider in this course, and describe them in plain English.
- Define point estimator, and state what a point estimator is an estimator for.
- Distinguish between a point estimator (the procedure) and a point estimate (an application of the procedure to a particular sample).
- State common point estimators for population parameters, including point estimators for population means, variances, and success probabilities.
- Given a collection of point estimators and their expected values and variances, come up with a rough ranking of how “good” the point estimators are.
- Recognize the statistical notation of using \(\theta\) for a generic parameter of a population, and \(\widehat{\theta}\) for a point estimator of that parameter.
- Define the standard error of a point estimator, and compute the standard error of a point estimator given its sampling distribution.
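A sketch of the standard-error computations for two common point estimators, using hypothetical sample values:

```r
# Standard errors of two common point estimators (hypothetical values).

# Sample mean: SE(xbar) = sigma / sqrt(n), with sigma estimated by s
# when the population standard deviation is unknown.
s <- 4; n <- 25
se_mean <- s / sqrt(n)                  # 4 / 5 = 0.8

# Sample proportion: SE(phat) = sqrt(phat * (1 - phat) / n).
phat <- 0.6
se_prop <- sqrt(phat * (1 - phat) / n)
```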
Chapter 8: Interval Estimators and Confidence Intervals
Section 8.1: Basic Properties of Confidence Intervals
- Determine the margin of error for the sample mean from a Gaussian population with known standard deviation \(\sigma\) at a confidence level \(c\).
- Recognize the relationship between a confidence level \(c\) and tail value \(\alpha\), and state how these specify the probability in the tails or body of a given distribution.
- Determine the critical value \(z_{\alpha}\) for a standard Gaussian random variable using R’s qnorm.
- Define interval estimator and confidence interval, and make an analogy between them and point estimators and point estimates.
- Construct a \(c = 1 - \alpha\) confidence interval for a population mean \(\mu\) using a random sample from a Gaussian population when the population standard deviation \(\sigma\) is known.
- Sketch the confidence interval from the previous learning objective.
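The known-\(\sigma\) confidence interval construction above can be sketched in R with hypothetical sample values:

```r
# A c = 0.95 z confidence interval for mu with known sigma,
# using hypothetical sample values.
xbar <- 52.3; sigma <- 4; n <- 25
alpha <- 0.05
z_crit <- qnorm(1 - alpha / 2)          # critical value z_{alpha/2} ~ 1.96

margin <- z_crit * sigma / sqrt(n)      # margin of error
ci <- c(xbar - margin, xbar + margin)   # the confidence interval
```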
Section 8.3: Intervals Based on a Normal Population Distribution
- Give a constructive definition of a \(t\)-distributed random variable \(T\) using a random sample from a Gaussian population.
- Determine the parameter of a \(t\)-distributed random variable \(T\) constructed using a random sample from a Gaussian population.
- Describe the general properties of the probability density function of a \(t\)-distributed random variable as compared to a standard Gaussian random variable including where each is centered, the symmetry properties of each, and the “fatness” of the tails of each.
- Determine the critical value \(t_{\alpha, \nu}\) for a random variable \(T\) that is \(t\)-distributed with parameter \(\nu\) using R’s qt.
- Construct a \(c = 1 - \alpha\) confidence interval for a population mean \(\mu\) using a random sample from a Gaussian population when the population standard deviation \(\sigma\) is unknown.
- Sketch the confidence interval from the previous learning objective.
- Reason about which confidence interval we have covered, if any, is appropriate for the population mean, given a description of a sample from that population.
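The unknown-\(\sigma\) interval above replaces \(\sigma\) with \(s\) and the \(z\) critical value with a \(t\) critical value on \(n - 1\) degrees of freedom. A sketch using a hypothetical sample:

```r
# A c = 0.95 t confidence interval for mu with sigma unknown,
# using a hypothetical sample.
x <- c(12.1, 11.8, 12.5, 12.0, 11.6, 12.3)
n <- length(x)
alpha <- 0.05
t_crit <- qt(1 - alpha / 2, df = n - 1)     # critical value t_{alpha/2, n-1}

margin <- t_crit * sd(x) / sqrt(n)
ci <- c(mean(x) - margin, mean(x) + margin)
```

As a cross-check, t.test(x, conf.level = 0.95)$conf.int computes the same interval.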