Study Guide for Exam 3
This will be a closed-book exam. You will be allowed to use R on a school computer for computations. This means you will not have access to, nor will you need to use, anything beyond Base R (e.g., mosaic or MUsaic) for the exam.
To do well on the exam, you should be able to do the following:
Chapter 15
How confidence intervals behave
- Distinguish between an interval estimator and an interval estimate (aka “confidence interval”).
- Explain what the probability \(c\) refers to for a confidence interval with confidence level \(c\).
- Identify how the width of a confidence interval varies as \(n\), \(c\), and \(s_{X}\) vary.
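The dependence of the width on \(n\), \(c\), and \(s_{X}\) can be checked numerically. The following is a minimal Base R sketch, assuming the \(t\)-based interval from Chapter 17, whose width is \(2 \, t^{*} \, s_{X} / \sqrt{n}\). The helper name ci_width and all input values are hypothetical and chosen only for illustration.

```r
# Width of a two-sided t-based confidence interval: 2 * t* * s / sqrt(n).
# ci_width is a hypothetical helper, not a function from the course materials.
ci_width <- function(n, conf_level, s) {
  # Critical value leaving (1 - conf_level) / 2 in each tail
  t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)
  2 * t_star * s / sqrt(n)
}

w_base     <- ci_width(n = 25,  conf_level = 0.95, s = 10)
w_larger_n <- ci_width(n = 100, conf_level = 0.95, s = 10)  # larger n: narrower
w_higher_c <- ci_width(n = 25,  conf_level = 0.99, s = 10)  # higher c: wider
w_larger_s <- ci_width(n = 25,  conf_level = 0.95, s = 20)  # larger s: wider

c(base = w_base, larger_n = w_larger_n,
  higher_c = w_higher_c, larger_s = w_larger_s)
```

Doubling \(s_{X}\) doubles the width exactly, while quadrupling \(n\) slightly more than halves it, because the critical value also shrinks as the degrees of freedom grow.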
Chapter 17
The \(t\) distributions
- State the sampling distribution of a \(Z\)-score for a sample mean from a Normal population.
- State the sampling distribution of a \(T\)-score for a sample mean from a Normal population.
- Identify the degrees of freedom for the \(t\) distribution resulting from a \(T\)-score for a sample mean.
- Compare and contrast the density curve for a \(t\)-distributed random variable to the density curve of a standard Normal random variable.
- Identify how the shape of the density curve of a \(t\)-distributed random variable changes as its degrees of freedom increase.
- Compute a critical value \(t_{\alpha, n-1}\) for a \(t\)-distribution with \(n - 1\) degrees of freedom using qt in R.
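A short Base R sketch of computing critical values with qt follows. It assumes that \(t_{\alpha, n-1}\) denotes the value with area \(\alpha\) to its right; if the course uses the left-tail convention instead, the argument to qt changes accordingly. The values of alpha and n are hypothetical.

```r
alpha <- 0.05
n <- 10

# Assumption: t_{alpha, n-1} has area alpha to its RIGHT.
t_upper <- qt(1 - alpha, df = n - 1)
# The same value, asking qt for the upper tail directly.
t_upper_alt <- qt(alpha, df = n - 1, lower.tail = FALSE)

# A two-sided procedure splits alpha evenly between the two tails.
t_two_sided <- qt(1 - alpha / 2, df = n - 1)

# As the degrees of freedom increase, t critical values decrease toward
# the corresponding standard Normal critical value.
dfs <- c(2, 5, 30, 1000)
t_crit <- qt(1 - alpha / 2, df = dfs)
z_crit <- qnorm(1 - alpha / 2)

t_crit
z_crit
```

The shrinking critical values reflect the \(t\) density having heavier tails than the standard Normal density, with the difference fading as the degrees of freedom increase.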
The one-sample \(t\) confidence interval
- Compute a \(t\)-based confidence interval given relevant information about a random sample from a Normal population.
- Sketch how the \(t\)-based confidence interval for a population mean is related to the sample mean, critical value, and estimate of the standard error of the sample mean.
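As an illustration, a \(t\)-based confidence interval can be computed from summary statistics in Base R as sketched below. The summary values are hypothetical, and the sketch assumes a random sample from a Normal population.

```r
# Hypothetical summary statistics
x_bar <- 52.3       # sample mean
s <- 6.1            # sample standard deviation
n <- 16             # sample size
conf_level <- 0.95  # confidence level c

se_hat <- s / sqrt(n)                               # estimated standard error
t_star <- qt(1 - (1 - conf_level) / 2, df = n - 1)  # critical value
margin <- t_star * se_hat                           # margin of error

# The interval is centered at the sample mean and extends one margin
# of error in each direction.
ci <- c(lower = x_bar - margin, upper = x_bar + margin)
ci
```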
Chapter 14
Hypothesis testing
- Identify what types of quantities a statistical hypothesis always addresses.
- Identify the population parameter from a statistical claim.
- Given a claim about a population parameter \(\theta\) as a “word problem,” write the claim as \(\theta \, \, \square \, \, \theta_{0}\) where the “box” \(\square\) contains one of \(=, \neq, \leq, \geq, <,\) or \(>\) and \(\theta_{0}\) is the claimed value of the population parameter.
- Define null hypothesis and alternative hypothesis.
- Given a claim in the form \(\theta \, \, \square \, \, \theta_{0}\), identify whether the claim is a null hypothesis or an alternative hypothesis.
- Given a claim in the form \(\theta \, \, \square \, \, \theta_{0}\), identify the associated null and alternative hypotheses for the claim.
- Recognize the symbols \(H_{0}\) (“H-naught”) and \(H_{a}\) (“H-a”) as denoting the null and alternative hypotheses, respectively.
- Explain why it never (no really, never) makes sense to frame a statistical hypothesis in terms of a sample statistic.
Tests for a population mean
- Given a claim about a population mean \(\mu_{X}\) as a “word problem,” write the claim as \(\mu_{X} \, \, \square \, \, \mu_{0}\) where the “box” \(\square\) contains one of \(=, \neq, \leq, \geq, <,\) or \(>\) and \(\mu_{0}\) is the claimed value of the population mean.
- Given a claim about a population mean \(\mu_{X}\) as a “word problem,” determine the corresponding null and alternative hypotheses.
- State a reasonable test statistic for testing a claim about a population mean.
Chapter 14
The one-sample \(t\) test
- Explain why a \(T\)-score can be used to test a claim about a population mean.
- Compute the observed \(T\)-score, denoted \(t_{\text{obs}}\), from a “word problem” about a population mean.
- Given an observed sample mean, sample standard deviation, and sample size, and a pair of null / alternative hypotheses, determine whether the observed sample mean provides evidence against the null hypothesis (equivalently, for the alternative hypothesis).
- State the sampling distribution of the \(T\)-score under the null hypothesis when the population distribution is Normal (or approximately so).
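The observed \(T\)-score is the distance between the sample mean and the claimed mean, measured in estimated standard errors. A minimal Base R sketch with hypothetical numbers:

```r
# Hypothetical values from a "word problem"
x_bar <- 52.3  # observed sample mean
s <- 6.1       # observed sample standard deviation
n <- 16        # sample size
mu_0 <- 50     # value of the population mean claimed by the null hypothesis

# Observed T-score
t_obs <- (x_bar - mu_0) / (s / sqrt(n))

# Under the null hypothesis and a Normal population, the T-score follows
# a t distribution with n - 1 degrees of freedom.
df <- n - 1

t_obs
df
```

A value of t_obs far from 0, in the direction favored by the alternative hypothesis, is evidence against the null hypothesis.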
Chapter 14
\(P\)-value and statistical significance
- State how a \(P\)-value is defined for an observed test statistic and a given null hypothesis.
- State the rejection rule for using the \(P\)-value of an observed test statistic to reject (or not) the null hypothesis at a given significance level \(\alpha\).
Chapter 15
How hypothesis tests behave
- Define the two types of errors one can make in a hypothesis test, and state their names.
- Relate the two types of errors that one can make in a hypothesis test to decisions in legal trials and during warfare.
- Define the two error rates of a hypothesis test.
- Recognize \(\alpha\) as indicating the Type I Error Rate of a hypothesis test.
- Recognize the synonym of “significance level” for the Type I Error Rate \(\alpha\).
- State which of the two error rates is explicitly controlled in constructing a hypothesis test.
- Define test statistic, and explain the distinction between a test statistic and an observed test statistic.
- Define the rejection region of a hypothesis test, and state how the rejection region is used to make a conclusion about a null hypothesis.
- State the 7-step procedure for testing a claim about a population using a hypothesis test that controls the Type I Error Rate.
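One way to see what it means for a test to control the Type I Error Rate is a small simulation. The sketch below is an illustration and not part of the course materials: it repeatedly draws samples from a Normal population for which the null hypothesis is true, applies a two-sided \(t\)-test at level \(\alpha\), and records how often the null hypothesis is (wrongly) rejected. All settings are hypothetical.

```r
set.seed(1)       # for reproducibility
alpha <- 0.05     # significance level (Type I Error Rate)
n <- 20           # sample size
mu_0 <- 0         # value claimed by the null hypothesis
n_sims <- 10000   # number of simulated samples

t_star <- qt(1 - alpha / 2, df = n - 1)  # two-sided critical value

rejected <- replicate(n_sims, {
  x <- rnorm(n, mean = mu_0, sd = 1)            # null hypothesis is TRUE here
  t_obs <- (mean(x) - mu_0) / (sd(x) / sqrt(n))
  abs(t_obs) > t_star                           # TRUE means a Type I Error
})

# Proportion of simulated samples with a Type I Error; should be near alpha
type1_rate <- mean(rejected)
type1_rate
```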
Chapter 17
The one-sample \(t\) test
- State the test statistic that is used for a one-sample \(t\)-test, and give its sampling distribution when the underlying population is Normal and the null hypothesis is true.
- State the left-sided, right-sided, and two-sided rejection regions for the one-sample \(t\) test.
- Identify when a left-sided, right-sided, or two-sided rejection region is appropriate to test a given null hypothesis.
- Construct a left-sided, right-sided, or two-sided rejection region given a claim about a population and a significance level \(\alpha\).
- Determine whether an observed test statistic “falls in” a rejection region.
- Determine the appropriate tail of the \(t\)-distribution to use to compute a \(P\)-value for a given null hypothesis.
- Compute the \(P\)-value for an observed \(t\)-statistic, and use the \(P\)-value to conclude whether or not to reject a given null hypothesis.
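The rejection regions and \(P\)-values for the three kinds of alternative hypotheses can be sketched in Base R with qt and pt. The observed statistic, sample size, and significance level below are hypothetical.

```r
# Hypothetical inputs
t_obs <- 1.8
n <- 16
alpha <- 0.05
df <- n - 1

# Left-sided (alternative: mu < mu_0): reject for small t_obs
reject_left <- t_obs < qt(alpha, df)
p_left <- pt(t_obs, df)                         # area to the LEFT of t_obs

# Right-sided (alternative: mu > mu_0): reject for large t_obs
reject_right <- t_obs > qt(1 - alpha, df)
p_right <- pt(t_obs, df, lower.tail = FALSE)    # area to the RIGHT of t_obs

# Two-sided (alternative: mu != mu_0): reject for large |t_obs|
reject_two <- abs(t_obs) > qt(1 - alpha / 2, df)
p_two <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)  # both tails

data.frame(
  alternative = c("left", "right", "two-sided"),
  reject = c(reject_left, reject_right, reject_two),
  p_value = c(p_left, p_right, p_two)
)
```

In each row, rejecting because t_obs falls in the rejection region agrees with rejecting because the \(P\)-value is less than \(\alpha\).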
Chapter 17
Testing a Two-sided Hypothesis Using a Confidence Interval (Lecture Notes for Lecture 16)
- State the procedure for testing a null hypothesis \(\mu_{X} = \mu_{0}\) at significance level \(\alpha\) using a confidence interval.
- State the rationale for why we can test a null hypothesis \(\mu_{X} = \mu_{0}\) at significance level \(\alpha\) using a confidence interval with confidence level \(c = 1 - \alpha\).
- Perform a hypothesis test for the null hypothesis \(\mu_{X} = \mu_{0}\) given all of the relevant information.
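The following Base R sketch illustrates the idea with hypothetical summary statistics: reject \(\mu_{X} = \mu_{0}\) at significance level \(\alpha\) when \(\mu_{0}\) falls outside the confidence interval with confidence level \(c = 1 - \alpha\). It also checks that this matches the decision from the two-sided rejection region.

```r
# Hypothetical summary statistics and claim
x_bar <- 52.3
s <- 6.1
n <- 16
mu_0 <- 50
alpha <- 0.05

se_hat <- s / sqrt(n)
t_star <- qt(1 - alpha / 2, df = n - 1)

# Confidence interval with confidence level c = 1 - alpha
lower <- x_bar - t_star * se_hat
upper <- x_bar + t_star * se_hat

# Decision using the confidence interval
reject_by_ci <- (mu_0 < lower) || (mu_0 > upper)

# Decision using the two-sided rejection region
t_obs <- (x_bar - mu_0) / se_hat
reject_by_t <- abs(t_obs) > t_star

c(reject_by_ci = reject_by_ci, reject_by_t = reject_by_t)
```

The two decisions agree because \(\mu_{0}\) lies outside the interval exactly when the sample mean is more than \(t^{*}\) estimated standard errors away from \(\mu_{0}\).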
One-sample \(t\)-tests in R (Lecture Notes for Lecture 16)
- Perform a one-sample \(t\)-test from a variable in a data frame using t.test from mosaic.
- Perform a one-sample \(t\)-test from summary statistics using one.sample.t.test from MUsaic.
- Construct a confidence level \(c\) confidence interval from a variable in a data frame using t.test from mosaic.
- Construct a confidence level \(c\) confidence interval from summary statistics using one.sample.t.test from MUsaic.
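For reference, here is a sketch that uses the t.test function that ships with Base R (in the stats package) rather than the mosaic version, applied to a column of a data frame. The data are simulated and the column name score is hypothetical. The MUsaic function one.sample.t.test is not shown here, since its arguments are not described in this guide.

```r
set.seed(17)
# Simulated (hypothetical) data frame with one numeric variable
scores <- data.frame(score = rnorm(25, mean = 72, sd = 8))

# One-sample t-test of the null hypothesis mu = 70 against a two-sided
# alternative, with a 90% confidence interval reported alongside
res <- t.test(scores$score, mu = 70,
              alternative = "two.sided", conf.level = 0.90)

res$statistic  # observed t statistic
res$parameter  # degrees of freedom (n - 1)
res$p.value    # P-value
res$conf.int   # confidence interval for the population mean
```

Setting alternative to "less" or "greater" gives the left-sided and right-sided tests, respectively.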
Chapter 18
Comparing two population means
- Give examples of scientific questions that would warrant comparisons of two population means.
- Recognize the Greek letter \(\delta\), the Greek analog to the Roman letter \(d\), which is used to indicate \(\delta\text{ifferences}\) between population parameters.
- State a claim about two population means \(\mu_{X}\) and \(\mu_{Y}\) as an equality / inequality involving the difference \(\delta = \mu_{X} - \mu_{Y}\) between the population means.
Two-sample \(t\) procedures
- State the test statistic used in the two-sample \(t\)-test.
- Determine, qualitatively, whether an observed test statistic from the two-sample \(t\)-test provides evidence against a null hypothesis.
- State the sampling distribution of the test statistic used in the two-sample \(t\)-test, and the assumptions that must hold for that sampling distribution to be correct.
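A sketch of the two-sample computation from summary statistics is given below. It assumes the unpooled (Welch) form of the two-sample \(t\) statistic with the Welch-Satterthwaite estimate of the degrees of freedom; if the course uses a different degrees-of-freedom rule, only the last step changes. All numbers are hypothetical.

```r
# Hypothetical summary statistics for two independent samples
x_bar <- 10.2; s_x <- 2.1; n_x <- 12
y_bar <- 8.9;  s_y <- 3.4; n_y <- 15
delta_0 <- 0   # value of delta = mu_X - mu_Y claimed by the null hypothesis

# Estimated variances of the two sample means
v_x <- s_x^2 / n_x
v_y <- s_y^2 / n_y

# Observed two-sample t statistic (unpooled / Welch form)
t_obs <- ((x_bar - y_bar) - delta_0) / sqrt(v_x + v_y)

# Welch-Satterthwaite estimated degrees of freedom
df_welch <- (v_x + v_y)^2 / (v_x^2 / (n_x - 1) + v_y^2 / (n_y - 1))

t_obs
df_welch
```

The estimated degrees of freedom are generally not a whole number and always fall between the smaller of \(n_{X} - 1\) and \(n_{Y} - 1\) and \(n_{X} + n_{Y} - 2\).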
Two-sample \(t\)-tests in R (Lecture Notes for Lecture 17)
- Interpret the output of two.sample.t.test in MUsaic, including:
  - where \(t_{\text{obs}}\) is reported
  - where the estimated degrees of freedom for the \(t\)-distribution is reported
  - where the \(P\)-value for \(t_{\text{obs}}\) is reported
  - where the confidence interval is reported
- Use two.sample.t.test to perform a two-sample \(t\)-test in R.
- Use two.sample.t.test to construct a two-sided confidence interval for the population difference \(\delta\) in R.
- Interpret a two-sided confidence interval for a population difference \(\delta\) in the context of a given problem.
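The same four quantities can be located in the output of Base R's t.test, which performs the unpooled (Welch) two-sample \(t\)-test by default. This sketch uses simulated, hypothetical data and does not reproduce the output layout of two.sample.t.test from MUsaic.

```r
set.seed(18)
# Simulated (hypothetical) independent samples
x <- rnorm(12, mean = 10, sd = 2)
y <- rnorm(15, mean = 9,  sd = 3)

# Two-sample t-test of delta = mu_X - mu_Y = 0 (two-sided alternative),
# with a 95% two-sided confidence interval for delta
res <- t.test(x, y, mu = 0, conf.level = 0.95)

t_obs   <- unname(res$statistic)  # observed test statistic
df_hat  <- unname(res$parameter)  # estimated degrees of freedom
p_value <- res$p.value            # P-value for t_obs
ci      <- res$conf.int           # confidence interval for delta

res
```

If the confidence interval for \(\delta\) excludes 0, the data are inconsistent with equal population means at the matching significance level.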
Chapter 17
Matched pairs \(t\) procedures
- Define a matched pairs design.
- State examples of statistical questions where a matched pairs design would be appropriate.
- Describe the setup of a data set collected from a matched pairs design.
- State the sampling distribution of the average difference score from a matched pairs design when we assume the population of difference scores is Normally distributed.
- State the \(T\)-statistic used in a matched pairs design.
- State the sampling distribution of the \(T\)-statistic used in a matched pairs design when we assume the population of difference scores is Normally distributed.
- Perform a hypothesis test for a population difference from data fitting a matched pairs design using the appropriate matched pairs \(t\)-test.
- Construct a confidence interval for a population difference from data fitting a matched pairs design using the appropriate matched pairs \(t\)-test.
- Use one.sample.t.test to test a hypothesis or construct a confidence interval using summary data from a matched pairs design.
- Give reasons why we must carefully identify whether an independent samples \(t\)-test or matched pairs \(t\)-test is most appropriate for analyzing a data set.
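A matched pairs analysis reduces to a one-sample \(t\) procedure applied to the difference scores. The Base R sketch below uses made-up measurements (the variable names before and after are hypothetical) and compares the hand computation with t.test using paired = TRUE, and with the independent-samples test that would be inappropriate for these data.

```r
# Hypothetical paired measurements on the same 8 subjects
before <- c(12.1, 11.4, 13.0, 10.8, 12.6, 11.9, 12.3, 11.1)
after  <- c(11.5, 11.0, 12.2, 10.9, 11.8, 11.2, 12.0, 10.6)

d <- before - after  # difference scores, one per pair
n <- length(d)

# One-sample T-statistic on the differences (null hypothesis: mean difference 0)
t_obs <- (mean(d) - 0) / (sd(d) / sqrt(n))
p_value <- 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)

# The same test using Base R's paired option
paired <- t.test(before, after, paired = TRUE)

# Treating the samples as independent ignores the pairing and gives a
# different statistic and different degrees of freedom
independent <- t.test(before, after)

c(by_hand = t_obs,
  paired = unname(paired$statistic),
  independent = unname(independent$statistic))
```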
Chapter 11
Normal Quantile Plots
- Explain why we should check for the Normality of an underlying population before performing any of the \(t\)-tests or constructing any of the \(t\)-based confidence intervals we have developed so far in the class.
- Construct a density plot in R from a data set, and diagnose whether the density plot indicates any clear departures from Normality in the underlying population.
- Explain, loosely, what a Q-Q plot is showing, and what a “good” and “bad” Q-Q plot looks like.
- Construct a Q-Q plot in R from a data set, and diagnose whether the Q-Q plot indicates any clear departures from Normality in the underlying population.
- Given a Q-Q plot, identify whether the Q-Q plot points towards:
  - Normality
  - Left-skewness
  - Right-skewness
  - Heavy tails
- Identify which property of a population, overall, is most problematic to \(t\)-based inferential procedures.
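Density plots and Q-Q plots can both be drawn with Base R graphics. The sketch below uses simulated data (one Normal sample and one right-skewed sample, both hypothetical) so the two patterns can be compared side by side.

```r
set.seed(11)
x_normal <- rnorm(100)  # sample from a Normal population
x_skewed <- rexp(100)   # sample from a right-skewed population

op <- par(mfrow = c(2, 2))  # 2 x 2 grid of plots

# Density plots: look for asymmetry or multiple peaks
plot(density(x_normal), main = "Density: Normal sample")
plot(density(x_skewed), main = "Density: right-skewed sample")

# Q-Q plots: points close to the reference line suggest Normality
qq_normal <- qqnorm(x_normal, main = "Q-Q plot: Normal sample")
qqline(x_normal)
qq_skewed <- qqnorm(x_skewed, main = "Q-Q plot: right-skewed sample")
qqline(x_skewed)

par(op)  # restore the previous plotting layout
```

In the right-skewed sample, the points in the upper right of the Q-Q plot curve above the reference line, since the largest observations are larger than a Normal population would produce.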