Study Guide for Exam 3

This will be a closed-book exam. You will be allowed to use R on a school computer for computations. You will not have access to, nor will you need, anything beyond Base R for the exam (that is, no add-on packages such as mosaic or MUsaic).

To do well on the exam, you should be able to do the following:

Chapter 15

How confidence intervals behave

Distinguish between an interval estimator and an interval estimate (aka “confidence interval”).
Explain what the probability \(c\) refers to for a confidence interval with confidence level \(c\).
Identify how the width of a confidence interval varies as \(n\), \(c\), and \(s_{X}\) vary.
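
One way to explore how the width responds to \(n\), \(c\), and \(s_{X}\) is to compute it directly. The sketch below assumes a two-sided \(t\)-based interval, whose width is twice the critical value times \(s_{X} / \sqrt{n}\); the helper name ci_width and all of the numbers are made up for illustration.

```r
# Width of a two-sided t-based confidence interval: 2 * t* * s / sqrt(n).
# ci_width is a hypothetical helper written only for this sketch.
ci_width <- function(n, conf_level, s) {
  alpha  <- 1 - conf_level
  t_star <- qt(1 - alpha / 2, df = n - 1)  # two-sided critical value
  2 * t_star * s / sqrt(n)
}

# Change one ingredient at a time relative to a baseline (made-up values).
baseline <- ci_width(n = 25,  conf_level = 0.95, s = 10)
larger_n <- ci_width(n = 100, conf_level = 0.95, s = 10)  # more data
higher_c <- ci_width(n = 25,  conf_level = 0.99, s = 10)  # more confidence
larger_s <- ci_width(n = 25,  conf_level = 0.95, s = 20)  # more spread

c(baseline = baseline, larger_n = larger_n,
  higher_c = higher_c, larger_s = larger_s)
```

Comparing each entry to the baseline shows the direction of each effect: the interval narrows as \(n\) grows, and widens as \(c\) or \(s_{X}\) grows.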

Chapter 17

The \(t\) distributions

State the sampling distribution of a \(Z\)-score for a sample mean from a Normal population.
State the sampling distribution of a \(T\)-score for a sample mean from a Normal population.
Identify the degrees of freedom for the \(t\) distribution resulting from a \(T\)-score for a sample mean.
Compare and contrast the density curve for a \(t\)-distributed random variable to the density curve of a standard Normal random variable.
Identify how the shape of the density curve of a \(t\)-distributed random variable changes as its degrees of freedom increase.
Compute a critical value \(t_{\alpha, n-1}\) for a \(t\)-distribution with \(n - 1\) degrees of freedom using qt in R.
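
A minimal sketch of the qt computation follows. It assumes that \(t_{\alpha, n-1}\) denotes the value with area \(\alpha\) to its right under the \(t\) density; if your notes define it with area \(\alpha\) to the left, use qt(alpha, df) instead. The values of n and alpha are made up.

```r
n     <- 12    # hypothetical sample size
alpha <- 0.05  # hypothetical tail area

# Assumption: t_{alpha, n-1} has area alpha to its RIGHT.
t_upper     <- qt(1 - alpha, df = n - 1)
t_upper_alt <- qt(alpha, df = n - 1, lower.tail = FALSE)  # same value

# As the degrees of freedom increase, t critical values shrink toward
# the corresponding standard Normal critical value.
crit_by_df <- sapply(c(2, 5, 30, 1000), function(df) qt(0.975, df = df))
z_crit     <- qnorm(0.975)

t_upper
crit_by_df
z_crit
```

The second half also illustrates the previous two objectives: the \(t\) density has heavier tails than the standard Normal density, and it approaches the standard Normal density as the degrees of freedom increase.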

The one-sample \(t\) confidence interval

Compute a \(t\)-based confidence interval given relevant information about a random sample from a Normal population.
Sketch how the \(t\)-based confidence interval for a population mean is related to the sample mean, critical value, and estimate of the standard error of the sample mean.
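
The interval has the form sample mean plus or minus critical value times estimated standard error. A sketch in Base R from summary statistics, with every number made up for illustration:

```r
# Hypothetical summary statistics from a random sample.
x_bar      <- 52.3  # sample mean
s          <- 8.1   # sample standard deviation
n          <- 20    # sample size
conf_level <- 0.95  # confidence level c

alpha  <- 1 - conf_level
t_star <- qt(1 - alpha / 2, df = n - 1)  # critical value
se_hat <- s / sqrt(n)                    # estimated standard error of the mean
margin <- t_star * se_hat                # margin of error

ci <- c(lower = x_bar - margin, upper = x_bar + margin)
ci
```

The interval is centered at the sample mean, and its half-width is the product of the critical value and the estimated standard error.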

Chapter 14

Hypothesis testing

Identify what types of quantities a statistical hypothesis always addresses.
Identify the population parameter from a statistical claim.
Given a claim about a population parameter \(\theta\) as a “word problem,” write the claim as \(\theta \, \, \square \, \, \theta_{0}\) where the “box” \(\square\) contains one of \(=, \neq, \leq, \geq, <,\) or \(>\) and \(\theta_{0}\) is the claimed value of the population parameter.
Define null hypothesis and alternative hypothesis.
Given a claim in the form \(\theta \, \, \square \, \, \theta_{0}\), identify whether the claim is a null hypothesis or an alternative hypothesis.
Given a claim in the form \(\theta \, \, \square \, \, \theta_{0}\), identify the associated null and alternative hypotheses for the claim.
Recognize the symbols \(H_{0}\) (“H-naught”) and \(H_{a}\) (“H-a”) as denoting the null and alternative hypotheses, respectively.
Explain why it never (no really, never) makes sense to frame a statistical hypothesis in terms of a sample statistic.

Tests for a population mean

Given a claim about a population mean \(\mu_{X}\) as a “word problem,” write the claim as \(\mu_{X} \, \, \square \, \, \mu_{0}\) where the “box” \(\square\) contains one of \(=, \neq, \leq, \geq, <,\) or \(>\) and \(\mu_{0}\) is the claimed value of the population mean.
Given a claim about a population mean \(\mu_{X}\) as a “word problem,” determine the corresponding null and alternative hypotheses.
State a reasonable test statistic for testing a claim about a population mean.

Chapter 14

The one-sample \(t\) test

Explain why a \(T\)-score can be used to test a claim about a population mean.
Compute the observed \(T\)-score, denoted \(t_{\text{obs}}\), from a “word problem” about a population mean.
Given an observed sample mean, sample standard deviation, and sample size, and a pair of null / alternative hypotheses, determine whether the observed sample mean provides evidence against the null hypothesis (equivalently, for the alternative hypothesis).
State the sampling distribution of the \(T\)-score under the null hypothesis when the population distribution is Normal (or approximately so).
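
As a sketch, the observed \(T\)-score is the distance between the sample mean and the claimed mean \(\mu_{0}\), measured in estimated standard errors. The numbers below are made up, and the alternative \(\mu_{X} > \mu_{0}\) is assumed only for the sake of the example.

```r
# Hypothetical "word problem" values.
x_bar <- 52.3  # observed sample mean
s     <- 8.1   # observed sample standard deviation
n     <- 20    # sample size
mu_0  <- 50    # value of the population mean claimed under H0

# Observed T-score: (sample mean - claimed mean) / estimated standard error.
t_obs <- (x_bar - mu_0) / (s / sqrt(n))
t_obs

# Under H0 (and a Normal population), T follows a t distribution
# with n - 1 degrees of freedom.
df <- n - 1
```

With the assumed alternative \(\mu_{X} > \mu_{0}\), larger positive values of t_obs count as stronger evidence against the null hypothesis; a negative t_obs would provide none.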

Chapter 14

\(P\)-value and statistical significance

State how a \(P\)-value is defined for an observed test statistic and a given null hypothesis.
State the rejection rule for using the \(P\)-value of an observed test statistic to reject (or not) the null hypothesis at a given significance level \(\alpha\).

Chapter 15

How hypothesis tests behave

Define the two types of errors one can make in a hypothesis test, and state their names.
Relate the two types of errors that one can make in a hypothesis test to decisions in legal trials and during warfare.
Define the two error rates of a hypothesis test.
Recognize \(\alpha\) as indicating the Type I Error Rate of a hypothesis test.
Recognize the synonym of “significance level” for the Type I Error Rate \(\alpha\).
State which of the two error rates is explicitly controlled for in constructing a hypothesis test.
Define test statistic, and explain the distinction between a test statistic and an observed test statistic.
Define the rejection region of a hypothesis test, and state how the rejection region is used to make a conclusion about a null hypothesis.
State the 7-step procedure for testing a claim about a population using a hypothesis test that controls the Type I Error Rate.
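
To see what "controlling the Type I Error Rate" means in practice, a small simulation can help. The sketch below repeatedly draws samples from a Normal population for which the null hypothesis is true and records how often a two-sided one-sample \(t\)-test rejects at level \(\alpha\). The sample size, number of repetitions, and seed are arbitrary choices.

```r
set.seed(1)    # arbitrary seed, for reproducibility
alpha <- 0.05  # significance level (Type I Error Rate)
n     <- 15    # hypothetical sample size
mu_0  <- 0     # H0: mu = mu_0, true by construction below
reps  <- 10000 # number of simulated samples

reject <- replicate(reps, {
  x <- rnorm(n, mean = mu_0, sd = 1)    # sample from a population where H0 holds
  t.test(x, mu = mu_0)$p.value < alpha  # TRUE when the test wrongly rejects H0
})

# Proportion of simulated tests that made a Type I error.
type1_rate <- mean(reject)
type1_rate
```

The proportion of false rejections should land close to \(\alpha\), which is the sense in which the test is built to control that error rate.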

Chapter 17

The one-sample \(t\) test

State the test statistic that is used for a one-sample \(t\)-test, and give its sampling distribution when the underlying population is Normal and the null hypothesis is true.
State the left-sided, right-sided, and two-sided rejection regions for the one-sample \(t\) test.
Identify when a left-sided, right-sided, or two-sided rejection region is appropriate to test a given null hypothesis.
Construct a left-sided, right-sided, or two-sided rejection region given a claim about a population and a significance level \(\alpha\).
Determine whether an observed test statistic “falls in” a rejection region.
Determine the appropriate tail of the \(t\)-distribution to use to compute a \(P\)-value for a given null hypothesis.
Compute the \(P\)-value for an observed \(t\)-statistic, and use the \(P\)-value to conclude whether or not to reject a given null hypothesis.
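
The rejection regions and \(P\)-values above can be sketched in Base R with qt and pt. The observed statistic and sample size are made up, and the code assumes the rule "reject when the \(P\)-value is at most \(\alpha\)"; your notes may state the boundary case differently.

```r
t_obs <- 1.27  # hypothetical observed t-statistic
n     <- 20    # hypothetical sample size
alpha <- 0.05  # significance level
df    <- n - 1

# Left-sided test (Ha: mu < mu_0): reject when t_obs <= crit_left.
crit_left <- qt(alpha, df)
p_left    <- pt(t_obs, df)

# Right-sided test (Ha: mu > mu_0): reject when t_obs >= crit_right.
crit_right <- qt(1 - alpha, df)
p_right    <- pt(t_obs, df, lower.tail = FALSE)

# Two-sided test (Ha: mu != mu_0): reject when |t_obs| >= crit_two.
crit_two <- qt(1 - alpha / 2, df)
p_two    <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)

# The rejection-region decision and the P-value decision agree.
reject_by_region <- abs(t_obs) >= crit_two
reject_by_p      <- p_two <= alpha
c(reject_by_region = reject_by_region, reject_by_p = reject_by_p)
```

The tail used for the \(P\)-value matches the direction of the alternative hypothesis, and the two-sided \(P\)-value doubles the smaller of the two one-sided tail areas.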

Chapter 17

Testing a Two-sided Hypothesis Using a Confidence Interval (Lecture Notes for Lecture 16)

State the procedure for testing a null hypothesis \(\mu_{X} = \mu_{0}\) at significance level \(\alpha\) using a confidence interval.
State the rationale for why we can test a null hypothesis \(\mu_{X} = \mu_{0}\) at significance level \(\alpha\) using a confidence interval with confidence level \(c = 1 - \alpha\).
Perform a hypothesis test for the null hypothesis \(\mu_{X} = \mu_{0}\) given all of the relevant information.
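
A sketch of the confidence-interval approach, using made-up summary statistics: build the two-sided interval with confidence level \(c = 1 - \alpha\) and reject \(\mu_{X} = \mu_{0}\) exactly when \(\mu_{0}\) falls outside it. The cross-check against the two-sided \(P\)-value is included only to illustrate that the two approaches agree.

```r
# Hypothetical summary statistics and claim.
x_bar <- 52.3
s     <- 8.1
n     <- 20
mu_0  <- 50    # H0: mu = mu_0
alpha <- 0.05  # significance level, so c = 1 - alpha = 0.95

# Two-sided confidence interval with confidence level 1 - alpha.
t_star <- qt(1 - alpha / 2, df = n - 1)
margin <- t_star * s / sqrt(n)
ci     <- c(x_bar - margin, x_bar + margin)

# Reject H0 when mu_0 lies outside the interval.
reject_h0 <- mu_0 < ci[1] || mu_0 > ci[2]

# Cross-check: the two-sided P-value leads to the same decision.
t_obs <- (x_bar - mu_0) / (s / sqrt(n))
p_two <- 2 * pt(abs(t_obs), df = n - 1, lower.tail = FALSE)

c(reject_by_ci = reject_h0, reject_by_p = p_two < alpha)
```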

One-sample \(t\)-tests in R (Lecture Notes for Lecture 16)

Perform a one-sample \(t\)-test from a variable in a data frame using t.test from mosaic.
Perform a one-sample \(t\)-test from summary statistics using one.sample.t.test from MUsaic.
Construct a confidence level \(c\) confidence interval from a variable in a data frame using t.test from mosaic.
Construct a confidence level \(c\) confidence interval from summary statistics using one.sample.t.test from MUsaic.
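
Since only Base R is available on the exam, it may help to see the same analysis with the t.test function that ships with R in the stats package. This is a sketch with a made-up data frame; the names exam_data and score are hypothetical, and the arguments of the mosaic and MUsaic functions may differ from what is shown here.

```r
# Hypothetical data frame with one numeric variable.
exam_data <- data.frame(
  score = c(48.2, 55.1, 51.7, 49.9, 60.3, 53.4, 47.8, 56.0)
)

# One-sample t-test of H0: mu = 50 against a two-sided alternative,
# together with a 95% confidence interval for the population mean.
result <- t.test(exam_data$score, mu = 50,
                 alternative = "two.sided", conf.level = 0.95)

result$statistic  # observed t-statistic
result$parameter  # degrees of freedom (n - 1)
result$p.value    # P-value for the observed t-statistic
result$conf.int   # confidence interval for the population mean
```

Changing alternative to "less" or "greater" gives the left-sided or right-sided test, and changing conf.level gives an interval with a different confidence level \(c\).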

Chapter 18

Comparing two population means

Give examples of scientific questions that would warrant comparisons of two population means.
Recognize the Greek letter \(\delta\), the Greek analog to the Roman letter \(d\), which is used to indicate \(\delta\text{ifferences}\) between population parameters.
State a claim about two population means \(\mu_{X}\) and \(\mu_{Y}\) as an equality / inequality involving the difference \(\delta = \mu_{X} - \mu_{Y}\) between the population means.

Two-sample \(t\) procedures

State the test statistic used in the two-sample \(t\)-test.
Determine, qualitatively, whether an observed test statistic from the two-sample \(t\)-test provides evidence against a null hypothesis.
State the sampling distribution of the test statistic used in the two-sample \(t\)-test, and the assumptions that must hold for that sampling distribution to be correct.
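
A sketch of the two-sample statistic from summary statistics. It assumes the unpooled (Welch) version of the test, in which the degrees of freedom are estimated with the Welch-Satterthwaite formula; if the course uses a pooled-variance version, the standard error and degrees of freedom differ. All numbers are made up.

```r
# Hypothetical summary statistics for two independent samples.
x_bar <- 24.1; s_x <- 3.2; n_x <- 15
y_bar <- 21.5; s_y <- 4.0; n_y <- 12
delta_0 <- 0   # H0: delta = mu_X - mu_Y = delta_0

# Estimated standard error of the difference in sample means (unpooled).
v_x    <- s_x^2 / n_x
v_y    <- s_y^2 / n_y
se_hat <- sqrt(v_x + v_y)

# Observed two-sample t-statistic.
t_obs <- ((x_bar - y_bar) - delta_0) / se_hat

# Assumption: Welch-Satterthwaite estimate of the degrees of freedom.
df_welch <- (v_x + v_y)^2 / (v_x^2 / (n_x - 1) + v_y^2 / (n_y - 1))

c(t_obs = t_obs, df = df_welch)
```

The estimated degrees of freedom are generally not a whole number, and they fall between the smaller of \(n_{X} - 1\) and \(n_{Y} - 1\) and the total \(n_{X} + n_{Y} - 2\).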

Two-sample \(t\)-tests in R (Lecture Notes for Lecture 17)

Interpret the output of two.sample.t.test in MUsaic, including:
- where \(t_{\text{obs}}\) is reported
- where the estimated degrees of freedom for the \(t\)-distribution is reported
- where the \(P\)-value for \(t_{\text{obs}}\) is reported
- where the confidence interval is reported
Use two.sample.t.test to perform a two-sample \(t\)-test in R.
Use two.sample.t.test to construct a two-sided confidence interval for the population difference \(\delta\) in R.
Interpret a two-sided confidence interval for a population difference \(\delta\) in the context of a given problem.
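
For practice in Base R, the t.test function from the stats package reports the same four pieces of output. This sketch uses made-up data and is not the MUsaic function, whose arguments and output layout may differ. By default, Base R's t.test does not assume equal variances, so the degrees of freedom are estimated.

```r
# Hypothetical independent samples from two populations.
x <- c(24.1, 27.3, 22.8, 25.9, 26.4, 23.7)
y <- c(21.5, 20.2, 23.9, 19.8, 22.4, 21.1, 20.7)

# Two-sample t-test of H0: delta = mu_X - mu_Y = 0, with a two-sided
# 95% confidence interval for delta.
result <- t.test(x, y, mu = 0,
                 alternative = "two.sided", conf.level = 0.95)

result$statistic  # observed t-statistic
result$parameter  # estimated degrees of freedom
result$p.value    # P-value for the observed t-statistic
result$conf.int   # confidence interval for delta = mu_X - mu_Y
```

If the interval lies entirely above zero, the data point toward \(\mu_{X}\) being larger than \(\mu_{Y}\); if it contains zero, a difference of zero remains plausible at the chosen confidence level.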

Chapter 17

Matched pairs \(t\) procedures

Define a matched pairs design.
State examples of statistical questions where a matched pairs design would be appropriate.
Describe the setup of a data set collected from a matched pairs design.
State the sampling distribution of the average difference score from a matched pairs design when we assume the population of difference scores is Normally distributed.
State the \(T\)-statistic used in a matched pairs design.
State the sampling distribution of the \(T\)-statistic used in a matched pairs design when we assume the population of difference scores is Normally distributed.
Perform a hypothesis test for a population difference from data fitting a matched pairs design using the appropriate matched pairs \(t\)-test.
Construct a confidence interval for a population difference from data fitting a matched pairs design using the appropriate matched pairs \(t\) procedure.
Use one.sample.t.test to test a hypothesis or construct a confidence interval using summary data from a matched pairs design.
Give reasons why we must carefully identify whether an independent samples \(t\)-test or matched pairs \(t\)-test is most appropriate for analyzing a data set.
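
A matched pairs analysis is a one-sample \(t\) procedure applied to the difference scores. The sketch below uses made-up before/after measurements on the same eight subjects and shows that Base R's paired t.test matches a one-sample test on the differences.

```r
# Hypothetical paired measurements (same subjects, in the same order).
before <- c(72, 65, 80, 58, 77, 69, 74, 61)
after  <- c(75, 70, 79, 64, 82, 70, 78, 66)

# Difference scores and the matched pairs T-statistic for H0: delta = 0.
d     <- after - before
n     <- length(d)
t_obs <- mean(d) / (sd(d) / sqrt(n))

# Two equivalent ways to run the test and get the confidence interval.
paired     <- t.test(after, before, paired = TRUE)
one_sample <- t.test(d, mu = 0)

paired$statistic  # equals t_obs
paired$parameter  # degrees of freedom: number of pairs minus 1
paired$p.value
paired$conf.int   # confidence interval for the population mean difference
```

Running an independent-samples test on the same data would ignore the pairing and use a different standard error and degrees of freedom, which is one reason the design must be identified first.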

Chapter 11

Normal Quantile Plots

Explain why we should check for the Normality of an underlying population before performing any of the \(t\)-tests or constructing any of the \(t\)-based confidence intervals we have developed so far in the class.
Construct a density plot in R from a data set, and diagnose whether the density plot indicates any clear departures from Normality in the underlying population.
Explain, loosely, what a Q-Q plot is showing, and what a “good” and “bad” Q-Q plot looks like.
Construct a Q-Q plot in R from a data set, and diagnose whether the Q-Q plot indicates any clear departures from Normality in the underlying population.
Given a Q-Q plot, identify whether the Q-Q plot points towards:
- Normality
- Left-skewness
- Right-skewness
- Heavy tails
Identify which property of a population, overall, is most problematic for \(t\)-based inferential procedures.
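
Both diagnostic plots can be drawn in Base R. This sketch uses a simulated sample from an Exponential distribution, chosen only because it is clearly non-Normal; with real data, replace x with the variable of interest. Note that qqnorm puts the theoretical Normal quantiles on the horizontal axis, which may or may not match the orientation used in lecture.

```r
set.seed(42)             # arbitrary seed, for reproducibility
x <- rexp(40, rate = 1)  # hypothetical sample from a non-Normal population

# Density plot: look for asymmetry, multiple peaks, or long tails.
plot(density(x), main = "Density plot of x")

# Normal Q-Q plot: sample quantiles against theoretical Normal quantiles.
qq <- qqnorm(x, main = "Normal Q-Q plot of x")
qqline(x)  # reference line through the first and third quartiles
```

Points that hug the reference line are consistent with a Normal population; systematic curvature away from the line, or points peeling off at the ends, signals a departure from Normality.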