Study Guide for Exam 1

Exam Format

This will be a closed-book, closed-notes exam. You will need a calculator. A calculator that performs arithmetic operations, reciprocals, square roots, powers and logarithms (base 10) is sufficient. Graphing calculators are permitted.

The exam will consist of 6 problems, 5 of which will be open-ended and 1 of which will be multiple choice. All open-ended problems will be multi-part.

Advice for Preparing for the Exam

This is a mathematics exam, so you will be expected to do mathematics. A common misconception among students is that, in preparing for a mathematics exam, it suffices to understand the solution to a given problem with the solution in front of you. The exam, however, tests your ability to solve the problem yourself from scratch. Thus, you should practice solving problems from scratch in preparation for the exam. The homework and quizzes are part of this practice, but completing them only once will likely not lead to a strong performance on the exam.

With the ultimate goal of solving problems yourself from scratch in mind, a possible study checklist is below:

  1. Make sure you can correctly answer all of the Anki cards I provided, and any additional Anki cards you made. If you are still unclear on the material on any of the Anki cards, see me.
  2. Exam questions will be modeled on in-class examples, homework problems, and quiz problems. As such, go through the homework and quiz problems, especially those you found difficult or that were marked as incorrect / incomplete, and make sure you can solve them from scratch, without the solution in front of you.
  3. Review the compilation of learning objectives below, and make sure you can perform each action below without your notes or book. If there is a particular learning objective you are unclear on, email or see me.
  4. Cramming (studying right before the exam) combined with massed practice (solving lots of problems all at once) is an ineffective study strategy. Instead, distribute your study sessions over the remaining time until the exam. Study for no more than 60 to 90 minutes at a time, and take breaks, whether legitimate breaks (go for a walk, hang out with friends, etc.) or a transition to studying for another course. The earlier you start studying before the exam, the better.

To do well on the exam, you should be able to do the following:

Chapter 1 and Triola & Triola Excerpt

Triola & Triola Section 1.1: Overview

  1. Describe what a statistician does using action verbs.
  2. Distinguish between a population and a sample from a population.
  3. Given a statistical query, determine the relevant population, and propose a useful sample from that population.

Triola & Triola Section 1.2: Types of Data

  1. Distinguish between a parameter of a population and a statistic of a sample.
  2. Specify the characteristics of the two types of data (quantitative / qualitative) and the types of quantitative data (discrete / continuous), and identify the type of a given measurement.

Triola & Triola Section 1.3: Design of Experiments (or Where Do Data Come From?)

  1. Distinguish between an observational study and an experimental study, and identify which category a study falls into given its description.
  2. Describe how an unobserved mechanism can cause confounding between two observed outcomes.
  3. Explain the slogan “association does not imply causation,” and relate the slogan to confounding.

Section 1.1: Populations and Samples

  1. Distinguish between descriptive and inferential statistics.

Section 1.2: Pictorial and Tabular Methods in Descriptive Statistics

  1. Construct a rug plot given a (small) data set. Note: rug plots are not covered in the text. See here for a reminder about rug plots.
  2. Construct a histogram by hand given a (small) data set.
  3. Identify the “five-number summary” that is used to construct a boxplot.
  4. Construct a boxplot given a five-number summary of a data set.
  5. Use rug plots, histograms, and boxplots to compare data sets from two or more groups.
  6. Construct rug plots, histograms, and boxplots using R.
  7. Construct a frequency, relative frequency, or density histogram given a (small) data set.
  8. Compare and contrast frequency, relative frequency, and density histograms in terms of how they represent the distribution of a data set on the real line.
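
For practice with the R-based objective in this section, here is a minimal sketch using base R graphics. The data values and bin edges are made up for illustration and are not from the course. Note also that `fivenum()` uses R's "hinge" convention for the quartiles, which may differ slightly from the convention used in class.

```r
# Hypothetical small data set (illustrative values, not from the course).
x <- c(2.1, 3.4, 3.9, 4.0, 4.4, 5.2, 5.8, 6.1, 7.3, 9.0)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum.
five <- fivenum(x)

# Frequency histogram: bar heights are counts per bin.
# The bin edges 2, 4, 6, 8, 10 are an arbitrary choice for this example.
edges <- seq(2, 10, by = 2)
h_freq <- hist(x, breaks = edges, main = "Frequency histogram")

# Rug plot: one tick per observation, drawn along the axis of the histogram.
rug(x)

# Relative frequencies: counts divided by the number of observations.
rel_freq <- h_freq$counts / length(x)

# Density histogram: bar heights are scaled so the bar areas sum to 1.
h_dens <- hist(x, breaks = edges, freq = FALSE, main = "Density histogram")

# Boxplot of the same data.
boxplot(x, horizontal = TRUE, main = "Boxplot")
```

Comparing `h_freq$counts`, `rel_freq`, and `h_dens$density` for the same bins is one way to see how the three kinds of histogram differ only in the scale of the vertical axis.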

General Note: Below, where it says ‘define the [statistic] of a data set,’ this means you should know the procedure for computing a given statistic, and be able to express that procedure as a mathematical expression. Generally, you will not be expected to compute sample means, medians, variances, etc., by hand, but I do expect you to know how to compute them by hand.

Section 1.3: Measures of Location

  1. Define the sample mean of a data set, and specify how we will denote the sample mean.
  2. Explain how the sample mean is related to the ‘center of mass’ of a data set.
  3. Define the sample median of a data set.
  4. Compare and contrast the sample mean and sample median in terms of what sense of ‘center’ of a data set they summarize.

Section 1.4: Measures of Variability

  1. Define the sample variance and sample standard deviation of a data set, and specify how we will denote the sample variance and sample standard deviation.
  2. Explain why the sample standard deviation is more interpretable than the sample variance.
  3. Define the mean absolute deviation of a data set.
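
As a reminder of what "express that procedure as a mathematical expression" looks like, here is a sketch of the usual expressions for a data set \(x_1, \ldots, x_n\). The symbols \(\bar{x}\), \(s^2\), and \(s\) are the common textbook notation and are assumed here; check them against your notes. The mean absolute deviation is written here as deviations about the sample mean with divisor \(n\), which is one common convention and is also an assumption.

```latex
\[
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
\[
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
\]
\[
\text{MAD} = \frac{1}{n} \sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert
\]
```

Since \(s\) is the square root of \(s^2\), it is measured in the same units as the data, while \(s^2\) is measured in squared units.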

R:

  1. Construct a vector x in R containing a list of numbers.
  2. Compute the mean, median, variance, and standard deviation of a data set using R.
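
A minimal sketch of both objectives, using made-up numbers. The last lines recompute the variance from its definition (assuming the usual \(n - 1\) divisor) so you can compare the by-hand procedure with R's built-in function.

```r
# Construct a vector x containing a (hypothetical) list of numbers.
x <- c(4, 8, 15, 16, 23, 42)

# Built-in summary statistics.
x_bar <- mean(x)    # sample mean
x_med <- median(x)  # sample median
s2    <- var(x)     # sample variance (divisor n - 1)
s     <- sd(x)      # sample standard deviation

# The same variance computed directly from the definition.
n <- length(x)
s2_by_hand <- sum((x - x_bar)^2) / (n - 1)

print(c(mean = x_bar, median = x_med, variance = s2, sd = s))
```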

Chapter 2

Section 2.1: Sample Spaces and Events

  1. Define an experiment, outcome, and sample space in the context of the mathematical model of an experiment.
  2. State the sample space of an experiment given a verbal description of the experiment.
  3. Compare and contrast discrete and continuous sample spaces.
  4. Define an event in the context of the mathematical model of an experiment.
  5. Describe what it means for an event to occur.
  6. State the meaning of set-union \(\cup\), set-intersect \(\cap\), and set-complement \('\) from set theory.
  7. Use a Venn diagram to reason about set operations.
  8. State the meaning of set-union \(\cup\), set-intersect \(\cap\), and set-complement \('\) when applied to events.
  9. State what it means for two sets / events to be disjoint / mutually exclusive.

Section 2.2: Axioms, Interpretations, and Properties of Probability

  1. Define a set function and give examples from earlier courses in mathematics.
  2. State the three axioms that a set function must satisfy to be a probability set function.
  3. Compute probabilities of events defined using set operations using the results derived from the three axioms of probability theory.
  4. Define a simple event and use simple events to compute the probability of a non-simple event.

Section 2.4: Conditional Probability

  1. Define the conditional probability of an event \(A\) given an event \(B\): \(P(A \mid B)\).
  2. Identify conditional probabilities using keywords such as “if” and “given.”
  3. Explain how a conditional probability is a regular probability with a reduced sample space.
  4. State and use the product rule to factor probabilities of events constructed as intersections (“ANDs”) of other events.
  5. Define a partition of a set.
  6. Define a partition of a sample space.
  7. State the Law of Total Probability, and use the law to solve problems given the relevant information.
  8. State Bayes’ Theorem, and use Bayes’ Theorem to solve problems given the relevant information.
  9. Construct a tree diagram as a visual representation of the product rule for outcomes that occur in stages, and use tree diagrams to reason about the probabilities of sequences of events.
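
The main results of this section can be summarized as follows, in their standard textbook form. Here \(A_1, \ldots, A_k\) is assumed to be a partition of the sample space with \(P(A_i) > 0\) for each \(i\), and \(P(B) > 0\). The indexing is illustrative and may differ from the notation used in class.

```latex
% Conditional probability
\[
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
\]

% Product rule
\[
P(A \cap B) = P(A \mid B)\, P(B)
\]

% Law of Total Probability
\[
P(B) = \sum_{i=1}^{k} P(B \mid A_i)\, P(A_i)
\]

% Bayes' Theorem
\[
P(A_j \mid B) = \frac{P(B \mid A_j)\, P(A_j)}{\sum_{i=1}^{k} P(B \mid A_i)\, P(A_i)}
\]
```

The denominator in Bayes' Theorem is the Law of Total Probability applied to \(P(B)\), and the numerator is the product rule applied to \(P(A_j \cap B)\).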

Chapter 3

Section 3.1: Random Variables

  1. Give the definition of a random variable as a function from a sample space to the real numbers.
  2. Properly use the convention of denoting a random variable by an uppercase Roman letter (e.g. \(X, Y, Z\)), a particular value that random variable can take by the corresponding lowercase Roman letter (e.g. \(x, y, z\)), and its range by the corresponding calligraphic Roman letter (e.g. \(\mathcal{X}, \mathcal{Y}, \mathcal{Z}\)).
  3. Distinguish between a discrete random variable and a continuous random variable.
  4. Explain in what sense a random variable is a “number that could have been otherwise.”
  5. Given a sample space and a description of a random variable, determine the range of the random variable, and determine which outcomes in the sample space map to each value in the random variable’s range.

Section 3.2: Probability Distributions for Discrete Random Variables

  1. Define the probability mass function \(p(x) = P(X = x)\) for a random variable \(X\) in terms of the underlying sample space \(\mathcal{S}\).
  2. State the two characteristics a function \(p\) must have to be a valid probability mass function, and identify when a given function \(p\) does or does not satisfy these conditions.
  3. Use a probability mass function \(p\) to compute the probability \(P(X \in Q)\) that \(X\) falls in some query set \(Q \subseteq \mathbb{R}\).
  4. Use the “Probability Dictionary” to map from statements such as “at least \(x\),” “at most \(x\),” etc., to the corresponding query set \(Q\).
  5. Define the cumulative distribution function \(F(x) = P(X \leq x)\) for a random variable \(X\) in terms of the underlying sample space \(\mathcal{S}\).
  6. State the four properties a function \(F\) must have to be a valid cumulative distribution function.
  7. Define what it means for a function \(f\) to be continuous and right-continuous.
  8. Sketch the graph of a cumulative distribution function \(F\) for a discrete random variable \(X\) given either its \(p\) or \(F\).
  9. Compute \(F\) from \(p\), and vice versa.
  10. Compute the probability \(P(a < X \leq b)\) of a standard query using \(F\).
  11. Compute the probability \(P(a \square X \square b)\) of a non-standard query using \(F\), where each \(\square\) is either \(<\) or \(\leq\).
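
To practice moving between \(p\) and \(F\), here is a minimal sketch in R. The random variable (the number of heads in two flips of a fair coin) and the helper names `support`, `cdf`, and `cdf_values` are made up for illustration and are not from the course.

```r
# Hypothetical example: X = number of heads in two flips of a fair coin.
support <- c(0, 1, 2)           # range of X
p       <- c(0.25, 0.50, 0.25)  # p(x) = P(X = x) for each x in the range

# Check the two conditions for a valid probability mass function.
is_nonnegative <- all(p >= 0)
sums_to_one    <- isTRUE(all.equal(sum(p), 1))

# Compute F from p: F(x) = P(X <= x) adds up p over all values <= x.
cdf <- function(x) sum(p[support <= x])
cdf_values <- cumsum(p)  # F evaluated at 0, 1, 2

# Compute p from F: each jump of F equals the pmf at that value.
p_from_F <- diff(c(0, cdf_values))

# Standard query: P(0 < X <= 2) = F(2) - F(0).
prob_standard <- cdf(2) - cdf(0)

# Non-standard query: P(1 <= X <= 2) = F(2) - F(1) + P(X = 1),
# because the left endpoint 1 is now included.
prob_nonstandard <- cdf(2) - cdf(1) + p[support == 1]

# Cross-check the non-standard query directly from the pmf.
prob_direct <- sum(p[support >= 1 & support <= 2])
```

Calling `plot(stepfun(support, c(0, cdf_values)))` is one way to sketch the resulting step-function graph of \(F\).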