Here's the official description:

To provide an axiomatic, calculus-based approach to probability and introductory statistics. The course is built around the process of performing a statistical analysis: posing the question, collecting the data, describing the data, analyzing and modeling the data, and drawing inferences from the data regarding the original question. Specific topics covered include sampling, descriptive analysis of data, probability, random variables, discrete and continuous distributions, expectation, confidence intervals, one-sample hypothesis testing, chi-square analyses, correlation and regression.

This course covers the process of statistical analysis from beginning to end. That process, in broad strokes, is as follows: we pose a scientific question, determine what experiments or observations might provide data towards answering that question, develop approaches to collecting that data, summarize the resulting data, and derive inferences relevant to the original scientific question. Along the way, we will learn about sampling, descriptive statistics, probability, probability models, inferential statistics, confidence intervals, hypothesis tests, and regression. Special emphasis will be given to common pitfalls in statistical analysis that you are likely to see 'in the wild', including misinterpretations and misunderstandings of statistical procedures, the conflation of association and causation, and the reproducibility crisis in psychology, medicine, and nutrition.

Prerequisite: MA 116, MA 118, or MA 126, passed with a grade of C- or higher.

Dr. David Darmon | ddarmon [at] monmouth.edu | Howard Hall 241

This is currently a *tentative* listing of topics, in order.

- *Introduction:* What is statistics? Where do data come from? Experimental versus observational studies. Types of data.
- *Descriptive statistics as summaries of data:* Summaries of the entire data distribution: rug plot, dot plot, histogram, box plot. Measures of center: mean, median, mode. Measures of variation: range, standard deviation, mean absolute deviation.
- *Probability:* The origin of probability in games of chance. The interpretation of probability. Computing probabilities involving two or more events using the addition and multiplication rules. Conditional probability and independence.
- *Discrete random variables:* Random variables as 'numbers that could have been otherwise.' Probability mass functions. Parametric models for discrete random variables: uniform, Bernoulli, binomial, geometric, and Poisson. Moments of discrete random variables.
- *Continuous random variables:* Summarizing quantitative data with histograms. From histograms to probability density functions. Parametric models for continuous random variables: uniform, exponential, normal. Moments of continuous random variables. The normal distribution and its properties. Using the normal distribution to approximate the binomial distribution.
- *Statistics and their sampling distributions:* From forward probability to inverse probability. Statistics as functions of a data set. The sampling distribution of the sample mean. The Law of Large Numbers and the Central Limit Theorem.
- *Estimators and point estimation:* Making an educated guess at a population parameter using a point estimator. Desirable properties of point estimators. Methods for deriving point estimators: method of moments and maximum likelihood estimation.
- *Confidence intervals:* Confidence intervals as interval estimators. The interpretation of confidence intervals. Confidence intervals for population means and proportions.
- *Hypothesis tests:* Hypothesis tests for population means and proportions. Components of a hypothesis test. Types of error in hypothesis testing. Scientific hypotheses and statistical hypotheses. Statistical significance and practical significance. Using confidence intervals for hypothesis tests. Using confidence intervals to distinguish between practical and statistical significance. *P*-values.
- *Everything but the kitchen sink:* Two-sample tests for population means and their associated confidence intervals. Two-sample tests for population proportions and their associated confidence intervals.
- *Correlation and regression:* Exploratory data analysis for two or more quantitative variables. Pearson's correlation coefficient. Confidence interval and hypothesis test for the Pearson correlation between two quantitative variables. Association does not imply causation. Simple linear regression. Interpretation of regression coefficients. Multiple linear regression. Predictive and causative statements.

I will have office hours at the following four times each week:

- Monday, 10:00–11:00 AM, Howard Hall 241
- Tuesday, 3:00–4:00 PM, Howard Hall 241
- Thursday, 10:00–11:00 AM, Howard Hall 241
- Thursday, 1:30–2:30 PM, Howard Hall 241

I have an open-door policy during those times: you can show up unannounced. If you cannot make the scheduled office hours, please e-mail me about making an appointment.

If you are struggling with the homework, having difficulty with the quizzes, or just want to chat, please visit me during my office hours. I am here to help.

- 45% for 2 in-class exams (22.5% each)
- 25% for a cumulative final exam
- 15% for homework problem sets
- 5% for pre-class preparation
- 5% for quizzes
- 5% for class participation

- \([90, 100] \to \text{A}\)
- \([80, 90) \,\,\, \to \text{B}\)
- \([70, 80) \,\,\, \to \text{C}\)
- \([60, 70) \,\,\, \to \text{D}\)
- \([0, 60) \,\,\,\,\,\, \to \text{F}\)
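
As a worked illustration of how the weights and cutoffs above combine, here is a sketch for a hypothetical student. The component scores are invented for illustration, and the calculation assumes that each component is recorded as a percentage and that the course score is the weighted sum of those percentages. Suppose the student averages 82 on the two in-class exams, earns 78 on the final exam, 90 on homework, 100 on pre-class preparation, 100 on quizzes, and 95 on class participation:

\[
\begin{aligned}
\text{score} &= 0.45(82) + 0.25(78) + 0.15(90) + 0.05(100) + 0.05(100) + 0.05(95) \\
&= 36.9 + 19.5 + 13.5 + 5 + 5 + 4.75 \\
&= 84.65
\end{aligned}
\]

Under those assumptions, 84.65 falls in \([80, 90)\), which corresponds to a B.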

One of the most important skills needed to master mathematics is memory. To do mathematics, you need to make connections between concepts stored in your long-term memory, and before you can do that, you need to store those memories in the first place. One of the best methods for strengthening long-term memory is **retrieval practice** (think flashcards) combined with **spaced repetition** (think reviewing flashcards on an intelligent schedule). This is the exact opposite of how many students study: browsing notes (and thus skipping the step of retrieving the information from their own memories) immediately before the information is needed (i.e., 'cramming'). Unfortunately, cramming is one of the worst ways to commit information to long-term memory, even though it *feels* effective in the short term. Retrieval practice with spaced repetition is more effective than the browse-and-cram approach, takes less time, and is more enjoyable!

As part of pre-class preparation, you are required to use Anki regularly and to submit your Anki decks via eCampus. See below for details on Anki.

See here for the instructions on submitting your Anki decks via eCampus.

There will be a 10-minute quiz at the beginning of every class. The quizzes will be cumulative, covering material from the first lecture through the most recent lecture. These quizzes are meant to be diagnostic: they point you to gaps in your understanding of the material so that you can course-correct before exams. As long as you are on time for a quiz and make a good-faith attempt to complete it, you will receive full credit for that quiz.

Up to 4 missed quizzes will be dropped. As such, there will be no make-up quizzes.

The **required** textbook is:

- Jay L. Devore and Kenneth N. Berk. *Modern mathematical statistics with applications*, 1st Edition (Cengage Learning, 2007). Link to University Store

We will use R, a programming language for statistical computing, throughout the semester for in-class activities and homework assignments. I will cover the relevant features of R throughout the course.

R will be installed on all of the classroom computers. You should also install R on your personal computer, if you have one. You can install R by following the instructions for Windows here, for macOS here, or for Linux here. You will also want to install RStudio, an integrated development environment (IDE) for R, which you can find here.

We will use R as a scripting language and statistical calculator, and thus will not get into the nitty-gritty of programming in R. The R Tutorial by Kelly Black is a good reference for the basics of using R. I will demonstrate R's functionality in class and handouts as we need it.

We will use Anki for spaced retrieval practice throughout the semester. Anki is open-source, free (as in both *gratis* and *libre*) software. You can download Anki to your personal computer from this link. If you have ever used flashcards, then Anki should be fairly intuitive. If you would like more details you can find Anki's User Manual here.

**Note:** Anki has both desktop and mobile phone variants. Please use the desktop variant.

- Prior to January 23, Lecture 0:
  **Topics:** Spaced retrieval practice. Pre-class assessment.
  - Pre-class Assignments
- January 23, Lecture 1:
  **Topics:** Introduction to class. What is statistics? Where do data come from? Types of data. ~~Sampling techniques. Collecting data. Bias in data collection.~~
  **Sections:** 1.1, Excerpt from Triola & Triola
  - Learning Objectives
  - Examples
  - Homework 1
- January 28, Lecture 2:
  **Topics:** Collecting data. Experimental versus observational studies. Confounding in observational studies. Summarizing data. Rug plots, histograms, box plots. Introduction to R.
  **Sections:** 1.1, 1.2
  - Learning Objectives
  - Lab 1
  - Lab 1 Data
  - Homework 2
- January 30, Lecture 3:
  **Topics:** Summaries of center: mean, median, mode. Summaries of spread: standard deviation, range, mean absolute deviation.
  **Sections:** 1.3, 1.4
  - Learning Objectives
  - Homework 3
- February 4, Lecture 4:
  **Topics:** Random chance and probability. Probabilities and their interpretation. The addition rule for probabilities. Using Venn diagrams to reason about probabilities.
  **Sections:** 2.1, 2.2
  - Learning Objectives
  - Homework 4
- February 6, Lecture 5:
  **Topics:** Conditional probability. The multiplication rule for probabilities.
  **Sections:** 2.4
  - Learning Objectives
  - Homework 5
- February 11, Lecture 6:
  **Topics:** Law of Total Probability. Bayes' Rule. Independence of events.
  **Sections:** 2.3, 2.5
  - Learning Objectives
  - Homework 6
- February 13, Lecture 7:
  **Topics:** Random variables: numbers that could have been otherwise. Discrete random variables. The probability mass function \(p\). The cumulative distribution function \(F\).
  **Sections:** 3.1, 3.2
  - Learning Objectives
  - Homework 7
- February 18, Lecture 8:
  **Topics:** More on discrete random variables. Some important discrete random variables: discrete uniform, Bernoulli, and geometric. Simulating random variables using R. Expectation of a discrete random variable.
  **Sections:** 3.2, 3.3
  - Learning Objectives
  - Homework 8
- February 20, Lecture 9:
  **Topics:** More on discrete random variables. Interpreting the expected value of a random variable. Expectation as a linear operator: \(E(a X + b) = a E(X) + b\). Expectation of functions of a discrete random variable. Variance of a discrete random variable. Binomial experiments.
  **Sections:** 3.3, 3.5
  - Learning Objectives
  - Homework 9
- February 25, Lecture 10:
  **Topics:** Exam 1.
  **Sections:** Chapters 1–3
  - Exam 1 Study Guide
  - Exam 1 Regrade Procedure
- February 27, Lecture 11:
  **Topics:** Binomial experiments. Binomial random variables. Computing binomial probabilities with tables and R. Mean and variance of a binomial random variable.
  **Sections:** 3.5
  - Learning Objectives
  - Homework 10
  - Binomial Tables (PMF)
  - Binomial Tables (CDF)
- March 4, Lecture 12:
  **Topics:** Continuous random variables. The probability density function \(f\). The cumulative distribution function \(F\). Expectation and variance for continuous random variables.
  **Sections:** 4.1, 4.2
  - Learning Objectives
  - Homework 11
  - Intro to Shiny Apps
  - Shiny app demonstrating correspondence between density histograms and probability density functions
  - Shiny app showing the relationship between the area under a probability density function and the corresponding cumulative distribution function
- March 6, Lecture 13:
  **Topics:** Standard Gaussian (normal) random variables. Non-standard Gaussian (normal) random variables. Why oh why couldn't the Gaussian cumulative distribution function be elementary? Computing Gaussian probabilities from tables. Percentiles of Gaussian random variables.
  **Sections:** 4.3
  - Learning Objectives
  - Homework 12
  - Shiny app for thinking about tables of Gaussian probabilities
- March 11, Lecture 14:
  **Topics:** Statistics and their sampling distributions. The sample mean. The central limit theorem.
  **Sections:** 6.1, 6.2
  - Learning Objectives
  - Homework 13
  - Shiny app demonstrating the sampling distribution of a statistic via enumeration
  - Shiny app demonstrating the sampling distribution of a statistic via simulation
- March 13, Lecture 15:
  **Topics:** Point estimators. Example point estimators. Desirable properties of point estimators. The standard error of a point estimator.
  **Sections:** 6.1, 6.2, 7.1
  - Learning Objectives
  - Homework 14
  - Shiny app motivating inferential statistics
- March 25, Lecture 16:
  **Topics:** Margin of error for the sample mean. Interval estimators. Confidence intervals. Confidence interval for a population mean: population standard deviation known and unknown.
  **Sections:** 8.1, 8.3
  - Learning Objectives
  - Homework 15
- March 27, Lecture 17:
  **Topics:** Two-sided and one-sided confidence intervals for a population mean. Interpreting confidence intervals. Confidence interval for a population proportion. Expected precision of a confidence interval. Sample size necessary to obtain a confidence interval with a desired precision.
  **Sections:** 8.2
- April 1, Lecture 18:
  **Topics:** Hypothesis tests. Stating a claim. Jargon of hypothesis testing. Performing a hypothesis test using a confidence interval. Two-sided hypothesis tests for population means and proportions.
  **Sections:** 9.1, 9.2
- April 3, Lecture 19:
  **Topics:** Exam 2.
  **Sections:** Chapters 3, 4, 6, 7, 8
  - Exam 2 Study Guide
- April 8, Lecture 20:
  **Topics:** One-sided hypothesis tests using one-sided confidence intervals. Types of errors in hypothesis testing. The logic of hypothesis tests. Power of a hypothesis test.
  **Sections:** 9.1, 9.2
- April 10, Lecture 21:
  **Topics:** \(P\)-values. Statistical and practical significance. More practice with hypothesis tests.
  **Sections:** 9.1, 9.2, 9.4
- April 15, Lecture 22:
  **Topics:** Inferences about two populations. Inferences with independent samples. Inferences involving two population means. Power.
  **Sections:** 10.1, 10.2
- April 17, Lecture 23:
  **Topics:** Inferences about two populations. Inferences involving two population proportions. Power.
  **Sections:** 10.4
- April 22, Lecture 24:
  **Topics:** Association between two quantitative variables. Scatterplots. Correlation vs. causation. Population and sample Pearson correlations. Confidence intervals and hypothesis tests for a population correlation. Inferences from paired samples. Comparison to inferences from independent samples.
  **Sections:** 12.5, 10.3
- April 24, Lecture 25:
  **Topics:** Simple linear regression. Trendlines and interpreting the slope and intercept. Evaluating a regression model: the good, the bad, and the \(R^{2}\).
  **Sections:** 12.1, 12.2
- April 29, Lecture 26:
  **Topics:** Simple linear regression. Whence and wherefore the 'line of best fit'? Confidence intervals for regression parameters. Multiple linear regression.
  **Sections:** 12.3, 12.4, 12.6