Here's the official description:

Analysis of data, probability, random variables, normal distribution, sampling theory, confidence intervals, and statistical inference.

This course covers the process of statistical analysis from beginning to end. That process, in broad strokes, is as follows: we pose a scientific question, determine what experiments or observations might provide data towards answering that question, develop approaches to collecting that data, summarize the resulting data, and derive inferences relevant to the original scientific question. In the process, you will learn about sampling, descriptive statistics, probability, probability models, inferential statistics, confidence intervals, hypothesis tests, and regression. You will also learn how to analyze data using R, a programming language ideally suited for statistical computing.

MA 101 or MA 105 passed with a grade of C- or higher, or Math Placement Level 3 or 4. Not open to computer science majors or to students required to complete MA 125, except software engineering majors.

Dr. David Darmon | ddarmon [at] monmouth.edu | Howard Hall 241

This is currently a *tentative* listing of topics, in order.

- *Introduction:* What is statistics? What types of questions can statistics answer? Types of data.
- *Descriptive statistics for quantitative data:* Summaries of the entire data distribution: rug plot, dot plot, histogram, box plot. Measures of center: mean, median, mode. Measures of variation: range, standard deviation, quartiles.
- *Descriptive statistics for two quantitative variables:* Scatterplots. Trendlines to summarize an association. Regression. Correlation.
- *Descriptive statistics for categorical data:* Two-way tables. Marginals and conditionals of two-way tables. Lurking variables and confounding via three-way tables.
- *Origins of data:* Experimental versus observational studies. Causation versus association. Methods of data collection and biases in data collection.
- *Probability:* The origin of probability in games of chance. The interpretation of probability. Probabilities from random sampling of a two-way table. Probabilities from Venn diagrams.
- *Random variables:* Random variables as 'numbers that could have been otherwise.' Random variables as an idealization of the data collection process. Relationship between random variables and histograms / density plots.
- *The normal distribution:* The normal distribution and its properties. The normal distribution as an idealized model for a population distribution. Putting a population variable on a standard scale. Probabilities and quantiles for normally distributed populations.
- *Sampling distributions:* The connection between a population distribution and the distribution of a sample statistic. Statistical properties of the sample mean under simple random sampling. The central limit theorem.
- *Confidence intervals:* Confidence intervals for population means. Confidence intervals as interval estimators. The interpretation of confidence intervals. Using confidence intervals to distinguish between practical and statistical significance.
- *Hypothesis tests:* Hypothesis tests for population means. Components of a hypothesis test. Types of error in hypothesis testing. Scientific hypotheses and statistical hypotheses. Statistical significance and practical significance. *P*-values.
- *Confidence intervals and hypothesis tests for two-sample problems:* Two-sample tests for population means and their associated confidence intervals. One-sample and two-sample tests for population proportions. Tests for independence and homogeneity using two-way tables.
- *Correlation and regression:* Simple linear regression. Interpretation of regression coefficients in the context of a population. Diagnostic plots and model checking for simple linear regression. Statistical properties of estimates of regression coefficients. Hypothesis tests and confidence intervals for population regression coefficients.
- *Analysis of Variance:* One-way ANOVA. Relationship of one-way ANOVA to the two-sample \(t\)-test. The problem of multiple comparisons. Diagnostic plots for one-way ANOVA. Contrasts from one-way ANOVA.
- *Nonparametric tests for population means:* Violations of the distributional assumptions of \(t\)-tests and ANOVA. Rank-based test statistics. Mann-Whitney-Wilcoxon test for two independent samples. Kruskal-Wallis test for one-way ANOVA.

- Tuesday, 03:00–04:00 PM, Howard Hall 241
- Thursday, 10:00–11:00 AM, Howard Hall 241
- Thursday, 01:30–02:30 PM, Howard Hall 241
- Friday, 09:00–10:00 AM, Howard Hall 241

If you cannot make the scheduled office hours, please e-mail me about making an appointment.

- 60% for 3 in-class exams (20% each)
- 25% for a non-cumulative final exam
- 12% for homework problem sets
- 3% for class participation

In addition to the main categories above, there are **two** opportunities for extra credit:

- +5% for use of Anki (Instructions)
- +5% for post-class reflections (Instructions)

**Note:** These are the **only** opportunities for extra credit in this course.
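As a worked illustration of how the four main categories combine, consider a purely hypothetical student; the scores below are invented for illustration and assume every component is scored out of 100. With exam scores \(E_1, E_2, E_3\), final exam score \(F\), homework average \(H\), and participation score \(P\), the weights listed above give:

\[
\text{Course grade} = 0.20\,(E_1 + E_2 + E_3) + 0.25\,F + 0.12\,H + 0.03\,P
\]

For hypothetical scores \(E_1 = 80\), \(E_2 = 85\), \(E_3 = 90\), \(F = 88\), \(H = 95\), and \(P = 100\):

\[
0.20\,(80 + 85 + 90) + 0.25\,(88) + 0.12\,(95) + 0.03\,(100) = 51 + 22 + 11.4 + 3 = 87.4
\]

This sketch covers only the main categories. If the extra credit is added as percentage points on top of this total (an assumption here, not a stated rule), earning both opportunities in full would add up to 10 points.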

The **required** textbook is:

- Brigitte Baldi and David S. Moore, *The Practice of Statistics in the Life Sciences*, 4th Edition (W. H. Freeman and Company, 2018, ISBN: 9781319013370).

We will use R, a programming language for statistical computing, throughout the semester for in-class activities and homework assignments. I will cover the relevant features of R throughout the course.

You can access R from any web-accessible computer using RStudio Cloud. You will need to create an account on RStudio Cloud from their Registration page. I will send out a link via email for you to join a Space on RStudio Cloud for this course. Resources for homework, labs, etc., will be hosted on RStudio Cloud for easy access.

You can also install R on your personal computer, if you have one. You can install R by following the instructions for Windows here, for macOS here, or for Linux here. You will also want to install RStudio, an Integrated Development Environment for R, which you can find here.

We will use R as a scripting language and statistical calculator, and thus will not get into the nitty-gritty of programming in R. We will largely use functionality built into the `mosaic` library in R. You can find a comprehensive tutorial on using R and `mosaic` here.

As stated in the **Extra Credit** section, you will have the opportunity to use Anki for spaced retrieval practice throughout the semester. Anki is open-source, free (as in both *gratis* and *libre*) software. You can download Anki to your personal computer from this link. If you have ever used flashcards, then Anki should be fairly intuitive. If you would like more details you can find Anki's User Manual here.

- September 3, Lecture 1:
  - **Topics:** Introduction to class. What is statistics? Types of data. Visual summaries of data: rug plots, histograms, and densities. Using R to generate visual summaries.
  - **Sections:** Chapter 1
  - Assigned Reading and Learning Objectives
  - **Lab 1.** Due Lecture 2.
- September 5, Lecture 2:
  - **Topics:** Numerical summaries of data. Measures of center: mean, median, mode. Measures of spread: standard deviation, percentiles / quantiles, and quartiles. Boxplots.
  - **Sections:** Chapter 2
  - Assigned Reading and Learning Objectives
  - Demo for Numerical Summaries of Center and Spread
- September 10, Lecture 3:
  - **Topics:** Summaries of two quantitative variables. Scatterplots. Including a categorical variable using color. Correlation.
  - **Sections:** Chapter 3
  - Assigned Reading and Learning Objectives
  - Demo for the Properties of the Sample Correlation
- September 12, Lecture 4:
  - **Topics:** Summaries of two quantitative variables. Trendlines as a data summary device. Refresher on the equation of a line: \(y = mx + b\). Trendlines and interpreting the slope and intercept. Trendlines as a prediction. How well does a trendline summarize the data?
  - **Sections:** Chapter 4
  - Assigned Reading and Learning Objectives
  - Demo for Refresher on Equation of a Line
- September 17, Lecture 5:
  - **Topics:** Summaries of two categorical variables. Two-way tables. Marginals of a two-way table. Conditionals of a two-way table. Lurking variables and confounding via a three-way table.
  - **Sections:** Chapter 5
  - Assigned Reading and Learning Objectives
- September 19, Lecture 6:
  - **Topics:** Where do data come from? Experimental versus observational studies. Causation versus association. Methods of data collection. Bias in data collection.
  - **Sections:** Chapter 6
  - Assigned Reading and Learning Objectives
- September 24, Lecture 7:
  - **Topics:** Exam 1.
  - **Sections:** Exam on Chapters 1–6
  - Exam 1 Study Guide
- September 26, Lecture 8:
  - **Topics:** Random chance and probability, and their relation to random sampling. Probabilities and their interpretation. Random variables: discrete and continuous. Probability distributions for discrete random variables. Density curves for continuous random variables. Querying probability distributions and density curves to determine the probability that a random variable \(X\) takes a value.
  - **Sections:** Chapter 9
  - Assigned Reading and Learning Objectives
  - Demo for Simple Random Sampling and Random Variables
- October 1, Lecture 9:
  - **Topics:** Normal random variables. *A* bell-shaped curve. Examples of quantities distributed according to a bell-shaped curve. The mean and standard deviation of a normal distribution. The standard normal distribution via re-centering and scaling. \(Z\)-scores for standardizing bell-shaped data. Probabilities and quantiles for normally distributed data using R.
  - **Sections:** Chapter 11
  - Assigned Reading and Learning Objectives
  - Demo for Properties of a Normal Distribution
  - Demo for Computing Normal Probability Queries Using R
  - Practice Computing Normal Probability Queries Using R
- October 3, Lecture 10:
  - **Topics:** Connecting data to statistical models. Populations and samples. Parameters and statistics. The sampling distribution of a statistic. The sample mean. The mean and standard deviation of the sample mean. The central limit theorem.
  - **Sections:** Chapter 13
  - Assigned Reading and Learning Objectives
  - Demo of a Sampling Distribution via Enumeration
  - Demo of the Sampling Distribution of the Sample Mean \(\bar{X}\) Under Random Sampling from a Population
  - Age at Time of Death By Current Age and Other Demographics by Kevin Stadler
- October 10, Lecture 11:
  - **Topics:** Estimation of population parameters using a sample statistic. The sample mean as a procedure for estimating the population mean. Standard error of the sample mean. Estimating the standard error of the sample mean. Standardizing the sample mean. \(T\)-scores in place of \(Z\)-scores.
  - **Sections:** Chapter 14, Chapter 17
  - Assigned Reading and Learning Objectives
  - **Homework 10** Additional Problems
  - Demo of the "Black Box" Model for Inferential Statistics
- October 15, Lecture 12:
  - **Topics:** Added uncertainty from estimating the standard error. The \(t\)-distribution. Degrees of freedom. \(t\)-values using R. A reasonable guess at the population mean. The \(T\)-based confidence interval for a population mean. Interpreting the confidence level of an interval estimator. How the width of a confidence interval depends on the sample size, precision of measurement, and confidence level.
  - **Sections:** Chapter 14, Chapter 15, Chapter 17
  - Assigned Reading and Learning Objectives
  - Demo of the Density Curve for the \(t\)-distribution with Varying Degrees of Freedom
  - Demo of the Dependence of a Confidence Interval for the Population Mean on \(\bar{x}, s_{X}, n\) and \(c\)
  - Demo on Interpretation of the Confidence Level of an Interval Estimator
- October 17, Lecture 13:
  - **Topics:** Exam 2.
  - **Sections:** Exam on Chapters 9, 11, 13, 14, and 17
  - Exam 2 Study Guide
- October 22, Lecture 14:
  - **Topics:** Hypothesis testing. Making a claim about a population parameter. Identifying the null and alternative hypothesis based on the claim. Statistical hypotheses are always about populations. A rough hypothesis test for a population mean: how many standard errors is the sample mean from the null value?
  - **Sections:** Chapter 14, Chapter 17
  - Assigned Reading and Learning Objectives
  - **Homework 12** Additional Problems
  - Demo on the Rationale Behind Using a \(T\)-score to Test a Claim About a Population Mean
- October 24, Lecture 15:
  - **Topics:** Hypothesis testing. Types of errors in hypothesis testing. The logic of hypothesis testing: all is fair in law and war. How likely is the observed result when the null hypothesis is true? Rejection regions for the one-sample \(t\)-test.
  - **Sections:** Chapter 14, Chapter 15, Chapter 17
  - Assigned Reading and Learning Objectives
  - **Homework 13**
  - Demo of Rejection Regions and \(P\)-values for a One-sample \(t\)-test
- October 29, Lecture 16:
  - **Topics:** Hypothesis testing using \(P\)-values. Hypothesis testing using confidence intervals. The one-sample \(t\)-test using R. Practice with the one-sample \(t\)-test.
  - **Sections:** Chapter 17
  - Assigned Reading and Learning Objectives
  - Statistical Inference Worksheet. **Due at beginning of Lecture 17.**
- October 31, Lecture 17:
  - **Topics:** Statistical hypotheses about two population means. Differences between population means. Unmatched populations. Estimating differences between populations with independent samples. Confidence intervals for differences of unmatched population means. The unpaired (independent) two-sample \(t\)-test.
  - **Sections:** Chapter 18
  - Assigned Reading and Learning Objectives
- November 5, Lecture 18:
  - **Topics:** Matched populations. Estimating differences between matched population means using average difference scores. Confidence intervals for differences of matched population means. The paired two-sample \(t\)-test. Reminder of the assumptions of \(T\)-based inferential procedures. Diagnosing violations of Normality assumptions: density plots, Q-Q plots, and formal hypothesis tests. The perils of Normality testing.
  - **Sections:** Chapter 11, Chapter 17
  - Assigned Reading and Learning Objectives
  - Demo of Normality Diagnostics
- November 7, Lecture 19:
  - **Topics:** Rank-based methods for testing population centers. Rank-based methods for one-sample and two-sample problems.
  - **Sections:** Chapter 27
  - Assigned Reading and Learning Objectives
  - Demo of Rank-based Tests (Wilcoxon's Rank-Sum and Signed-Rank Tests)
- November 12, Lecture 20:
  - **Topics:** Exam 3.
  - **Sections:** Exam on Chapters 14, 15, 17, and 18
  - Exam 3 Study Guide
- November 14, Lecture 21:
  - **Topics:** Parametric one-way ANOVA. Nonparametric one-way ANOVA.
  - **Sections:** Chapter 24, Chapter 27
  - Assigned Reading and Learning Objectives
  - One-way ANOVA Applet from Sapling
- November 19, Lecture 22:
  - **Topics:** Hypothesis testing for categorical data in one-way tables. Statistical hypotheses about categorical data. The \(\chi^{2}\) (chi-squared) statistic for testing claims about categorical data.
  - **Sections:** Chapter 21
  - Assigned Reading and Learning Objectives
- November 21, Lecture 23:
  - **Topics:** Hypothesis testing for categorical data in two-way tables. Independence and association between two categorical variables. Tests of association between two categorical variables using the chi-squared statistic. Tests of association between two categorical variables using Fisher's exact test.
  - **Sections:** Chapter 22
  - Assigned Reading and Learning Objectives
- November 26, Lecture 24:
  - **Topics:** Inferences about proportions for binary outcomes in two populations. Binary variables and binomial counts. Binomial proportions. Relative risks, odds, and odds ratios. Interpreting relative risks and odds ratios. **Not** interpreting the odds ratio as a relative risk. Confidence intervals for the relative risk and the odds ratio.
  - **Sections:** Chapter 9, Chapter 20
  - Assigned Reading and Learning Objectives
  - Demo of Relative Risk, Odds, and Odds Ratios
- December 3, Lecture 25:
  - **Topics:** Statistical association between two quantitative variables. Simple linear regression. The simple linear regression model. The population intercept and slope. Assumptions of the simple linear regression model. Checking the validity of the simple linear regression model with diagnostic plots.
  - **Sections:** Chapter 23
  - Assigned Reading and Learning Objectives
  - Demo of Simple Linear Regression with Normal Noise (SLRNN) Model
  - Demo of Diagnostic Plots for Simple Linear Regression with Normal Noise (SLRNN) Model
- December 5, Lecture 26:
  - **Topics:** The \(T\)-statistic for the sample slope. The estimate of the standard error of the sample slope. Confidence intervals for population slopes. Hypothesis tests for population slopes.
  - **Sections:** Chapter 26
  - Assigned Reading and Learning Objectives
- December 17, Final Exam:
  - **Time:** 11:35 AM–2:25 PM
  - **Location:** Howard Hall 308 (HH 308)
  - **Sections:** Exam on Chapters 21, 22, 23, 24, and 27
  - Final Exam Study Guide