David Darmon

# MA 151-01, Statistics with Applications

## Fall 2018

Section 01: Tuesday, 1:15 PM — 2:35 PM; Thursday, 11:40 AM — 1:00 PM, Howard Hall 308

Here's the official description:

Analysis of data, probability, random variables, normal distribution, sampling theory, confidence intervals, and statistical inference.

This course covers the process of statistical analysis from beginning to end. That process, in broad strokes, is as follows: we pose a scientific question, determine what experiments or observations might provide data towards answering that question, develop approaches to collecting that data, summarize the resulting data, and derive inferences relevant to the original scientific question. Along the way, we will learn about sampling, descriptive statistics, probability, probability models, inferential statistics, confidence intervals, hypothesis tests, and regression. Special emphasis will be given to common pitfalls in statistical analysis that you are likely to see 'in the wild', including misinterpretations and misunderstandings of statistical procedures, the conflation of association and causation, and the reproducibility crisis in psychology, medicine, and nutrition.

#### Prerequisites

MA 101 or MA 105 passed with a grade of C- or higher, or Math Placement Level 3 or 4. Not open to computer science majors or to students required to complete MA 125, except software engineering majors.

#### Professor

Dr. David Darmon, ddarmon [at] monmouth.edu, Howard Hall 241

#### Topics

This is a tentative list of topics, in order.

• Introduction: What is statistics? Where do data come from? Experimental versus observational studies. Types of data.
• Descriptive statistics as summaries of data: Summaries of the entire data distribution: rug plot, dot plot, histogram, box plot. Measures of center: mean, median, mode. Measures of variation: range, standard deviation, mean absolute deviation.
• Probability: The origin of probability in games of chance. The interpretation of probability. Computing probabilities involving two or more events using the addition and multiplication rules. Conditional probability and independence.
• Discrete random variables: Random variables as 'numbers that could have been otherwise.' Probability mass functions. The binomial distribution and its properties.
• Continuous random variables: Summarizing quantitative data with histograms. From histograms to probability density functions. The normal distribution and its properties. Using the normal distribution to approximate the binomial distribution.
• Hypothesis tests: Hypothesis tests for population means and proportions. Components of a hypothesis test. Types of error in hypothesis testing. Scientific hypotheses and statistical hypotheses. Statistical significance and practical significance. P-values.
• Confidence intervals: Confidence intervals for population means and proportions. Confidence intervals as interval estimators. The interpretation of confidence intervals. Using confidence intervals for hypothesis tests. Using confidence intervals to distinguish between practical and statistical significance.
• Everything but the kitchen sink: Two-sample tests for population means and their associated confidence intervals. Two-sample tests for population proportions and their associated confidence intervals. Tests for independence and homogeneity using contingency tables. Two-sample tests for equality of population variances.
• Correlation and Regression: Exploratory data analysis for two or more quantitative variables. Pearson's correlation coefficient. Hypothesis test for zero correlation between two quantitative variables. Association does not imply causation. Simple linear regression. Interpretation of regression coefficients. Multiple linear regression. Predictive and causative statements.

See the Schedule section at the end of this page for the current lecture schedule, which is subject to revision. Homework and additional resources will be linked there, as available.

## Course Mechanics

#### Office Hours

• Tuesday, 03:00 - 04:00 PM, Howard Hall 241
• Thursday, 10:00 - 11:00 AM, Howard Hall 241
• Thursday, 01:30 - 02:30 PM, Howard Hall 241
• Friday, 09:00 - 10:00 AM, Howard Hall 241

If you cannot make the scheduled office hours, please e-mail me about making an appointment.

#### Grading

• 60% for 3 in-class exams (20% each)
• 25% for a cumulative final exam
• 12% for homework problem sets
• 3% for class participation
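
The weights above combine into a course grade as a weighted average. The short Python sketch below illustrates the arithmetic. It assumes each component is scored on a 0-100 scale and that no rounding or curving is applied, neither of which is stated in this syllabus. The function name `course_grade` and the example scores are hypothetical.

```python
# Weights taken from the grading breakdown above (percent of the course grade).
WEIGHTS = {
    "exam_1": 20,
    "exam_2": 20,
    "exam_3": 20,
    "final_exam": 25,
    "homework": 12,
    "participation": 3,
}


def course_grade(scores):
    """Return the weighted course percentage.

    Assumption: every component score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "exam_1": 80,
    "exam_2": 90,
    "exam_3": 70,
    "final_exam": 85,
    "homework": 95,
    "participation": 100,
}

print(course_grade(example_scores))  # 83.65
```

Under the same assumptions, replacing the `exam_3` score with 0 (a missed exam with no make-up) lowers this example from 83.65 to 69.65, since each in-class exam carries 20% of the grade.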

#### Homework

Homework will be assigned at the end of every class meeting, and listed in the Schedule section of this page. Homework assignments are due at the beginning of the next class meeting.

#### Attendance

Required. If you expect to miss 2-3 sessions of the course, you should take the course during another semester.

#### Examination Absences

If you miss an examination, your grade for that exam will be zero. If you know you will be absent for an exam, you must let me know at least one week in advance so that we can schedule a make-up exam.

#### Textbook

The required textbook is:

• Marc M. Triola and Mario F. Triola, Biostatistics for the Biological and Health Sciences, 1st Edition (Pearson, 2005, ISBN: 9780321194367).

#### Collaboration, Cheating, and Plagiarism

All submitted work should be your own. You are welcome and encouraged to consult with others while working on an assignment, including other students in the class and tutors in the Mathematics Learning Center. However, whenever you have had assistance with a problem, you must say so at the beginning of your solution to that problem. Unless this mechanism is abused, there will be no reduction in credit for using and reporting such assistance. This policy applies to both individual and group work. In group work, you only need to acknowledge help from outside the group. This policy does not apply to examinations.

#### Statement on Special Accommodations

Students with disabilities who need special accommodations for this class are encouraged to meet with me or the appropriate disability service provider on campus as soon as possible. In order to receive accommodations, students must be registered with the appropriate disability service provider on campus as set forth in the student handbook and must follow the University procedure for self-disclosure, which is stated in the University Guide to Services and Accommodations for Students with Disabilities. Students will not be afforded any special accommodations for academic work completed prior to the disclosure of the disability, nor will they be afforded any special accommodations prior to the completion of the documentation process with the appropriate disability office.

## Minitab

We will be using Minitab throughout the semester for in-class activities and homework assignments. You can access Minitab from any on-campus computer. I will cover the relevant features of Minitab throughout the course. You can find a guide for getting started with Minitab here.

## Schedule

Subject to revision. Assignments and solutions will all be linked here, as they are available. All readings are from the textbook by Triola and Triola unless otherwise noted.
September 4, Lecture 1:
Topics: Introduction to class. What is statistics? Where do data come from? Types of data. Sampling techniques. Collecting data. Bias in data collection.
Sections: 1.1, 1.2
Learning Objectives
Homework
Age Guessing Activity
September 6, Lecture 2:
Topics: Data collection and bias. Summarizing data: Rug plots, dot plots, box plots, histograms. Using Minitab Handout. Summarizing data: mean, median, mode.
Sections: 1.3, 2.1, 2.2, 2.3, 2.4, 2.7
Learning Objectives
Homework
Minitab Handout
Minitab File
September 11, Lecture 3:
Topics: Summarizing data: Standard deviation, range, mean absolute deviation.
Sections: 2.5, 2.7
Learning Objectives
Homework
September 13, Lecture 4:
Topics: Random chance and probability. Probabilities and their interpretation. The addition rule for probabilities. Using Venn diagrams to represent probabilities.
Sections: 3.1, 3.2, 3.3
Learning Objectives
Homework
September 18, Lecture 5:
Topics: Conditional probability. The multiplication rule for probabilities. Independence of events. P(at least 1 event occurs).
Sections: 3.4, 3.5
Learning Objectives
Homework
September 20, Lecture 6:
Topics: Medical screening tests. Random variables. The binomial distribution.
Sections: 4.1, 4.2, 4.3
Learning Objectives
Homework
Table of Binomial Probability Distributions
September 25, Lecture 7:
Topics: Exam.
Sections: Exam on Chapters 1-3
Exam 1 Study Guide
September 27, Lecture 8:
Topics: Mean and variance of the binomial distribution. Continuous random variables. The standard normal distribution.
Sections: 4.4, 5.1, 5.2
Learning Objectives
Homework
October 2, Lecture 9:
Topics: General (non-standard) normal distributions.
Sections: 5.3
Learning Objectives
Homework
October 4, Lecture 10:
Topics: Statistics and their sampling distributions. The sample mean. The central limit theorem.
Sections: 5.4, 5.5
Learning Objectives
Homework
Demonstrations of Central Limit Theorem
October 9, Lecture 11:
Topics: Hypothesis tests. Hypothesis tests for population means.
Sections: 7.1, 7.2, 7.4
Learning Objectives
Homework, Due at the Beginning of Class on 10/18/2018
October 11, Lecture 12:
Topics: Exam.
Sections: Exam on Chapters 4-5
Exam 2 Study Guide
October 18, Lecture 13:
Topics: T-tests for population means. Dependence of hypothesis test properties on: sample size $$n$$, size / significance level $$\alpha$$, population standard deviation $$\sigma$$, and effect size $$\delta$$. Types of error in hypothesis testing.
Sections: 7.2, 7.5
Learning Objectives
Homework
T-test Demo (On Campus), T-test Demo (Off Campus)
Table of T-values
T-test Practice Problems
October 23, Lecture 14:
Topics: Where to put claims in a hypothesis test. Types of error and their relationship to the power of a hypothesis test. $$P$$-values.
Sections: 7.2
Learning Objectives
Homework
Size / Power Tradeoff Demo (On Campus), Size / Power Tradeoff Demo (Off Campus)
October 25, Lecture 15:
Topics: Interval estimators and confidence intervals. Confidence intervals for population means. Dependencies between width (precision), confidence level, and sample size for confidence intervals.
Sections: 6.3, 6.4
Learning Objectives
Homework
Confidence Interval Demo (On Campus), Confidence Interval Demo (Off Campus)
October 30, Lecture 16:
Topics: The normal approximation to the binomial distribution. Hypothesis tests for proportions. Confidence intervals for proportions.
Sections: 5.6, 7.3
Learning Objectives
Homework
Normal Approximation to Binomial Demo (On Campus), Normal Approximation to Binomial Demo (Off Campus)
Presidential Approval Rating Example
Monmouth Polling Institute Example
November 1, Lecture 17:
Topics: Sample size and precision for population proportions. Categorical data and the multinomial distribution. The $$\chi^{2}$$ (chi-squared) statistic for testing claims about multinomial proportions. Using Minitab to perform one-sample $$\chi^{2}$$ goodness-of-fit tests.
Sections: 6.2, 10.1, 10.2
Learning Objectives
Homework
Demo of $$\chi^{2}$$ (chi-squared) density histogram
Table of Critical Values for $$\chi^{2}$$ Statistic
November 6, Lecture 18:
Topics: Tests involving two proportions with independent samples. Using Minitab. Review for Exam 3.
Sections: 8.2
Learning Objectives
Homework, Due November 13
November 8, Lecture 19:
Topics: Exam.
Sections: Exam on Chapters 6-7
Exam 3 Study Guide
Exam 3 Take Home Procedure
November 13, Lecture 20:
Topics: Tests of independence.
Sections: 10.3
Learning Objectives
Homework
November 15, Lecture 21:
Topics: Assessing normality from sample data. Tests involving two population means with independent samples. Using Minitab.
Sections: 5.7, 8.3
Learning Objectives
Homework
Minitab Project to Demonstrate Assessing Normality
November 20, Lecture 22:
Topics: Tests involving two population means with paired samples. Power compared to two sample tests with independent samples.
Sections: 8.4
Learning Objectives
Homework
Minitab Project for Weight Loss Study
Minitab Project for In-class Practice
November 27, Lecture 23:
Topics: Association between two quantitative variables. Scatterplots. Tests for correlation between two quantitative variables. Correlation vs. Causation: Redux.
Sections: 9.1, 9.2
Learning Objectives
Homework
Linear Correlation Demo (On Campus), Linear Correlation Demo (Off Campus)
Minitab Project for Body Fat Prediction
Minitab Project for Twin IQ Prediction
Tyler Vigen's Spurious Correlations
November 29, Lecture 24:
Topics: Simple linear regression. Trendlines and interpreting the slope and intercept. Whence and wherefore the 'line of best fit'? Evaluating a model: the good, the bad, and the $$R^{2}$$.
Sections: 9.3, 9.4
Learning Objectives
Homework
Demo for Refresher on Equation of a Line
Linear Regression Demo (On Campus), Linear Regression Demo (Off Campus)
December 4, Lecture 25:
Topics: Simple linear regression continued. Assumptions of simple linear regression. Testing the assumptions of simple linear regression. Confidence intervals for coefficients. Hypothesis tests for coefficients.
Sections: 9.3, 9.4
Learning Objectives
Homework
Handout on Simple Linear Regression Model
Demo of Diagnostic Plots for Simple Linear Model (On Campus), Demo of Diagnostic Plots for Simple Linear Model (Off Campus)
Minitab Project for College GPA Prediction. Data from this textbook.
December 6, Lecture 26:
Topics: Multiple linear regression. Interpreting the coefficients of a multiple linear regression. Including binary (i.e., 0/1) variables in a multiple linear regression. Model selection.
Sections: 9.5
Learning Objectives
Homework, Note: This homework is due at the start of the final on December 18.
Minitab Project with Subset of SAT+GPA Data
Minitab Project with Brain Size and IQ Data
December 18, Final Exam:
Time: 11:35 AM - 2:25 PM
Location: Howard Hall 308 (HH 308)
Final Exam Study Guide