Problems

Instructions

For each of the following problems, answer all parts. Complete the problems in R using an R Markdown Notebook. Submit both your .Rmd and .nb.html files to eCampus by the beginning of Lecture 9.

Problem 1. Discrete Uniform Random Variable

Experiment: We place five marbles in a jar, where there is one marble each of red, orange, yellow, green, and blue. We then shake the jar, and extract a single marble.

  1. Write out the sample space for this experiment, and assign the appropriate probability to each simple event from the sample space.

  2. Let \(U\) be the position of the extracted marble’s color in the standard ordering of the colors, i.e. its position in ROYGBIV. State how \(U\) maps each element of the sample space to the element in its range.

  3. State the probability mass function and cumulative distribution function for \(U\), and sketch a probability histogram for \(U\).

  4. Simulate three draws from the jar by running the following command in R three times. For each draw, write down the resulting value of \(U(s)\), and give the corresponding element(s) \(s\) of the sample space.

sample(1:5, size = 1)
  5. Now imagine that you repeat this procedure a large number of times. The following code will repeatedly sample from the jar, with replacement, and record the outcome of each experiment. Let \(U_{i}\) be the measurement on the \(i\)-th experiment. For any \(n\), define the cumulative sum of the first \(n\) experiments as \[ S_{n} = \sum_{i = 1}^{n} U_{i},\] and the sample mean of the first \(n\) experiments as \[ \bar{U}_{n} = S_{n}/n.\] The function cumsum (for cumulative sum) will compute \(S_{n}\), and then dividing by \(n\) gives the sample mean. Describe and sketch the output of the plot of \(\bar{U}_{n}\) versus \(n\). What happens for small \(n\)? What value does the sample mean approach for large \(n\)? Do you recognize this value?
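N = 10000                                   # number of repeated experiments
u = sample(1:5, size = N, replace = TRUE)   # U_i for i = 1, ..., N
n = 1:N                                     # experiment index
S = cumsum(u)                               # S_n, cumulative sum of the first n outcomes
Ubar = S/n                                  # sample mean of the first n outcomes
plot(n, Ubar, type = 'l')                   # sample mean versus n, drawn as a line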
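The cumsum-then-divide step used above can be checked by hand on a short vector. The following is a minimal sketch, not part of the assignment: the vector u_toy and the other _toy names are hypothetical values chosen only for illustration, and they are not simulated draws.

```r
# Hypothetical toy outcomes, typed in by hand for illustration only
u_toy = c(4, 1, 1, 5, 2)

n_toy = 1:length(u_toy)     # experiment index: 1, 2, 3, 4, 5
S_toy = cumsum(u_toy)       # running totals: 4, 5, 6, 11, 13
Ubar_toy = S_toy / n_toy    # running means: 4, 2.5, 2, 2.75, 2.6

print(Ubar_toy)
```

Each entry of Ubar_toy is the mean of the outcomes up to and including that position, which is what the longer simulation plots against n.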

Problem 2. Bernoulli Random Variable

Experiment: A baseball player has a batting average of 0.200. This means that on each at bat, he hits the ball 20% of the time. Suppose we denote a hit by H and a miss by M. We record the player’s performance on a single at bat.

  1. Write out the sample space for this experiment, and assign the appropriate probability to each simple event from the sample space.

  2. Let \(B\) be a binary coding of the player’s performance on a single at bat, where a hit is recorded as a 1 and a miss is recorded as a 0. State how \(B\) maps each element of the sample space to the element in its range.

  3. State the probability mass function and cumulative distribution function for \(B\), and sketch a probability histogram for \(B\).

  4. Simulate three at bats by running the following command in R three times. For each at bat, write down the resulting value of \(B(s)\), and give the corresponding element(s) \(s\) of the sample space.

rbinom(n = 1, size = 1, prob = 0.2)
  5. Now imagine that you watch the player for a large number of at bats. The following code simulates those at bats and records the outcomes. Let \(B_{i}\) be the measurement on the \(i\)-th at bat. For any \(n\), define the cumulative sum of the first \(n\) experiments as \[ S_{n} = \sum_{i = 1}^{n} B_{i},\] and the sample mean of the first \(n\) experiments as \[ \bar{B}_{n} = S_{n}/n.\] The function cumsum (for cumulative sum) will compute \(S_{n}\), and then dividing by \(n\) gives the sample mean. Describe and sketch the output of the plot of \(\bar{B}_{n}\) versus \(n\). What happens for small \(n\)? What value does the sample mean approach for large \(n\)? Do you recognize this value?
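N = 10000                                 # number of simulated at bats
b = rbinom(n = N, size = 1, prob = 0.2)   # B_i for i = 1, ..., N (1 = hit, 0 = miss)
n = 1:N                                   # at-bat index
S = cumsum(b)                             # S_n, number of hits in the first n at bats
Bbar = S/n                                # sample mean of the first n at bats
plot(n, Bbar, type = 'l')                 # sample mean versus n, drawn as a line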
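Because these commands draw random numbers, the values change every time the notebook is run, so the numbers in a written answer may not match the knitted output. One common way to make a simulation repeatable is to call set.seed before drawing. The sketch below is only an illustration: the seed value 1 and the names first and second are arbitrary choices, not something the assignment specifies.

```r
# Fix the random number generator state (the seed value 1 is arbitrary)
set.seed(1)
first = rbinom(n = 3, size = 1, prob = 0.2)

# Resetting to the same seed reproduces exactly the same three at bats
set.seed(1)
second = rbinom(n = 3, size = 1, prob = 0.2)

print(identical(first, second))
```

The same idea applies to sample and rgeom in the other problems.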

Problem 3. Geometric Random Variable

Experiment: We observe the pollination pattern of a bee as it flies from flower to flower. The bee only occasionally stays on a flower long enough to pollinate it. Suppose we observe the flower visitation pattern of the bee until it stays on a flower to pollinate it. Denote the outcome that it does not stay on a flower by N and the outcome that it does stay on a flower by S. Suppose that, from prior fieldwork, we know that bees stay at 1% of the flowers they visit.

  1. Write out the sample space for this experiment, and assign the appropriate probability to each simple event from the sample space.

  2. Let \(C\) be the number of flowers that the bee lands on before it stays to pollinate a flower. State how \(C\) maps each element of the sample space to the element in its range.

  3. State the probability mass function and cumulative distribution function for \(C\), and sketch a probability histogram for \(C\).

  4. Simulate three observations of the bee by running the following command in R three times. For each observation, write down the resulting value of \(C(s)\), and give the corresponding element(s) \(s\) of the sample space.

rgeom(n = 1, prob = 0.01)
  5. Now imagine that you observe the bee over several days. The following code simulates several days of observations, and records their outcomes. Let \(C_{i}\) be the measurement on the \(i\)-th day. For any \(n\), define the cumulative sum of the first \(n\) experiments as \[ S_{n} = \sum_{i = 1}^{n} C_{i},\] and the sample mean of the first \(n\) experiments as \[ \bar{C}_{n} = S_{n}/n.\] The function cumsum (for cumulative sum) will compute \(S_{n}\), and then dividing by \(n\) gives the sample mean. Describe and sketch the output of the plot of \(\bar{C}_{n}\) versus \(n\). What happens for small \(n\)? What value does the sample mean approach for large \(n\)? Do you recognize this value?
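N = 10000                       # number of simulated days of observation
c = rgeom(n = N, prob = 0.01)   # C_i for i = 1, ..., N
n = 1:N                         # day index
S = cumsum(c)                   # S_n, cumulative sum of the first n observations
Cbar = S/n                      # sample mean of the first n observations
plot(n, Cbar, type = 'l')       # sample mean versus n, drawn as a line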
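In R, rgeom returns the number of failures before the first success, so a value of 0 is possible. The sketch below spells out that counting convention by simulating one observation of the bee flower by flower. The function count_before_stay and the variables p_stay and c_one are hypothetical names introduced here for illustration, the seed is arbitrary, and the loop is an equivalent hand-rolled simulation rather than how rgeom works internally.

```r
set.seed(42)     # arbitrary seed so the sketch is repeatable
p_stay = 0.01    # probability that the bee stays on a given flower

# Visit flowers one at a time and count the visits that do NOT end in a stay.
# The loop stops at the first flower on which the bee stays.
count_before_stay = function(p) {
  visits = 0
  while (runif(1) >= p) {   # runif(1) < p happens with probability p (a stay)
    visits = visits + 1
  }
  visits
}

c_one = count_before_stay(p_stay)
print(c_one)
```

A returned value of 0 means the bee stayed on the very first flower it visited.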